EK-8600S-MM-004
October 1987
498 pages
Document:
VAX 8600/8650 System Fault Isolation Manual
Order Number:
EK-8600S-MM
Revision:
004
Pages:
498
EK-8600S-MM-004

VAX 8600/8650 System Fault Isolation Manual

Prepared by Educational Services of Digital Equipment Corporation

1st Edition, October 1985
2nd Edition, December 1985
3rd Edition, April 1987
4th Edition, November 1987

© Digital Equipment Corporation 1985, 1987. All Rights Reserved.

The material in this manual is for informational purposes and is subject to change without notice. Digital Equipment Corporation assumes no responsibility for any errors that may appear in this manual.

Printed in U.S.A.

The manuscript for this book was created on a VAX-11/780 system using RUNOFF. The book was produced by Educational Services Development and Publishing in Marlboro, MA.

The following are trademarks of Digital Equipment Corporation: the Digital logo, DEC, DECmate, DECUS, DECwriter, DIBOL, MASSBUS, PDP, P/OS, Professional, Rainbow, RSTS, RSX, RT, UNIBUS, VAX, VMS, VT, Work Processor.

CONTENTS

CHAPTER 1  System Fault Isolation Overview
    Introduction
    Fault Detection and Reporting Overview
    Error Detection Networks
    MBox Error Reporting
    MBox Interrupts
    EBox Interrupt and Exception Arbitration Logic
    Error Handling Microcode
    VMS Machine Check Handler
    SPEAR (Standard Package for Error Analysis and Reporting)
        SPEAR Basic
        SPEAR Extended
    Keep Alive Fail

CHAPTER 2  Manual Machine Check Stack Frame Analysis
    Overview
    Standard EHM, Console, and VMS Actions
    VMS Error Rate Thresholds
    Sample CTY Output From A Fatal Bug Check
    Abbreviated Machine Check Stack Frame Worksheet
    Machine Check Stack Frame Analysis Flow Chart
    MBox Error Scenarios
        M-00  MBox Control Store Parity Error
        M-01  MBox TB Parity Error
        M-02  MBox Cache Tag Parity Errors
        M-03  MBox NXM Errors
        M-04  MBox CP IO Buffer Error
        M-05  MBox CPR (CCC) Parity Error
        M-06  MBox Detected CPU Write Parity Error
        M-07  MBox Detected ABus Parity Errors
        M-08  MBox Array ECC Errors
        M-09  MBox Cache Data Parity Errors
        M-10  MBox Cache W Bit Parity Error
        M-11  MBox Detected ABus Bad Data Code
    EBox Error Scenarios
        E-00  EBox WBus Parity Error
        E-01  EBox Result Parity Error (EDP Misc Error)
        E-02  EBox Result Parity Error (VMQ)
        E-03  EBox Result Parity Errors (WReg)
        E-04  EBox Result Parity Error (VMQ Shift Operation)
        E-05  EBox Operand Parity Error (VMQSAV)
        E-06  EBox Operand Error (B WBus)
        E-07  EBox OPBus Parity Error (EMD Data)
        E-08  EBox OPBus Parity Error (String Data)
        E-09  EBox OPBus Parity Error (IMD Data)
        E-10  EBox OPBus Parity Error (ID Data)
        E-11  EBox Operand Parity Error (A RAM)
        E-12  EBox Operand Error (A WBus)
        E-13  EBox Operand Parity Error (B RAM)
        E-14  EBox Micro Stack Parity Error
        E-15  EBox Control Store Parity Error
        E-16  EBox MCF RAM Parity Error
    IBox Error Scenarios
        I-01  IBox Control Store Parity Error
        I-02  IBox IDRAM Parity Error
        I-03  IBox IAMux WBus Parity
        I-04  IBox IAMux GPR Parity Error
        I-05  IBox RLog Parity Error
        I-06  IBox IBuffer Parity Error
        I-07  IBox IBMux Parity Error
    FBox Error Scenarios
        F-01  FBox Self Test Error
        F-02  FBox GPR Parity Error
        F-03  FBox FDRAM Parity Error
        F-04  FBox FBA CS Parity Error
        F-05  FBox FBM Control Store Parity Error

CHAPTER 3  Manual SBIA Stack Frame Analysis
    Overview
    SBIA and SBI Error Scenarios
        A-01  SBIA Detected ABus Address/Data PE on CPU Reference
        A-02  SBIA Detected ABus Control PE on CPU Reference
        A-03  SBIA Detected Address Error on CPU Reference
        A-04  SBIA Detected State Machine PE
        A-05  SBIA Detected ABus Data PE on DMA Read Data
        A-06  SBIA Detected ABus CTRL PE on DMA Read Data (MSK/STAT)
        A-07  SBIA Detected DMA Errors
        A-08  SBIA DMA Interlock Timeout
        A-09  SBIA Detected SBI Timeout Error on CPU Reference
        A-10  SBIA Detected SBI Error Confirmation on CPU Reference
        A-11  SBI Parity Fault
        A-12  SBI Write Sequence Fault
        A-13  SBI Unexpected Read Data Fault
        A-14  SBI Interlock Sequence Fault
        A-15  SBI Multiple Transmitter Fault

CHAPTER 4  Console Message Descriptions
    Overview
    Console Message Format
    Message Descriptions
        (CSM)  Console Support Microcode Fatal Messages
        (DCN)  General Console Error Messages
        (DCN)  General Console Fatal Messages
        (DCN)  General Console Information Messages
        (DCN)  General Console Warning Messages
        (DCP)  Diagnostic Console Error Messages
        (DCP)  Diagnostic Console Fatal Messages
        (DCP)  Diagnostic Console Information Messages
        (DCP)  Diagnostic Console Warning Messages
        (ECR)  Error Correction Routine Error Messages
        (EMM)  Environmental Monitor Module Error Messages
        (EMM)  Environmental Monitor Module Fatal Messages
        (HEX)  Hexadecimal Debugger Warning Messages
        (KAF)  MCPSNP.LST (KAF Snap Shot Routine) Message
        (MCP)  Macro Control Program Error Messages
        (MCP)  Macro Control Program Fatal Messages
        (MCP)  Macro Control Program Information Messages
        (MCP)  Macro Control Program Warning Messages

CHAPTER 5  VMS System Event Files
    Overview
    System Event File Entry Type Definitions
    ANALYZE/ERROR_LOG
    SPEAR
    Examples of Translated Entry Types
        Machine Check (Entry Type 002)
        Soft ECC Error (Entry Type 006)
        KA86/KA865 SBIA Error (Entry Type 013)
        11/780 Environmental Monitor
        KA86/KA865 Processor Halt
        KA86/KA865 Console Reboot
        Cold Start
        Crash Restart
        System Bugcheck
        Device Timeout
        Asynchronous Device Attention
        Unknown Entry Type
    ANALYZE/ERROR_LOG Report Formats
        ANALYZE/ERROR_LOG/LOG Report Format
        ANALYZE/ERROR_LOG/REGISTER_DUMP Report Format
        ANALYZE/ERROR_LOG/STATISTICS Report Format
        ANALYZE/ERROR_LOG/SUMMARIZE=(DEVICE) Report Format
        ANALYZE/ERROR_LOG/SUMMARIZE=(VOLUME) Report Format
        ANALYZE/ERROR_LOG/SUMMARIZE=(ENTRY) Report Format
        ANALYZE/ERROR_LOG/SUMMARIZE=(HISTOGRAM) Report Format

Appendix A  Error Handling Microcode Flow Chart and Notes

Appendix B  Console CS/DRAM Correction Flow Chart and Notes

Appendix C  VMS Machine Check Handler (MCHK) Flow Chart
    VAX 8600/8650 System Control Block

Appendix D  Keep Alive Fail Snap File Description and Flow Chart
    Introduction
    Keep Alive Flow
    VAX8600/8650 SNAPSHOT File Format
    SNAPSHOT Master Header Format
    Standard SNAPSHOT Record Format
    Standard SNAPSHOT Record Header Format
    SNAPSHOT Record Header for FBA and FBM SDB-Vis Channel
    Standard SDB-Visibility Data Format (excluding FBA and FBM)
    SNAPSHOT Record Format for Console Registers
    SNAPSHOT Record Format for EMM Registers
    SNAPSHOT Record Format for EBox Scratch Pad Contents
    SNAPSHOT Record Format for IPR Registers
    SNAPSHOT Record Format for PAMM
    SNAPSHOT Record Format for 64 Longwords on Interrupt Stack
    SNAPSHOT Record Format for ABus Adapter
    SNAPSHOT Record Format for Clock Alignment and uPC Trace
    Clock State Table
    VSR File Format
    Key SDB Signals
    SHOW SNAP File Format
    Key SDB Error Signals

Appendix E  EBox Error Arbitration Network
    EHM Trap Vector Addresses and Relative Priorities

Appendix F  EBox Error Detection Networks
    EBox - AMux and BMux Parity Generation
    EBox - Operand Parity Error Detection Network
    EBox - Result Parity Error Detection Network
    EBox - WBus Parity Error Detection Network
    EBox - Micro-Stack Parity Error Detection Network
    EBox - Late Parity Error Last Cycle

Appendix G  FBox Error Detection Networks

Appendix H  IBox Error Detection Networks
    IBox - Instruction Buffer Parity Error Detection Network
    IBox - AMux Parity Error Detection Network
    IBox - BMux Parity Error Detection Network
    IBox - AMux Error Code Generation
    IBox - AMux WBus Data Generation
    IBox - Control Store Parity Error Detection Network
    IBox - Dispatch RAM Parity Error Detection Network
    IBox - OPBus Longword Parity Generation
    IBox - RLog Parity Error Detection Network

Appendix I  MBox Error Detection Networks
    MBox - ECC Error Detection Network
    MBox - Cache Tag Parity Error Detection Network
    MBox - Cache Tag W Bit Parity Error Detection Network
    MBox - Internal Error Sum 28 Generation
    MBox - Internal Error Sum 58 Generation
    MBox - Multiple Error and Held Error Generation
    MBox - Cycle Error Sum Generation
    MBox - Fatal Error, Error Reg Full, and Interrupt Generation
    MBox - Cache Data Correction, Cache Data Parity Error, and Byte Write CP Error Detection Network
    MBox - CP Write & Write Data Parity Error Detection Network
    MBox - Write Byte and Byte Parity Error Detection Network
    MBox - CP NXM Error Detection Network
    MBox - CP Buffer Error Detection Network
    MBox - Control Store Error Parity Detection Network
    MBox - CPR Parity Error Detection Network
    MBox - ABus Address Parity Error Detection Network
    MBox - ABus Control and Mask Parity Error Detection Network
    MBox - ABus Data Parity Error Detection Network
    MBox - ABus Bad Data Error Detection Network
    MBox - Array Read Data Error Handling Flow Chart
    MBox - Cache Read Data Error Handling Flow Chart
    MBox - Cache Writeback Error Handling Flow Chart

Appendix J  SBIA Error Detection Networks
    SBIA - DMAI Transaction Buffer Error Lock Circuit
    SBIA - DMAA Transaction Buffer Error Lock Circuit
    SBIA - DMAB Transaction Buffer Error Lock Circuit
    SBIA - DMAC Transaction Buffer Error Lock Circuit
    SBIA - Fault and Local Error Detection Network
    SBIA - Multiple CPU Error and CPU Buf Error Circuit
    SBIA - Control and Address Parity Error Detection Network

Appendix K  Machine Check Stack Frame Organization and Bit Descriptions
    BYTCNT      Stack Frame Byte Count
    EHMSTS      Error Handling Microcode Status Word
    EVMQSAV     EBox Virtual Adr/Multiplier Quotient Save Reg
    EBCS        EBox Control and Status Register
    EDPSR       EBox Data Path Status Register
    CSLINT      Console and Interrupt Status
    IBESR       IBox Error Status Register
    EBXWD1      EBox Word 1
    EBXWD2      EBox Word 2
    IVASAV      Virtual Address Save Register
    VIBASAV     Virtual IBuffer Address Save Register
    ESASAV      EBox Starting Address Save Register
    ISASAV      IBox Starting Address Save Register
    CPC         Current Program Counter
    MSTAT1      MBox Status Register 1
    MSTAT2      MBox Status Register 2
    MDECC
    MERG
    MEAR        MBox Error Address Register
    MEDR
    FBXERR
    CSES        Console Control Store Error Status
    PC          Program Counter
    PSL         Processor Status Longword
    EHSR        Error Handling Status Register
    CSM.STATUS  Console Support Microcode Status

Appendix L  SBIA Stack Frame Organization and Bit Descriptions
    CR          Configuration Register
    CSR         Control and Status Register
    ERRSUM      Error Summary
    DIAGCS      Diagnostic Control and Status
    DMAICA      DMAI Command/Address Register
    DMAIID      DMAI ID Register
    DMAACA      DMAA Command/Address Register
    DMAAID      DMAA ID Register
    DMABCA      DMAB Command/Address Register
    DMABID      DMAB ID Register
    DMACCA      DMAC Command/Address Register
    DMACID      DMAC ID Register
    TOADR       Time Out Address
    SBISTS
    SILOCOMP    Silo Compare
    SBIERR      SBI Error
    MAINT       Maintenance

PREFACE

The purpose of this manual is to assist the field service engineer in isolating elusive or intermittent kernel faults in the VAX 8600/8650 system.

Revision 3 of the Fault Isolation Manual reflects Error Handling Microcode Revision 1 with the addition of new EHM flowcharts and description, and the addition of new bits to the EBCS, EDPSR, EHMSTS, and IBESR registers.

Revision 4 reflects the new Machine Check Handler. The keep alive flowchart was corrected to reflect console software, and the snap file description was corrected to reflect the addition of 79 longwords to the snapshot.

Chapter 1 provides an overview of how the 8600/8650 detects and processes errors. This is the "big picture". It explains that:

o  The error detection networks in each of the boxes, including the SBIA, report to the EBox.

o  The EBox arbitrates and prioritizes the errors and generates a micro trap vector address.

o  The Error Handling Microcode (EHM) captures the state of the machine, builds a stack frame, clears the error condition, rolls back the PCs, pushes the stack frame onto the interrupt stack, and calls the VMS machine check handler.

o  VMS processes the error, queues the stack to be written to ERRLOG.SYS, and determines whether to REI or bugcheck.

o  SPEAR reads the event file and, depending on which function is selected, computes system availability, summarizes the contents of the event file, translates specific entries, or analyzes the cause of specific entries.
Chapter 2 consists of a flowchart and a catalogue of error scenarios for the EBox, FBox, IBox, and MBox. The flowcharts are used to manually analyze stack frames. The result is a pointer to an error scenario. The scenario describes the nature of the error, provides a typical error signature, and suggests a probable cause.

Chapter 3 contains error scenarios for the SBIA and SBI errors. It will help the field engineer analyze and identify the most probable cause of SBIA error entries.

Chapter 4 contains a series of tables listing the console error messages.

Chapter 5 describes the system event file and how to translate the entries in the system event file using the error record formatter (ERF) or SPEAR.

There are 12 appendices:

A. Appendix A contains a flowchart of the Error Handling Microcode. It shows, in detail, how the EHM builds the stack, clears the error, rolls back the PCs, pushes the information on the interrupt stack, and calls VMS. The flow is detailed enough to allow the field engineer to use it to troubleshoot double error conditions.

B. Appendix B contains a brief description and a flowchart that describes how the console corrects control store parity errors.

C. Appendix C contains a flowchart of the VMS machine check handler. It shows the logging and the REI/Bugcheck decision making process.

D. Appendix D contains flows describing the keep alive fail and console ECC correction process. It includes the generation of snap files.

E. Appendix E contains detailed functional diagrams that describe the EBox error arbitration, prioritizing, and trap vector generation logic.

F. Appendix F contains a set of functional diagrams that describe each error detection network in the EBox.

G. Appendix G is the same as Appendix E except the diagrams pertain to the FBox.

H. Appendix H is the same as Appendix E except the diagrams pertain to the IBox.

I. Appendix I is the same as Appendix E except the diagrams pertain to the MBox.

J. Appendix J is the same as Appendix E except the diagrams pertain to the SBIA.

K. Appendix K contains bit definitions for the registers that make up the machine check stack frame.

L. Appendix L contains bit definitions for the registers that make up the SBIA stack frame.

CHAPTER 1
SYSTEM FAULT ISOLATION OVERVIEW

INTRODUCTION

Both the VAX8600 and the VAX8650 are classified as fault tolerant Processors. That is, they are designed to recover from most intermittent errors with minimum impact on system performance. The following examples illustrate the degree of error detection and recovery capability of these Processors.

o  The Memory Array contents are ECC protected. If a single bit error is detected during an Array Read, the data is corrected (on the fly) before it is cached. Thus future references will produce correct data.

o  Cache data is byte parity protected with ECC backup. If a Cache data parity error is detected the MBox will execute a Cache Data Correction Cycle. Thus when the operation is retried the data will be correct.

o  Extensive parity protection has been designed into the processor data paths. In most cases if a data path parity error is detected, the processor is stalled, the error condition cleared, and the PCs and RLog rolled back to a state prior to the occurrence of the error. Thus, when the processor is restarted the instruction will be re-executed successfully.

o  Control Store and Dispatch RAMs are ECC protected. If a Control Store or Dispatch RAM parity error is detected the processor is stalled and the Console is interrupted. The Console will read the bad RAM word, look up the ECC for that word in a table, perform ECC correction, write the word back into the RAM and restart the processor. Thus, in most cases, the process will successfully continue.

o  If a short-term power failure occurs (10 minutes or less) the system provides battery back up power to the Array Refresh Circuitry. Thus when main power is restored the system will successfully continue.

FAULT DETECTION AND REPORTING OVERVIEW

Figure 1-1 describes the overall organization of error detection, error handling, and error reporting for both the VAX8600 and VAX8650 Processors. It shows the relationship between the Error Detection Networks, the Interrupt and Exception Arbitration Logic, the Error Handling Microcode, the VMS Machine Check Handler, the System Event File, and SPEAR. In addition, Figure 1-1 shows that the Console is involved in RAM correction and Keep Alive Fail detection. The following paragraphs provide a brief description of each function.

[Figure 1-1: overall organization of error detection, error handling, and error reporting]

ERROR DETECTION NETWORKS

As shown in Figure 1-1 each box has a separate error detection network. These error detection networks constantly monitor for error conditions and, with the exception of the SBIA, report errors directly to the EBox Interrupt and Exception Arbitration Logic. The Interrupt and Exception Arbitration Logic in turn generates a micro trap vector into the Error Handling Microcode (EHM).

The SBIA is a special case because it must monitor for externally as well as internally detected errors. Externally detected errors are errors that are detected by either a Nexus or an I/O device. The SBIA uses the external interrupt mechanism to report these errors to the EBox Interrupt Arbitration Logic which in turn reports them to the EBox Interrupt and Exception Arbitration Logic.

Internally detected SBIA errors are errors that are detected internal to the SBIA. The SBIA reports these errors to the MBox which, in turn, reports them directly to the EBox Interrupt and Exception Arbitration Logic.
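The three reporting paths just described can be summarized in a short Python sketch. It is an illustration only, not part of the original manual: the `Detector` names and the `reporting_path` function are hypothetical, and the sketch models only the routing stated in this section (control store parity errors, which are handled separately, are not modeled).

```python
from enum import Enum

# Names of the logic blocks, taken from the text above.
ARBITRATION = "EBox Interrupt and Exception Arbitration Logic"
EHM = "Error Handling Microcode"


class Detector(Enum):
    """Hypothetical labels for where an error was detected."""
    BOX_NETWORK = "box error detection network"
    SBIA_INTERNAL = "SBIA (internally detected)"
    SBIA_EXTERNAL = "SBIA (Nexus or I/O device detected)"


def reporting_path(detector):
    """Return the chain of blocks an error report passes through."""
    if detector is Detector.SBIA_EXTERNAL:
        # External errors use the external interrupt mechanism.
        return ["SBIA", "EBox Interrupt Arbitration Logic", ARBITRATION, EHM]
    if detector is Detector.SBIA_INTERNAL:
        # Internal SBIA errors are relayed through the MBox.
        return ["SBIA", "MBox", ARBITRATION, EHM]
    # Every other box reports directly to the arbitration logic.
    return [detector.value, ARBITRATION, EHM]


if __name__ == "__main__":
    for d in Detector:
        print(d.name, "->", " -> ".join(reporting_path(d)))
```

In every case the path ends at the arbitration logic followed by the EHM; only the hops before it differ.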
MBOX ERROR REPORTING

Before describing the function of the EBox Interrupt and Exception Arbitration Logic, it is worth mentioning that the MBox has two methods of reporting errors:

o  Through the Port Status Lines, and

o  Through a special MBox Internal Interrupt Mechanism.

The Port Status Lines are used in two ways. During an MBox Port Request they are used to report Translation Buffer errors as well as other non-error related port status (e.g., TB Miss, Access Violation, etc.). When the MBox is not processing port requests the port status lines are used to report MBox Fatal Errors (FE) and MBox Error Register Full conditions.

MBox FE - The majority of MBox Fatal Errors are non-recoverable errors that require immediate service. For this reason they share the highest error handling priority with EBox detected errors. In contrast, MBox Error Reg Full Trap requests have a lower priority. They are serviced at Instruction Register Decode (IRD) time.

MBox Error Reg Full - The purpose of the MBox Error Reg Full Trap is to trap to a special routine in the EBox Error Handling Microcode that is designed to read, save, and release the MBox Error Address Register (MEAR) quickly. Saving and releasing MEAR quickly is important because it is possible for the MBox to detect a second error before the first error is completely processed by the Error Handling Microcode. Should this happen the MBox will need MEAR to latch the physical address of the second error in order to perform Array and Cache data correction.

MBOX INTERRUPTS

The MBox uses an Internal Interrupt mechanism to report all other non-fatal MBox errors. Like MBox Error Reg Full Traps, MBox Interrupts are serviced at Instruction Register Decode (IRD) time and therefore have a relatively low priority. However, because the RLog and PCs can be unwound and the operation retried, most of these errors will be recoverable.
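As a rough illustration of the two service classes described above, the following Python sketch orders a set of pending error reports. The report names, the `ServiceClass` enum, and the `service_order` function are hypothetical. The text states only which reports share the highest priority and which wait for IRD time, so the ordering within a class (kept here as arrival order) is an assumption and does not represent the hardware's actual tie-breaking.

```python
from enum import IntEnum


class ServiceClass(IntEnum):
    """Lower value means serviced sooner (hypothetical encoding)."""
    IMMEDIATE = 0  # shares the highest error handling priority
    AT_IRD = 1     # serviced at Instruction Register Decode time


# Classification taken from the text; the key names are illustrative.
REPORT_CLASS = {
    "EBOX_DETECTED_ERROR": ServiceClass.IMMEDIATE,
    "MBOX_FATAL_ERROR": ServiceClass.IMMEDIATE,
    "MBOX_ERROR_REG_FULL": ServiceClass.AT_IRD,
    "MBOX_INTERRUPT": ServiceClass.AT_IRD,
}


def service_order(pending):
    """Sort pending reports by service class.

    sorted() is stable, so reports in the same class keep their
    arrival order. That tie-break is an assumption of this sketch.
    """
    return sorted(pending, key=lambda report: REPORT_CLASS[report])


if __name__ == "__main__":
    print(service_order(["MBOX_INTERRUPT", "MBOX_FATAL_ERROR"]))
```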
EBOX INTERRUPT and EXCEPTION ARBITRATION LOGIC

This is a central focal point. All errors, with the exception of MBox Control Store and EBox Control Store parity errors, report to the EBox Interrupt and Exception Arbitration Logic. This logic prioritizes the errors and generates a Micro-trap Vector Address. The Micro-trap Vector Address is the starting address of a special EBox Microcode routine that is designed specifically to handle error conditions.

ERROR HANDLING MICROCODE

The special Micro routine mentioned above is referred to as the Error Handling Microcode (EHM). The Error Handling Microcode reads the state of 25 major CPU Control and Status Registers and puts the result in the EBox Scratch Pad RAM; locations 17 through 2F. This data is referred to as the Machine Check Stack Frame.

In the process of building a Machine Check Stack Frame the Error Handling Microcode also clears the error condition and rolls back the RLog and PCs. This is done in preparation for retry of the failed operation.

Finally the Error Handling Microcode sets up the VMS Machine Check Vector Address (SCBB+4) and calls the Interrupt/Exception Micro-routine. The Interrupt/Exception Micro-routine (not shown in Figure 1-1) pushes the Machine Check Stack Frame onto the Interrupt Stack and calls the VMS Machine Check Handler.

VMS MACHINE CHECK HANDLER

The VMS Machine Check Handler pops the Machine Check Stack Frame off the Interrupt Stack and puts it in a System Event (memory) Buffer. It then notifies the ERRFMT Process which, in turn, appends the buffer to the System Event File (ERRLOG.SYS).

In the process of building and queuing the System Event Buffer, the Machine Check Handler also checks the error rate and the severity of the error. If the error rate is excessive, or if the error condition is severe (i.e., non-recoverable) then the Machine Check Handler will Bugcheck the process (i.e., the User or the System). Otherwise the Machine Check Handler will execute a REI and the system will retry the failed operation.

SPEAR (Standard Package for Error Analysis and Reporting)

SPEAR is a maintenance tool specifically designed to help Field Engineers sort and analyze the contents of System Event Files. There are two versions of SPEAR: SPEAR Basic and SPEAR Extended.

SPEAR Basic - SPEAR Basic is available on all sites that have a DEC Maintenance Contract. This version of SPEAR consists of five programs:

o  Instruct - Instruct is a Computer Based Instructional Program that is designed to help new users learn to use the SPEAR Library Programs. In addition to explaining the SPEAR Programs, Instruct also describes the organization of System Event Files and includes a review of some of the most common Troubleshooting Approaches.

o  Compute - Compute is designed to use the contents of the System Event File to calculate system availability and effectiveness. The Compute report can be used (in part) to determine if the system is approaching the point where corrective maintenance will soon be required.

o  Summarize - Summarize is designed to summarize the contents of the System Event File. The Summarize report can be used to determine whether the CPU or one of the I/O subsystems needs further investigation.

o  Retrieve - Retrieve is a bit-to-text translator. It is designed to extract specific entries from the System Event File and produce either a Brief or Full translation of the event. This information can be used to investigate the cause of CPU and I/O failures.

o  VSR (Venus Snap File Report Builder) - This program was added to the Basic SPEAR Library specifically to support VAX 8600/8650 Systems. Like Retrieve, VSR is a bit-to-text translator. VSR, however, is designed to translate SNAP Files. A SNAP file is a file built by the Console as a result of a Keep Alive Fail condition. Both SNAP Files and the Keep Alive Fail mechanism are described later.
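The manual does not say how Compute derives its availability figure. As a point of reference only, the conventional definition of availability (uptime divided by total observed time) can be written as the short Python sketch below; the function name and its inputs are hypothetical and are not taken from SPEAR.

```python
def availability(uptime_hours, downtime_hours):
    """Conventional availability ratio: uptime / (uptime + downtime).

    This is the textbook definition, assumed here for illustration.
    It is not documented as the calculation that Compute performs.
    """
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("observation period must be positive")
    return uptime_hours / total


if __name__ == "__main__":
    # Example: 2 hours of downtime in a 720-hour (30-day) period.
    print(f"{availability(718.0, 2.0):.4%}")
```

A falling ratio over successive reporting periods is the kind of trend that would suggest corrective maintenance is approaching.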
SPEAR Extended - SPEAR Extended is only available at Remote Diagnosis Centers. In addition to the programs that are available in SPEAR Basic, SPEAR Extended includes:

o  Analyze - Analyze is designed to analyze the contents of large System Event Files and identify the most probable cause of certain failures. Basically Analyze evaluates the events in a System Event File against a set of If-Then Isolation Theories. If the Events support a theory then the theory is displayed for consideration by the Engineer at the RD center.

o  VSA (Venus Snap File Analysis Builder) - VSA is similar to Analyze except it is designed to analyze the contents of System SNAP Files.

KEEP ALIVE FAIL

The CPU has two Keep Alive Fail (KAF) Mechanisms. One KAF mechanism is used by the Console to monitor the operation of the CPU. Each time the EBox enters Instruction Register Decode (IRD) Time a signal is generated that sets a status bit in the Console. The Console checks the state of this bit every 300 milliseconds. If the bit is set, the system is operating normally. The Console then clears the bit and continues operation. However, if the console finds that the EBox has failed to set the bit, it declares a Keep Alive Fail Condition (i.e., the EBox failed to enter IRD Time during the last 300 milliseconds). At this point the CPU is considered dead.

As a result of the KAF condition the Console enters a special KAF Routine. The main purpose of the routine is to determine if either of the SNAP Files (SNAP1.DAT or SNAP2.DAT) are available to store the SNAPSHOT Data. If they are, the KAF Routine will read the state of the CPU and put the result in a SNAP File buffer. The routine begins by reading each SDB Visibility Channel. The routine then goes on to read the Console Control and Status registers, the EMM Control and Status Registers, the EBox Scratch Pad RAM contents, the Internal Processor Registers, the PAMM contents, the last 64 longwords pushed on the Interrupt Stack, and the Control and Status Registers for SBIA0 and SBIA1. When the KAF routine is finished building the buffer it verifies the contents of all loadable RAMs and then writes the SNAP File buffer out to the RL02.

Finally, the KAF routine will attempt to re-boot the system. If the re-boot is successful the SNAP File will be copied to the VMS Directory "SYS$SYSROOT:[SYSERR]" and named ERRSNP.LOG;n. If, however, the console was unable to re-boot the system then a second KAF condition will occur which will result in the generation of a second SNAP File (SNAP2.DAT). At this point the Console will loop waiting for manual intervention.

The other Keep Alive Fail mechanism is used by VMS to monitor the operation of the Console. The Console is programmed to update the Buffered Time Of Year Register (BTOY) every 10 milliseconds. Every 90 seconds VMS reads this register via the Time Of Day Register (TODR). If it finds that the register has been updated then all is OK. If, however, VMS determines that the register has not been updated it makes an entry in the System Event File and attempts to restart the Console.

CHAPTER 2
MANUAL MACHINE CHECK STACK FRAME ANALYSIS

OVERVIEW

This Chapter is designed to help you analyze the contents of Machine Check Stack Frames. It assumes that you have a "valid" Machine Check Stack Frame in hand, and that you want to see which error scenario this method of analysis will lead to.

HOW TO USE THIS CHAPTER - First translate the EBox Control and Status Register (EBCS). Then, based on the contents of EBCS, follow the flow chart illustrated in Figure 2-2.
The flow chart will eventually lead you to a page number that describes a typical error scenario that would produce a Stack Frame similar to the one you are analyzing. Review the scenario. It describes the most common conditions under which that type of error will occur. It also suggests the most probable cause of the error; first at the module level and then at the component FRU level. (E.g., RAM and MCA callout.)

NOTE
In addition to the flow chart, Tables 1 through 4 may be used as a quick reference to the Error scenarios.

Table 1 Index of MBox Error Scenarios

Index  Error Condition                                  Page
M-00   MBox Control Store Parity Error                  2-15
M-01   MBox TB Parity Error                             2-19
M-02   MBox Cache Tag Parity Errors                     2-21
M-03   MBox NXM Errors                                  2-23
M-04   MBox CP IO Buffer Error                          2-25
M-05   MBox CPR (CCC) Parity Error                      2-27
M-06   MBox Detected CPU Write Parity Errors            2-29
M-07   MBox Detected ABus Parity Errors                 2-31
M-08   MBox Array ECC Errors                            2-33
M-09   MBox Cache Data Parity Errors                    2-36
M-10   MBox Cache W Bit Parity Error                    2-43
M-11   MBox Detected ABus Bad Data Code                 2-44

Table 2 Index of EBox Error Scenarios

Index  Error Condition                                  Page
E-00   EBox WBus Parity Error                           2-46
E-01   EBox Result Parity Error (EDP Misc Error)        2-48
E-02   EBox Result Parity Error (VMQ)                   2-50
E-03   EBox Result Parity Errors (WReg)                 2-52
E-04   EBox Result Parity Error (VMQ Shift Operation)   2-55
E-05   EBox Operand Parity Error (VMQSAV)               2-56
E-06   EBox Operand Error (B WBus)                      2-57
E-07   EBox OPBus Parity Error (EMD Data)               2-59
E-08   EBox OPBus Parity Error (String Data)            2-61
E-09   EBox OPBus Parity Error (IMD Data)               2-63
E-10   EBox OPBus Parity Error (ID Data)                2-65
E-11   EBox Operand Parity Error (A RAM)                2-66
E-12   EBox Operand Error (A WBus)                      2-68
E-13   EBox Operand Parity Error (B RAM)                2-70
E-14   EBox Micro Stack Parity Error                    2-72
E-15   EBox Control Store Parity Error                  2-73
E-16   EBox MCF RAM Parity Error                        2-79

Table 3 Index of IBox Error Scenarios

Index  Error Condition                                  Page
I-01   IBox Control Store Parity Error                  2-80
I-02   IBox IDRAM Parity Error                          2-83
I-03   IBox IAMux WBus PARITY                           2-85
I-04   IBox IAMux GPR Parity Error                      2-87
I-05   IBox RLog Parity Error                           2-88
I-06   IBox IBuffer Parity Error                        2-89
I-07   IBox IBMux Parity Error                          2-91

Table 4 Index of FBox Error Scenarios

Index  Error Condition                                  Page
F-01   FBox Self Test Error                             2-92
F-02   FBox GPR Parity Error                            2-94
F-03   FBox FDRAM Parity Error                          2-95
F-04   FBox FBA CS Parity Error                         2-97
F-05   FBox FBM Control Store Parity Error              2-100

STANDARD EHM, CSL, and VMS ACTIONS

A lot of effort has gone into standardizing the way the Error Handling Microcode, the Console, and the VMS Machine Check Handler respond to CPU detected errors. This Section describes the standard role played by each. Special cases will be explained in the individual Error Scenarios.

EHM ACTION (Standard): Unless specified otherwise in the individual Error Scenarios the Error Handling Microcode will:

1. Stop IBox references.
2. Check and set the EHM ENTERED double-error trap mechanism.
3. Capture the state of the CPU and build a Stack Frame.
4. Clear the error condition and, if necessary, correct GPR PEs.
5. Roll back the PCs in preparation for an instruction retry.
6. Load the VMQ with SCBB+4 (the MCHK Handler starting address).
7. Check and set the VMS ENTERED double-error trap mechanism.
8. Clear EHM ENTERED.
9. Call the Exception and Interrupt Handler (microcode).

The Interrupt and Exception Handler will:

1. Raise the IPL to 1F.
2. Push the Stack Frame onto the Interrupt Stack (memory).
3. Enable IBox References.
4. Start VMS at the MCHK Handler (SCBB+4).

NOTE
For specific details refer to the EHM Flows. (See Appendix A)

VMS ACTION (Standard): Unless specified otherwise in the individual Error Scenarios the VMS Machine Check Handler will:

1. Enter the Reason Code in the copy of EHMSTS <03:00>.
2. Check the error rate (See Table 5).
3. Check for Fatal Errors and set the Abort MCHK bit if appropriate.
4. Transfer the Stack Frame to an Event Buffer.
5. Set up the buffer to be appended to SYS$SYSROOT:[SYSERR]ERRLOG.SYS.
6. Check for abort bits.
7. Either REI or Bugcheck (Non-Fatal/User or Fatal/System).

If VMS decides to Bugcheck the User it will set up an Exception on the User Stack and execute an REI. If, however, VMS decides to Bugcheck the System it will print the contents of the Interrupt Stack on the Console, write the contents of main memory, the two Error Log Buffers, and the reason for the crash to a file called SYS$SYSTEM:SYSDUMP.DMP and then Halt. This will result in the Console detecting a Keep Alive Fail condition. The Console will Snap Shot the state of the system and attempt to reboot VMS. If the reboot process is successful VMS will copy the file to SYS$SYSROOT:[SYSERR]ERRSNAP.LOG. If the file already exists the version number will be bumped by one. Also during the re-boot process the two Event Buffers saved during the Crash will be appended to the System Event file (SYS$SYSROOT:[SYSERR]ERRLOG.SYS).

NOTE
For specific details refer to the VMS MCK Flows. (See Appendix B)

Table 5 VMS Error Rate Thresholds

Error                               Excessive Rate and Action Taken
EBox                                3 Errors within 100 Milliseconds - Bugcheck
IBox                                3 Errors within 100 Milliseconds - Bugcheck
MBox (FE)                           2 Errors within 20 Milliseconds - Bugcheck
TB                                  3 Errors within 100 Milliseconds - Fatal System Bugcheck
MBox (1D) (Non-Array/Non-Cache)     3 Errors within 100 Milliseconds - Bugcheck
Cache                               3 Errors (Same Cache) within 100 Milliseconds - Turn off offending cache and print message on CTY
FBox                                3 Errors within 100 Milliseconds - Turn off FBox and print message on CTY

CONSOLE ACTION (RAM ECC Correction): When called (interrupted) to perform Control Store or Dispatch RAM correction the console proceeds as follows:

IBox DRAM and EBox, IBox, and FBox Control Store Correction:

1. Read the bad microword address (via the SDB)
2. Read the bad microword (via the SDB)
3. Calculate an ECC character for the bad microword
4. Get the correct ECC character for that microword
5. Exclusive "or" the two ECC characters (See note)
6. Correct the bad bit in the microword
7. Write the corrected word out to the RAM
8. Read and verify that the error was corrected
9. Return control to the Error Handling Microcode
10. The EHM will call the CSM to read four bytes of status from the console and put them in EScratch location T1D (CSES).

FBox DRAM Correction - The Console proceeds as outlined above but, because the FDRAM address is not latched at the time of error, the Console will reload and verify the entire RAM instead of just a single microword.

MBox Control Store Correction - The Console proceeds as outlined above but, because the MBox microwords are executed before they are checked, the Console will not attempt to recover from an MBox Control Store Parity Error. Instead it will leave the system hung until a KAF occurs.

NOTE
If the console is unable to correct the RAM parity error (i.e., multiple bit error) it will declare a KAF Condition and then re-initialize the CPU.
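Steps 3 through 6 of the Console correction procedure (compute an ECC character, exclusive-or it with the correct one, and flip the bit the result points to) can be sketched as follows. The manual does not give the Console's actual ECC code, so this example assumes a simple position-XOR scheme in which the check character is the XOR of the 1-based positions of all set bits; the function names and the 76-bit example word are hypothetical.

```python
def ecc_char(word: int, width: int) -> int:
    """Assumed ECC: XOR of the 1-based positions of every set bit."""
    ecc = 0
    for bit in range(width):
        if (word >> bit) & 1:
            ecc ^= bit + 1
    return ecc


def correct_single_bit(bad_word: int, correct_ecc: int, width: int):
    """Return (corrected_word, syndrome) for a single-bit error."""
    # Step 5: exclusive-or the two ECC characters to get the syndrome.
    syndrome = ecc_char(bad_word, width) ^ correct_ecc
    if syndrome == 0:
        return bad_word, 0  # nothing to correct
    if syndrome > width:
        # Cannot be a single-bit error under this scheme.
        raise ValueError("uncorrectable error")
    # Step 6: the syndrome is the failing bit number plus one.
    return bad_word ^ (1 << (syndrome - 1)), syndrome


WIDTH = 76                                   # microword width used for illustration
good = 0x123456789ABCDEF0123                 # arbitrary example microword
stored_ecc = ecc_char(good, WIDTH)           # ECC character saved when the RAM was loaded
bad = good ^ (1 << 41)                       # inject a single-bit error at bit 41
fixed, syndrome = correct_single_bit(bad, stored_ecc, WIDTH)
```

Under this assumed scheme a single flipped bit yields a syndrome equal to its bit number plus one. A multiple-bit error can produce a misleading syndrome, which is why the Console's read-and-verify step (step 8) matters.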
SAMPLE CTY OUTPUT FROM A FATAL BUG CHECK

The following is sample output from a Fatal-Bug-Check CTY Dump. Its purpose is to help you locate Machine Check register information when it is displayed on the Kernel/Interrupt Stack.

* FATAL BUG CHECK, VERSION = V4.1 MACHINECHK, Machine check while ...

CURRENT PROCESS = STARTUP

REGISTER DUMP

R0  = 7FFE6440
R1  = 7FFE6300
R2  = 0000000F
R3  = 7FF6C09F
R4  = 7FFE640C
R5  = 7FFE64B4
R6  = 00000000
R7  = 7FF4C748
R8  = 7FFED052
R9  = 7FFED25A
R10 = 7FFEDDD4
R11 = 7FFE33DC
AP  = 7FF341FC
FP  = 7FF341E8
SP  = 8049A794
PC  = 802FE3A2
PSL = 041F0008

KERNEL/INTERRUPT STACK

8049A79C  00000058  BYTCNT
8049A7A0  40001803  EHMSTS
8049A7A4  7FFE7DF0  EVMQSAV
8049A7A8  00002000  EBCS
8049A7AC  00000000  EDPSR
8049A7B0  00200001  CSLINT
8049A7B4  01006000  IBESR
8049A7B8  00000000  EBXWD1
8049A7BC  00000FE0  EBXWD2
8049A7C0  7FFE643C  IVASAV
8049A7C4  7FF6C3DA  VIBASAV
8049A7C8  7FF6C3D6  ESASAV
8049A7CC  7FF6C3D6  ISASAV
8049A7D0  7FF6C3D6  CPC
8049A7D4  84006004  MSTAT1
8049A7D8  00004F10  MSTAT2
8049A7DC  00060000  MDECC
8049A7E0  04000100  MERG
8049A7E4  00000003  MEAR
8049A7E8  0000007C  CSHCTL
8049A7EC  0000001F  MEDR
8049A7F0  FFFFFFFF  FBXERR
8049A7F4  FFFFFFFF  CSES
8049A7F8  7FF6C3D6  PC
8049A7FC  00C00000  PSL
[Figure, page 2-8: register bit-layout diagram. Illegible in the scanned source.]

[Figure 2-2, Machine Check Stack Frame analysis flow chart, pages 2-10 through 2-14. Illegible in the scanned source. Legible fragments show steps of the form TRANSLATE EHMSTS and TRANSLATE MSTAT1, and exits of the form SEE ERROR SCENARIO M-06.]

M-00 MBox Control Store Parity Error

OVERVIEW: The MBox Control Store consists of 256 "seventy-six bit" microwords. The microwords are stored in 20 256X4 RAMs on the MCC Module (MCCD-MCCG). These RAMs are addressed by the MMS MCA (MCC1) at T0. At T2 of a Fetch Cycle the RAM outputs are loaded into the Microdata MCA's which cascade an accumulating parity check (over eighty bits) from one MCA to the next, culminating in MCCB PARITY D OUT H (MCCA-B) which, if true, latches MCC MBox CS PE H. MCC MBox CS PE H blocks further Microdata MCA clocks and also interrupts the Console via EBE CPU ERR <2:0> H = "7", a code which indicates a problem in the MBox Control Store.

Preserved within the Microdata MCA's are the failing Microword and also the micro PC of that Microword, which is "or'ed" into these MCA's at each fetch. Therefore the Console, after obtaining the contents of these MCA's via the SDB, has what it needs to attempt ECC correction. But unlike its action for errors on other Control Stores, the Console, after obtaining the syndrome and printing a message on the Console terminal, treats this error as a Keep Alive Fail Condition. A Snap Shot is taken, all RAMs are reloaded, etc.

Because some Microbits have VTERMS and other loads on the MCD and MAP Modules these Modules may cause a Microword parity error.

In addition to bad Microword parity the following three conditions can cause an MBox Control Store Parity Error indication.

o Address Parity Error - The Microword was fetched from the wrong location.
o uSTACK Overflow - Too many words were pushed on the uSTACK.
o uSTACK Underflow - Too many words were popped off the uSTACK.

An address parity bit, also contained within the Microword and included in the overall parity bit calculation, is compared by the MMS MCA against parity calculated over the uPC from which the Microword was fetched. A mismatch results in an MBox CS PE.

If the three deep uSTACK in the MMS MCA overflows or underflows, a CS PE also results; however the Microword frozen will be one past the one which issued the fatal stack command. The frozen uPC must be consulted to determine the uFLOW. Note that any error detected on an incoming DMA COMMAND asserts MCCM DMA ERR L which forces a pop of the uSTACK via MCCB DLY1 FORCE FF H.

These three errors "or" out of the MMS MCA as MCC1 STACK ERR H, a visibility bit captured in the SNAP FILE.

ERROR SIGNATURE:

CSES <2:0>    = "7", MBox CS PE
CSES <28:16>  = CS Error Address
CSES <15:8>   = syndrome (for SBE ='s phys. MCA bit # +1)
CSES <31>     = Uncorrectable Error (Currently the Console always sets this bit for MBox CS PE)
MCC1 STACK ERR H

PROBABLE CAUSE:

Module            Probability   Components
L0220/L0230/MCC   High          RAMS (see Table 1)
L0204/MCD         Low           VTerms
L0205/MAP         Low           VTerms

NOTE
Of the 80 MCA bits that are parity checked, 76 are RAM outputs. Syndromes, however, exist for all 88 bits in the MCA Shift Path; the extra 8 bits in the shift path hold the uPC of the frozen Microword.

Table 1 MBox Control Store RAM Callout

Syndrome ULD Phy Signal Name RAM
00 00 01 22 23 66 MCCG U NEXT ADR 0 MCCG U NEXT ADR 1 MCCF U NEXT ADR 2 MCCF U NEXT ADR 3 MCCF U NEXT ADR 4 E21 E30 E20 E36 E127 MCCF U NEXT ADR 5 MCCG U NEXT ADR 6 -MCCG U NEXT ADR 7 MCCH U ABUS MFORK EN MCCG U CLR FLGS E140 E96 E96 E106 E21 MCCG MCCG MCCG MCCG E1l3 E13 E20 E20 T 01 02 03 04 05 06 07 08 09 67 44 45 55 12 U U U U BEN BEN BEN BEN MASK MASK MASK MASK MCCG U BEN SEL 0 MCCG U BEN SEL 0 1 2 3 E10 1 E1l0 MCCG U BEN SEL 2 E10 MCCH U FORCE ABS XFR -MCCH U PA LAT LD -MCCD U SEL CACHE E130 E31 E130
Syndrome ULD Phy Signal Name RAM 47 53 2B 26 24 20 21 22 23 24 70 82 42 37 35 MCCD U FORCE LAST WD MCCH U MIC PAR MCCD U DS MUX SEL O MCCD U DS MUX SEL 1 MCCD U DSM VALID E130 E140 El E1l ES8 4B 09 3A 51 34 25 26 27 28 29 74 08 57 80 51 MCCH U ABUS INH INC -MCCH U ARY STEP EN MCCD U ARY START MCCF U ARY RD DAT EN -MCCF U ABUS LAT LD E119 E21 E104 E119 E117 29 3F 21 3D 41 30 31 32 33 34 40 61 52 60 64 E10 MCCE U ARY2 HLD E115 MCCE U ABUS MBOX OUT MCCE U ABUS ADR CNTL 0 El15 MCCE U ABUS ADR CNTL 1 E115 E104 -MCCG U ERR TRAP 0B 3F 07 0C 1F 35 36 37 38 39 10 62 06 11 30 MCCE U ABUS DR EN MCCD U MD RES EN MCCD U DO MUX O MCCD U DO MUX 1 MCCF U ECC CHECK E31 E104 E21 E22 E1l 42 15 14 4C 4E 40 41 42 43 44 65 20 19 75 77 MCCG U Al HLD MCCF U HLD ARY BUSY -MCCE U ABUS CLUP MCCE U PA MUX SEL O MCCE U PA MUX SEL 1 E117 E31 E22 E127 E127 58 4D 4A 4F 12 45 46 47 48 - 49 87 76 73 MCCE U PA MUX SEL 2 MCCF U INC WD CNT MCCE U CLR WD CNT E119 E130 E140 05 oA 16 36 56 50 51 52 53 54 04 09 21 53 85 -MCCD U CACHE WR EN MCCH U EN WR LRU MCCE U CLR CSH WV MCCE U ABUS RFL BR MCCF U MARK E30 E30 E20 E106 E140 31 13 22 28 1B 55 56 57 58 59 48 18 33 39 26 -MCCF U ARB SEL SEL MCCD U EN BYT MERGE MCCD U ABUS2 HLD MCCH CLR BD SEL MCCH IOA SEL DIS E117 E22 E8 E1l3 E13 37 2C OE 50 20 60 61 62 63 64 54 43 13 79 31 MCCE U REG LOAD 0 MCCE U REG LOAD 1 -MCCH U EN ARY BUS MCCF U CB DR EN -MCCH U ABUS DATA EN E115 E31 E22 E119 E8 78 17 MCCF U LD WD CNT MCCF U SEL WD CNT 2-17 E127 E30 MANUAL MACHINE CHECK STACK Table 1 MBox Control Store RAM Callout Syndrome ULD 65 66 67 68 69 27 06 08 (cont.) 
Phy Signal Name RAM 38 05 07 (NO RAM) (NO RAM) -MCCH U RFL LAT LD MCCD U CYC TYP O MCCD U CYC TYP 1 E8 E36 E36 10 3C 70 71 72 73 15 59 MCCD U CYC TYP MCCD U CYC TYP (NO RAM) (NO RAM) 39 33 40 3B 74 75 76 77 56 50 63 58 MCCG MCCG MCCF MCCE U STACK CTRL O U STACK CTRL 1 U IOA SEL O U IOA SEL 1 E106 E96 E117 E96 28 MCCE U El | 1D 32 78 79 49 RES 2 3 E36 - E106 CORR REQ MCCH U U ADR PAR E104

EHM ACTION: None

CSL ACTION: The Console displays the syndrome on the Console terminal and then declares a Keep Alive Fail Condition.

VMS ACTION: When VMS is rebooted it obtains the Snap File from the Console and logs it.

M-01 MBox TB Parity Error

OVERVIEW: The Translation Buffer is protected by four parity bits. The TB Valid Bit (which reflects the PTE Valid Bit in the memory page table) is protected by one parity bit. The PTE (which is divided into two parts: PTE B and PTE A) is protected by two parity bits. PTE B (which includes PA <24:09>) is protected by one bit; PTE A (which includes PA <29:25>, the Modify bit, and the Protection Code <D:A> bits) is protected by the other, the third parity bit. Finally, the TB Tag (which contains VA <30:17>) is protected by the fourth parity bit.

If a TB parity error is detected during a CPU port request, the MBox will return a Port Status code of "8" (TB Parity Error) to the EBox. If the MBox was processing an EBox Port request when the TB Parity Error was detected an immediate trap to the EHM will take place. Otherwise, the trap will be deferred until the EBox does a FORK or GET OPERAND.

NOTE
If an IBox Flush and Load CPC occurs before the EBox checks port status, the TB Error will not cause a trap to the EHM. The TB Error Status will, however, remain latched in the MBox (MCF) until the EBox issues an MBox Clear Error Regs. That won't happen until the EHM is called to handle the next error. Thus, the (left over) TB Parity Error indication will show up in the next Stack Frame built by the EHM. Therefore, do not be confused by such a mixture of status.

ERROR SIGNATURE (TB Valid Parity Error):

MSTAT1 <11> = TB Valid Error

PROBABLE CAUSE:

Module      Probability   RAMS
L0205/MAP   High          E5S, E60

ERROR SIGNATURE (TB PTE B Parity Error):

MSTAT1 <10> = TB PTE B Parity Error

PROBABLE CAUSE:

Module      Probability   RAMS (If EBox port / If IBox port)
L0205/MAP   High          E136, E113, E127, E82, E123, E78, E47, E40, E118, E34, E28

ERROR SIGNATURE (TB PTE A Parity Error):

MSTAT1 <09> = TB PTE A Parity Error

PROBABLE CAUSE:

Module      Probability   RAMS
L0205/MAP   High          E140, E136, E64, E69, E774, E146

ERROR SIGNATURE (TB TAG Parity Error):

MSTAT1 <08> = TB TAG Parity Error

PROBABLE CAUSE:

Module      Probability   RAMS
L0205/MAP   High          E145, E139, E135, E126, E122, E117, E131

EHM ACTION: Standard (See Introduction). In addition the EHM will clear this error condition by invalidating the entire Translation Buffer.

EHM VECTOR:
8  (EBox Request - VMQSAV contains the VA)
10 (OP Port Request - VASAV contains the VA)
18 (IBuffer Request - VIBASAV contains the VA)

CSL ACTION: None

VMS ACTION: Standard (See Introduction)

M-02 MBox Cache Tag Parity Errors

OVERVIEW: When a read or write request is accepted by the MBox, bits <12:04> of the physical address referenced are used to index the Tag Storage for the two Data Caches. A tag consists of four valid bits (one for each longword stored in the cache block), bits <28:13> of the physical address (where the longwords are stored), and a parity bit.
If bits <28:13> of the address referenced match the address portion of the Tag and at least one valid bit is set, then there is at least a "block hit" and there may be a Cache Hit.

A Tag Parity Error in one cache is ignored if there is a "hit" in the other Cache; otherwise the attempted operation is steered toward the bad Cache. If the W Bit equals a zero for that Cache, the bad Tag will be overwritten. If the W BIT equals a one, Cache Writes will be inhibited and CPU requests will be aborted. In all cases DMA requests are forced around Cache to the Array.

A Cache Tag Parity Error on a CPU request with the tag W BIT = 0 or a Cache Tag Parity Error during a DMA request, regardless of the W BIT, results in an MBox IPL 1D interrupt. A Tag Error on a CPU request with the W BIT = 1 results in an MBox FATAL ERROR microtrap in the EBox.

NOTE
Both Caches are "looked up" in parallel, checking for a "Hit". If both "Hit", then a Tag Error with the W BIT = 1 is forced by the MBox. This results in an MBox Fatal Error.

ERROR SIGNATURE

MSTAT2 <06>     = Cache Tag Parity Error
MSTAT1 <02>     = Selected Cache
MEAR   <28:04>  = Error Address
MSTAT2 <04>     = Cache Written Bit
EBCS   <14>     = MBox Interrupt (Note 1)
EBCS   <15>     = MBox Fatal Error (Note 1)
MSTAT1 <18>     = ABus C/A Cycle (Note 2)
MSTAT1 <17:16>  = Selected Adapter (Note 2)
MSTAT1 <31:30>  = CPU Port (Note 2)
MSTAT1 <29:26>  = Cycle Type (Note 2)

Note

1. If MBox FE then error on CPU request with MSTAT2 <04> true.

2. If ABus C/A Cycle then the error occurred on a DMA reference and MSTAT1 <17:16> are relevant and MSTAT1 <29:26> should = "8", ABus Cycle, or "4", ABus Array Write Cycle. If not ABus C/A Cycle, then the error occurred on a CPU reference and MSTAT1 <31:30> are relevant and MSTAT1 <29:26> should = "E", CP Read Cycle, "D", CP Write Cycle, or "3", Write Back Cycle.
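The field positions quoted in the overview and in Note 2 can be pulled apart with a few shifts and masks. The sketch below is illustrative only: the helper names and the example register values are hypothetical, and only the bit ranges and cycle-type codes come from the text.

```python
def bits(value: int, hi: int, lo: int) -> int:
    """Extract the inclusive bit field <hi:lo> from a register value."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


# Cycle type codes named in Note 2 for MSTAT1 <29:26>.
CYCLE_TYPES = {
    0x8: "ABus Cycle",
    0x4: "ABus Array Write Cycle",
    0xE: "CP Read Cycle",
    0xD: "CP Write Cycle",
    0x3: "Write Back Cycle",
}


def split_physical_address(pa: int) -> dict:
    """Tag store index is PA<12:04>; the tag address field is PA<28:13>."""
    return {"index": bits(pa, 12, 4), "tag": bits(pa, 28, 13)}


def decode_mstat1(mstat1: int) -> dict:
    """Decode the MSTAT1 fields that Note 2 says are relevant."""
    is_dma = bool(bits(mstat1, 18, 18))  # ABus C/A Cycle
    out = {
        "abus_ca_cycle": is_dma,
        "cycle_type": CYCLE_TYPES.get(bits(mstat1, 29, 26), "other"),
    }
    if is_dma:
        out["selected_adapter"] = bits(mstat1, 17, 16)
    else:
        out["cpu_port"] = bits(mstat1, 31, 30)
    return out


fields = split_physical_address(0x00ABC120)              # made-up address
cpu = decode_mstat1((0x2 << 30) | (0xE << 26))           # made-up CPU reference
dma = decode_mstat1((0x8 << 26) | (1 << 18) | (1 << 16)) # made-up DMA reference
```

For the made-up CPU value the decode reports port 2 and a CP Read Cycle; for the DMA value it reports adapter 1 and an ABus Cycle.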
PROBABLE CAUSE (if Cache 0 Selected):

Module      Probability   Components (RAMS)
L0205/MAP   High          E148, E132, E128, E119, E142, E133, E129, E115, E86, E59, E63

PROBABLE CAUSE (if Cache 1 Selected):

Module      Probability   Components (RAMS)
L0205/MAP   High          E141, E137, E124, E114, E149, E138, E125, E120, E89, E68, E73

EHM ACTION: If both Caches are on, the EHM sweeps the cache block selected by MEAR <12:4> in the good cache and then executes a Cache Clear which clears both Cache Tags at that index. If only one Cache is on the EHM simply Cache Clears the bad Tag. Having helped VMS to avoid stumbling across a (potentially) fatal Tag error so that the error can be logged, the EHM rolls back the Instructions, builds a Stack Frame, and vectors to the VMS Machine Check Handler via SCBB+4.

EHM VECTOR: 6

CSL ACTION: None

VMS ACTION: VMS builds a full Machine Check Report, puts it in a buffer, queues the buffer to be appended to the System Event File (ERRLOG.SYS), and if the W BIT is equal to 1, VMS takes a System Fatal Bugcheck: some unknown process has lost write data. Otherwise a counter is incremented and, if three errors have occurred within 100 milliseconds, the affected cache is shut off and a message is sent to the Console reporting the event.

M-03 MBox NXM Errors

OVERVIEW: When the MBox honors a CPU or DMA request, the physical address referenced is latched in the PA Latch on the MAP Module. Bits <29:20> of the PA Latch are used to address the PAMM (Physical Address Memory Map RAMS) which produces a five bit code for every megabyte of physical address space. The PAMM is configured by the console during Boot when memory and I/O space are sized. Unused megabytes of physical address space are assigned the NXM code (1F).
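The PAMM lookup can be sketched as a table indexed by PA <29:20>. This is a hedged illustration: the function names and the example memory layout are hypothetical, the code ranges are taken from the PAMM Code table in this section, the mapping of codes 18-1B onto adapter numbers 0-3 is an assumption, and code 1C is not listed in that table.

```python
NXM_CODE = 0x1F  # code assigned to unused megabytes


def pamm_index(pa: int) -> int:
    """PA<29:20> selects one PAMM entry per megabyte."""
    return (pa >> 20) & 0x3FF


def classify(code: int) -> str:
    """Map a five bit PAMM code onto the ranges in the PAMM Code table."""
    if 0x00 <= code <= 0x07:
        return f"array {code}"
    if 0x08 <= code <= 0x17:
        return "reserved"
    if 0x18 <= code <= 0x1B:
        # Assumption: adapters are numbered in code order.
        return f"ABus adapter {code - 0x18}"
    if code in (0x1D, 0x1E):
        return "reserved"
    if code == NXM_CODE:
        return "NXM"
    return "unlisted"  # 1C does not appear in the table


def dma_aborted(code: int) -> bool:
    """DMA requests abort on any 1X code (most significant PAMM bit true)."""
    return bool(code & 0x10)


# Hypothetical configuration: 4 megabytes on array 0, everything else unused.
pamm = [NXM_CODE] * 1024
for megabyte in range(4):
    pamm[megabyte] = 0x00

in_memory = classify(pamm[pamm_index(0x00300000)])
unused = classify(pamm[pamm_index(0x20000000)])
```

In a real system the Console fills the PAMM at Boot; the four-megabyte layout here exists only to exercise the lookup.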
PAMM Code   Description
00-07       Select Arrays 0-7
08-17       Reserved
18-1B       Select ABus Adapters
1D-1E       Reserved
1F          Non-existent Memory (NXM)

CPU requests which result in the NXM code are aborted and an MBox Fatal Error microtrap is triggered in the EBox through vector 8. DMA requests which result in a PAMM code of 1X (the most significant PAMM bit true) are also aborted and an MBox IPL 1D interrupt is requested. Note that SBIA's pass on SBI DMA requests to the MBox only when those requests pass an address check. That is, the address referenced must be within the range of Venus internal memory as indicated by bits <29:20> of the SBIA's Configuration Register which is loaded by the Console during Boot.

ERROR SIGNATURE:

MSTAT2 <03>                  = NXM
MEAR                         = Error Address
MSTAT2 <20:16>               = the PAMM Code
EBCS   <15>                  = MBox Fatal Error (If CPU Error)
MSTAT1 <17:16>               = ABUS ADAPTER <1:0> (If DMA Error)
MSTAT1 <29:26>               = Cycle Type (Note 1)
MERG   <08>                  = Memory Management Enable (Note 2)
MSTAT1 <31:30>               = CPU Port (If CPU Error)
SBIA ERRSUM <12,08,04,00>    = MBox Detected (If DMA Error)

Note

1. For CPU errors this should be "E", CP READ CYCLE, or "D", CP WRITE CYCLE. For DMA errors this should be "8", ABUS CYCLE, or "4", ABUS ARRAY WRITE CYCLE.

2. Helps to point toward CPU if 0.

PROBABLE CAUSE (If CPU Error):

Module            Probability
L0205/MAP         High
L0220/L0230/MCC   Low
CPU               Low (Note 1)

Note

1. If MERG <08> equals a zero then the CPU may have produced a bad (nonexistent) address; check MEAR. MSTAT1 <31:30> will identify the Port and Box. Even if MERG <08> equals a one the EBox makes some non virtual (physical) references which bypass the TB (see if MEAR equals VMQ.SAV and MSTAT1 <31:30> equals a 2, indicating EBox Port).

PROBABLE CAUSE (If DMA Error):

Module            Probability
L0205/MAP         High
L0202/SBS         Med
L0203/SBA         Low
L0220/L0230/MCC   Low

EHM ACTION: Standard (See Introduction)

EHM VECTOR:
6 (MBox Interrupt)
8 (MBox FE)

VMS ACTION: VMS builds a full Machine Check Report, puts it in a buffer, queues the buffer to be appended to the System Event File (ERRLOG.SYS), and attempts one retry for CPU errors. If the retry fails or if two errors occur within 20 milliseconds, a System Fatal Bugcheck is taken. For DMA errors, the error is logged and a System Fatal Bugcheck is taken.

M-04 MBox CP IO Buffer Error

OVERVIEW: This error occurs when an ABus adapter detects any one of the following errors during a CP TO I/O reference:

o Any ABus parity error detected on a CPU's ABus transaction
o An illegal adapter address
o An SBI Timeout (several variations)
o An SBI Error Confirmation
o A State Machine Control Store parity error.

The adapter returns ABus CPU BUF ERR H to the MBox which initiates an MBox Fatal Error micro-trap in the EBox.

ERROR SIGNATURE:

MSTAT2 <02>           = CP IO Buffer Error
EBCS   <15>           = MBox Fatal Error
EBCS   <14>           = MBox Interrupt
SBIA ERRSUM <23>      = CPU Buffer Error Lock (Note 1)
SBIA ERRSUM <31:26>   = CPU Command/Length
SBIA ERRSUM <22>      = CPU Address/Data Parity Error (Note 2)
SBIA ERRSUM <21>      = CPU Control Parity Error (Note 2)
SBIA ERRSUM <20>      = Address Error (Note 2)
SBIA ERRSUM <19>      = Error on CPU Command/Address (Note 3)
SBIA ERRSUM <18>      = State Machine Parity Error (Note 2)
SBIA SBIERR <12>      = SBI Timeout (Note 2)
SBIA SBIERR <11:10>   = CP Timeout Status <1:0> (timeout type)
SBIA SBIERR <08>      = CP SBI Error Confirmation (Note 2)
SBIA TOADR            = CPU (Longword) ABus Address

Note

1. Locks CPU error status for all CPU errors except ERRSUM <18>.

2. Different errors. For further information see the individual error write-ups.

3. Sets for errors detected on the CPU's Command/Length/Address, not for errors on the Data/Mask/Status.

4. MEAR, and MSTAT2 <20:16> (PAMM CODE) are not valid for these errors. Also, if the Machine Check has not been processed by the VMS Machine Check Handler then MSTAT1 <17:16> (Selected Adapter) is not valid.

EHM ACTION: The EHM rolls back the Instructions, builds a Machine Check Stack Frame, and vectors to the VMS Machine Check Handler via SCBB+4. See EHM flows for more detail.

EHM VECTOR: 6

VMS ACTION: VMS builds a full Machine Check Report, puts it in a buffer, queues the buffer to be appended to the System Event File (ERRLOG.SYS), and increments a counter: if two occur within 20 milliseconds or the retry fails (same PC), it takes a System Fatal Bugcheck.

EHM VECTOR: 8 (MBox FE)

VMS attempts retry for the first error only if certain conditions are met; otherwise it returns an error to the current process, which, if the processor mode is Kernel or Exec, will result in a System Fatal Bugcheck. For further information see the individual error write-ups.

M-05 MBox CPR (CCC) Parity Error

OVERVIEW: When the MBox accepts a CPU request (MCF), a modified form of the MCF is used to address the Cycle Parameter RAMS. The resulting eighteen functional CPR bits (there are also two parity bits) then help direct the resulting MBox operation. The CPR's are not used for [...]. The parity bits are pre-computed and loaded into RAMs along with the CPR data bits by the Console (via the SDB and Microdata MCA's). MCCC U CPR PAR A H provides odd parity for ten functional bits and MCCC U CPR PAR B H provides odd parity for the other eight.

When a CPR Parity Error occurs, the operation that takes place in the MBox is unpredictable. An error therefore results in an MBox Fatal Error microtrap in the EBox through vector 8. Under extremely rare conditions, an error may be reported by MBox IPL 1D interrupt or Error Address Full Trap.
If so then the EHM will set EHMSTS <17>, Process Abort, to serve as an Abort Flag for VMS.

ERROR SIGNATURE:

MSTAT1 <23>  = CPR PE B
MSTAT1 <22>  = CPR PE A
EBCS   <15>  = MBox Fatal Error (usually)
EBCS   <14>  = MBox Interrupt (always)

PROBABLE CAUSE (CPR B Parity Error):

Module            Probability   RAMS
L0220/L0230/MCC   High          E118 (4 inputs to parity tree)
                                E139 (2 inputs)
                                E129 (1 input)
                                E126 (1 input)
                                E116 (1 input)

PROBABLE CAUSE (CPR A Parity Error):

Module            Probability   RAMS
L0220/L0230/MCC   High          E129 (3 inputs)
                                E126 (3 inputs)
                                E116 (3 inputs)
                                E139 (2 inputs)

EHM ACTION: EHM rolls back the Instructions (except when the error is reported via Error Address Full Trap), builds a Machine Check Stack Frame, and vectors to the VMS Machine Check Handler via SCBB+4.

EHM VECTOR:
4 (If detected while handling an ERF Trap Request)
6 (If detected while handling an MBox Interrupt)
8 (If detected while handling an MBox FE Trap Request)

VMS ACTION: VMS builds a full Machine Check Report, puts it in a buffer, queues the buffer to be appended to the System Event File (ERRLOG.SYS), and takes a Fatal Bugcheck.

M-06 MBox Detected CPU Write Parity Errors

OVERVIEW: Result data destined for storage in the GPRs, Memory, or I/O Space are driven onto the WBus. After latching the result data from the WBus, the EBox generates byte parity, which is also driven onto the WBus. The WBus parity bits are sent directly to the MBox MCD Module, while the WBus data must pass through IDP, the DBUS, IBD, and over the MDBus before reaching MCD where a parity check occurs. All four bytes are parity checked regardless of the context sent via ICB (for OP Port Writes) and EBC (for EBox Writes). The context indicates to the MBox which bytes are valid write data.
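The per-byte parity check on a longword can be sketched as below. The parity sense on the WBus is not stated in this section, so odd parity is an assumption here (selectable through the `odd` argument), and the function names are hypothetical. The result is a four bit mask with one bit per byte, in the spirit of the "Byte(s) in Error" field.

```python
def byte_parity_bits(longword: int, odd: bool = True) -> int:
    """Return a 4-bit value; bit n is the parity bit for byte n."""
    out = 0
    for n in range(4):
        byte = (longword >> (8 * n)) & 0xFF
        ones = bin(byte).count("1")
        # Odd parity: the parity bit makes the total number of ones odd.
        parity = (ones & 1) ^ 1 if odd else ones & 1
        out |= parity << n
    return out


def bytes_in_error(longword: int, received_parity: int, odd: bool = True) -> int:
    """Compare regenerated parity with received parity for all four bytes."""
    return byte_parity_bits(longword, odd) ^ (received_parity & 0xF)


data = 0x12345678                   # arbitrary example longword
parity = byte_parity_bits(data)     # parity generated at the source
clean = bytes_in_error(data, parity)
damaged = bytes_in_error(data ^ 0x00000100, parity)  # one bit flipped in byte 1
```

All four bytes are checked regardless of which ones the write context marks valid, matching the behavior described above.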
If the IBox rotates the result data driven onto the MDBus, then the MCD Module (which receives a copy of the rotation count in ICB WRT ROT <1:0> H) rotates the parity bits (EBE WBUS OPAR <B3:B0> L) to properly realign them with their bytes.

If the EBox is doing an MBox Register write then bad parity results in an MBox Fatal Error microtrap in the EBox. Otherwise, bad parity results in an IPL 1D interrupt.

If a Cache Write is performed, the parity that is sent by the CPU is written, as is (good or bad), in the cache parity RAMs. Bytes not written during a byte write retain their original parity. The ECC character that is stored with data that fails the byte parity check (including when cache is off) will be generated to indicate Bad Data. If the error occurs on a byte not written, this ECC will never be accessed because the MBox must first detect a Cache parity error before it will attempt Cache correction.

NOTE
The EBox checks the parity of all data that it transmits on the WBus. A mismatch (WBus PE) will result in a process abort with the process abort code equal to 2 (EBCS <19:16> = 2).

ERROR SIGNATURE:

MSTAT1 <07:04> = Byte(s) in Error
EBCS   <14>    = MBox Interrupt
EBCS   <15>    = MBox Fatal Error
MSTAT1 <29:26> = Cycle Type (Note 1)
MEAR   <29:02> = Error Address (physical) (Note 2)

NOTE
1. If MSTAT1 <29:26> equals "2" (MBox Register Write Cycle) it is a fatal error. If MSTAT1 <29:26> equals "D" (CPU Write Cycle) then only an interrupt is requested.
2. If MSTAT1 <29:26> equals "D", then this error may generate a Bad Data Flag error at this address. If MSTAT1 <29:26> equals "2" then MEAR contains the address of the Register written.

PROBABLE CAUSE:

Module       Probability   Components
L0208/IBD    Low           DBus
L0206/IDP    Low           MDBus
L0204/MCD    Low           Parity Path to MBox

EHM ACTION: Standard (See Introduction)

EHM VECTOR: 8 (MBox FE)
            6 (MBox Interrupt)

VMS ACTION: Standard (See Introduction). In addition, if the error involved an MBox Register Write, VMS will execute a System Fatal Bugcheck.

M-07 MBox Detected ABus Parity Errors

OVERVIEW: Odd parity protects both the ABus Address/Data lines and the ABus Control lines. During Command/Address cycles, the Address/Data lines transmit the Address while the Control lines transmit the Command/Length. During Data cycles, the Address/Data lines transmit Data while the Control lines transmit the Mask/Status.

The MBox parity checks all ABus Control bits in the ABS MCA on the MCC module (MCC4). It parity checks the Address on the MAP module in the ADB MCA and the ADA MCA's (MAP1-2). It parity checks Data on the MCD module: byte parity generated on the Data latched in the Data Path MCA's MCD1-3 is collapsed into longword parity and compared against the longword parity latched from the ABus (MCD3).

For parity errors on a DMA Command/Length or Address, the MBox aborts the operation, returns MCC ABus DMA ERROR H to the requesting adapter, and requests an IPL 1D interrupt. For further information, see SBIA Detected DMA Errors.

For parity errors on DMA Write Data or Mask/Status, the write completes as if there were no error but an MBox Fatal Error microtrap is triggered in the EBox. On rare occasions, if a Control Parity Error is detected when the first longword is being processed for a DMA Write, (only) an MBox interrupt will be requested.

For parity errors on CPU read Data or Mask/Status, the Data is passed on unaltered to the CPU but an MBox Fatal Error microtrap is triggered in the EBox.

ERROR SIGNATURE:

MSTAT1 <21>    = ABus Data PE
MSTAT1 <20>    = ABus Control PE
MSTAT1 <19>    = ABus Address PE
MSTAT1 <18>    = ABus C/A Cycle (Note 1)
EBCS   <15>    = MBox Fatal Error (Note 2)
EBCS   <14>    = MBox Interrupt
MSTAT1 <17:16> = Selected Adapter
MSTAT1 <31:30> = CPU Port
MSTAT1 <29:26> = Cycle Type (Note 3)
MEDR   <31:00> = ABus Data

Note

1.
If set, the error happened on the Command/Length or Address if MSTAT1 <20> is set, then the Command/Length not (i.e. Mask/Status was bad). 2. 3. If set, then MSTAT1l <18> is reset. If Cycle Type equals "E" (CP READ CYCLE) then MSTAT1 <31:30> indicate the port. If Cycle Type equals "8" (ABus CYCLE) or "g" (ABus Array Write) then the error was during DMA and the CPU PORT is not relevant. MACHINE 4. CHECK If MSTAT1 <20:16> PROBABLE CAUSE STACK <29:26> FRAME equals (PAMM Code) (If Data are Parity Module Probability L0202/SBS High L0203/SBA L0204/MCD ABus/Terminator High PROBABLE CAUSE (If Control | (If Parity will execute MSTAT2 Low Address Parity Error): High L0203/SBA L0205/MAP ABus/Terminator VMS and High High High L.0202/SBS VECTOR: MEAR Error): Probability EHM ACTION: both Error): Module EHM then valid. Probability L0202/SBS L0203/SBA L0220/L0230/MCC ..~ ABus/Terminator CAUSE "E", not High Low Module PROBABLE ANALYSIS High High Low Standard (See Introduction) 8 (MBox FE) 6 (MBox Interrupt) ACTION: Standard Fatal Bugcheck. (See Introduction). 2-32 VMS System ) D MANUAL MANUAL MACHINE CHECK STACK FRAME ANALYSIS M-08 MBox Array ECC Errors OVERVIEW: The ECC MCA hanging off the Array Bus on the MCD Module generates a seven bit ECC character that is stored with each longword during either an Array or Cache Write. The ECC MCA also produces a six bit syndrome as well as other error status and signals during array reads. Latched status is later gathered by the EHM into the register known as MDECC. Error signals sent onto the ERR MCA on MCC generate MCCM MBox INTR H. The ECC character generation includes an address parity bit which is calculated on PA <28:04> of the address where the longword is to be stored. The ECC MCA handles the Address Parity Bit as if it were a thirty third data bit. When the longword is later read, a parity bit generated across PA <28:04> from where the longword was fetched |is XOR'ed in for the cancel. 
A mismatch results in the unique "single bit" syndrome which says Address Parity Error. (This error can occur when data is read from the wrong location and later accessed.)

During ECC generation the Bad Data Flag is also handled in a similar manner (i.e., the ECC MCA treats the Bad Data Flag as if it were a thirty-fourth data bit). The Bad Data Flag is set under the following conditions:

o A CPU WRITE with bad parity occurs (CPU writes always go to the Cache unless the Cache is turned off).

o An attempt at Cache Correction fails; the data is either re-cached or, if the operation was a Cache writeback, written to the Array.

o A DMA Masked Write occurs to a Cache or Array location that has an uncorrectable ECC error.

o A CP Byte Write to a Cache (or Array location if the Cache is turned off) which has an uncorrectable ECC Error. (A CP Byte Write only updates specified bytes in a longword.)

Bad Data status is assumed zero when a longword is read and a mismatch results in the unique single-bit syndrome which says Bad Data.

To protect against failures in the control logic which gates data onto the Array Bus, two of the check bits are inverted by the ECC MCA before being driven onto the bus. These bits are reinverted when a longword is checked. This enables the ECC MCA to detect all ones or zero's on the bus and to latch and send ECC Fatal Error status to the ERR MCA.

During Refills, Array longwords are cached in parallel with being sent on to the read requester. Array longwords are cached with their original ECC. If an uncorrectable error is detected during an Array Read the Byte Parity is inverted before it is cached.

For uncorrectable ECC Errors during DMA reads the ERR MCA disables the drivers and thus sends an all zero's longword response (bad parity) to the ABus Adapter. This will result in SBI RDS being sent on to the requesting NEXUS. Also the SBIA will latch an error in the ERRSUM Register <14,10,6,2>.
If EBCS <14> is set, I/O status reporting the SBI RDS should be ignored.

ERROR SIGNATURE:

EBCS   <14>    = MBox Interrupt
MSTAT1 <29:26> = Cycle Type (Note 1)
MSTAT1 <17:16> = Selected Adapter (Note 1)
MSTAT1 <31:30> = CPU Port (Note 1)
MEAR   <28:04> = Octaword in Error
MSTAT1 <25:24> = Longword in Error
MSTAT2 <20:16> = PAMM Code (Note 2)
MSTAT2 <27>    = Array Type Code Valid
MSTAT2 <27:24> = Array Type Code
MDECC  <22>    = Bad Data Error (Note 3)
MDECC  <21>    = Data Single Bit Error
MDECC  <20>    = Data Double Bit Error
MDECC  <19>    = Data Address Parity Error
MDECC  <14:09> = Syndrome (Note 4)

NOTE
1. If the cycle type equals "F" then the error happened during a DMA reference and MSTAT1 <17:16> is relevant. If the cycle type equals "9" then the error happened on a CPU reference and MSTAT1 <31:30> is relevant.
2. The array slot associated with the error.
3. Look for a previous error at the same MEAR <28:4> address for the cause of this condition.
4. The data bit in error (or bad data or parity error) and the corresponding syndrome is as follows. Each column is read vertically; for example, data bit 31 has syndrome 66 and data bit 00 has syndrome 11.

SYNDROME IN OCTAL  (MSB)  70 0421000 66666655555444443333322222111111
                   (LSB)  07 0000421 65432132654654326543265432654321

DATA BIT           (MSB)  BA CCCCCCC 33222222222211111111110000000000
                   (LSB)  DP 0654321 10987654321098765432109876543210

PROBABLE CAUSE (Address Parity Error):

Module             Probability
L0200/ARRAY        High   (if same MSTAT2 <20:16> repeats)
L0220/L0230/MCC    High   (if random MSTAT2 <20:16>)
L0205/MAP          Medium (if random MSTAT2 <20:16>)
L0204/MCD          Low

PROBABLE CAUSE (Double Bit Error):

Module                 Probability
L0200/ARRAY            High
L0204/MCD              Medium
Array Bus/Terminator   Low
L0220/L0230/MCC        Low

PROBABLE CAUSE (CRD, Corrected Read Data):

Module                 Probability
SMU                    High
Array                  Medium
Array Bus/Terminator   Low
L0220/L0230/MCC        Low
L0204/MCD              Low

NOTE
Array, above, may be the L0200, L0225, L0226, or L0235 modules. An SMU would be applicable for the L0225 or L0235 modules.

EHM ACTION: Standard (See Introduction). SBE errors vector to SCBB+54. EHM will set process abort, EHMSTS <17>, only if the error occurred on the target word (Array Refill) and the error was not detected, handled, and cleared by the Box requesting the data.

VMS ACTION: For DB and BD Errors, VMS builds a full machine check report, puts it in a buffer, queues the buffer to be appended to the system event file (ERRLOG.SYS), and, for an error on an unmodified page, attempts to bring in a fresh copy of the page from disk (remapping it and sending the bad page to the bad page list). If successful, VMS REI's. Otherwise, depending on the processor mode at the time of error, and who owns the bad page, it either aborts the current process or executes a system fatal bugcheck. For parity errors VMS executes a system fatal bugcheck.

For SBE errors, VMS reads MEAR, MDECC, MSTAT1, and MSTAT2 from the EScratch (Machine Check Stack Frame) and stores them in a buffer. Sixteen errors are allowed to accumulate before an SBE error log entry is made. If three SBEs are detected within 10 milliseconds, the SBEs are logged and SBE logging is disabled for 5 minutes.

NOTE
Generally, if an error results in a system crash, VMS MCHK puts the error into a VMS error buffer and takes a crash dump. The VMS error buffers are appended to the dump and processed at the next reboot.

M-09 MBox Cache Data Parity Errors

OVERVIEW: Each Cache data longword is stored with four byte parity bits and a seven bit ECC Character. Cache byte parity is checked in the Data Path MCA's on MCD1-3. The result is sent to the UFO MCA (MCDU) where it is latched as status. The result is also sent to the ERR MCA (MCCM) where it generates an interrupt.

For errors detected during a CP Cache Read, the data and byte parity are sent to the CPU as is. For errors detected during an ABus Cache Read, an all zero's longword and parity bit are sent to the ABus.
In both cases the next operation that the MBox performs will be a Cache Data Correction Cycle. This operation will correct the entire Cache Block, driving all four longwords along with their ECCs from the Check RAMs (each in turn) onto the Array Bus so that the ECC MCA can generate a correcting syndrome. Thus, when the retry occurs the data will have been corrected or else re-cached with inverted byte parity and ECC indicating Bad Data.

For errors detected during a Byte Merge Write, correction is performed before the byte(s) are merged and written.

The MBox stores Cache correction status in MDECC. MDECC is also used to store Array correction status.

NOTE
A zero syndrome can be due to a fault in a byte parity bit or due to a transient, since correction involves re-reading the longword(s) for a second time. For further information, see Array Errors.

Bad parity sent on to the CPU should result in an EBox microtrap due to an IBox Error (EBCS <13>) or an EDP PE (EBCS <9>). If such a trap does not occur the EHM will set EHMSTS <17> (Process Abort).

Bad parity sent to the ABus will result in SBI RDS being sent on to the requesting NEXUS. Also the SBIA will latch an error in the ERRSUM Register <14,10,6,2>. If EBCS <14> is set, I/O status reporting the RDS should be ignored.

ERROR SIGNATURE:

MSTAT1 <03>    = SBI Cache Read Data Parity Error (Note 1)
MSTAT1 <00>    = CP Byte Write Cache Data Parity Error (Note 1)
EBCS   <14>    = MBox Interrupt
MSTAT1 <29:26> = Cycle Type
MSTAT1 <31:30> = CPU Port (Note 2)
MSTAT1 <17:16> = Adapter Select (Note 2)
MSTAT1 <02>    = Selected Cache
MEAR   <28:04> = Octaword in Error
MSTAT1 <25:24> = Longword in Error
MDECC  <22>    = Bad Data Error (Note 3)
MDECC  <21>    = Data Single Bit Error
MDECC  <20>    = Data Double Bit Error
MDECC  <19>    = Data Address Parity Error (Note 4)
MDECC  <14:09> = Syndrome

Note
1. Two different errors; the second only sets for errors during CP Byte Write Operations, not DMA Masked Writes.
2.
If the Cycle Type equals "8" (ABus Cycle) then the error occurred on a DMA reference and MSTAT1 <17:16> are relevant. If the Cycle Type equals "E" (CPU Read Cycle), or "D" (CPU Write Cycle), or "3" (Write Back Cycle), then the error occurred on a CPU reference and MSTAT1 <31:30> are relevant.
3. Look for a previous error at the same MEAR <28:4> which may have caused the Bad Data Flag to be set.
4. Should only occur in conjunction with another error; either an error in a Data/Parity RAM or a cached Array Address Parity Error.

PROBABLE CAUSE (If MSTAT1 <02> (Cache Sel) = 0 and MEAR <12> = 0):

Module       Probability   Components
L0204/MCD    High          RAMs (See Table 1)

PROBABLE CAUSE (If MSTAT1 <02> (Cache Sel) = 0 and MEAR <12> = 1):

Module       Probability   Components
L0204/MCD    High          RAMs (See Table 2)

PROBABLE CAUSE (If MSTAT1 <02> (Cache Sel) = 1 and MEAR <12> = 0):

Module       Probability   Components
L0204/MCD    High          RAMs (See Table 3)

PROBABLE CAUSE (If MSTAT1 <02> (Cache Sel) = 1 and MEAR <12> = 1):

Module       Probability   Components
L0204/MCD    High          RAMs (See Table 4)

Table 1 RAM Call out (MSTAT1 <02> = 0 and MEAR <12> = 0)

Syndrome  Bit   Signal Name          RAM
0         BP 0  MCDH GRP 0 BP 0      E52
0         BP 1  MCDH GRP 0 BP 1      E52
0         BP 2  MCDH GRP 0 BP 2      E52
0         BP 3  MCDH GRP 0 BP 3      E52
11        00    MCDF GRP 0 DATA 00   E11
12        01    MCDF GRP 0 DATA 01   E11
13        02    MCDF GRP 0 DATA 02   E11
14        03    MCDF GRP 0 DATA 03   E24
15        04    MCDE GRP 0 DATA 04   E24
16        05    MCDE GRP 0 DATA 05   E24
22        06    MCDE GRP 0 DATA 06   E32
23        07    MCDE GRP 0 DATA 07   E42
24        08    MCDD GRP 0 DATA 08   E11
25        09    MCDD GRP 0 DATA 09   E24
26        10    MCDD GRP 0 DATA 10   E32
32        11    MCDD GRP 0 DATA 11   E32
33        12    MCDC GRP 0 DATA 12   E32
34        13    MCDC GRP 0 DATA 13   E42
35        14    MCDC GRP 0 DATA 14   E42
36        15    MCDC GRP 0 DATA 15   E42
42        16    MCDB GRP 0 DATA 16   E113
43        17    MCDB GRP 0 DATA 17   E113
44        18    MCDB GRP 0 DATA 18   E113
45        19    MCDB GRP 0 DATA 19   E126
46        20    MCDA GRP 0 DATA 20   E126
54        21    MCDA GRP 0 DATA 21   E126
55        22    MCDA GRP 0 DATA 22   E138
56        23    MCDA GRP 0 DATA 23   E151
52        24    MCD9 GRP 0 DATA 24   E113
53        25    MCD9 GRP 0 DATA 25   E126
61        26    MCD9 GRP 0 DATA 26   E138
62        27    MCD9 GRP 0 DATA 27   E138
63        28    MCD8 GRP 0 DATA 28   E138
64        29    MCD8 GRP 0 DATA 29   E151
65        30    MCD8 GRP 0 DATA 30   E151
66        31    MCD8 GRP 0 DATA 31   E151

Table 2 RAM Call out (MSTAT1 <02> = 0 and MEAR <12> = 1)

Syndrome  Bit   Signal Name          RAM
0         BP 0  MCDH GRP 0 BP 0      E45
0         BP 1  MCDH GRP 0 BP 1      E45
0         BP 2  MCDH GRP 0 BP 2      E45
0         BP 3  MCDH GRP 0 BP 3      E45
11        00    MCDF GRP 0 DATA 00   E5
12        01    MCDF GRP 0 DATA 01   E5
13        02    MCDF GRP 0 DATA 02   E5
14        03    MCDF GRP 0 DATA 03   E18
15        04    MCDE GRP 0 DATA 04   E18
16        05    MCDE GRP 0 DATA 05   E18
22        06    MCDE GRP 0 DATA 06   E28
23        07    MCDE GRP 0 DATA 07   E36
24        08    MCDD GRP 0 DATA 08   E5
25        09    MCDD GRP 0 DATA 09   E18
26        10    MCDD GRP 0 DATA 10   E28
32        11    MCDD GRP 0 DATA 11   E28
33        12    MCDC GRP 0 DATA 12   E28
34        13    MCDC GRP 0 DATA 13   E36
35        14    MCDC GRP 0 DATA 14   E36
36        15    MCDC GRP 0 DATA 15   E36
42        16    MCDB GRP 0 DATA 16   E107
43        17    MCDB GRP 0 DATA 17   E107
44        18    MCDB GRP 0 DATA 18   E107
45        19    MCDB GRP 0 DATA 19   E120
46        20    MCDA GRP 0 DATA 20   E120
54        21    MCDA GRP 0 DATA 21   E120
55        22    MCDA GRP 0 DATA 22   E132
56        23    MCDA GRP 0 DATA 23   E144
52        24    MCD9 GRP 0 DATA 24   E107
53        25    MCD9 GRP 0 DATA 25   E120
61        26    MCD9 GRP 0 DATA 26   E132
62        27    MCD9 GRP 0 DATA 27   E132
63        28    MCD8 GRP 0 DATA 28   E132
64        29    MCD8 GRP 0 DATA 29   E144
65        30    MCD8 GRP 0 DATA 30   E144
66        31    MCD8 GRP 0 DATA 31   E144

Table 3 RAM Call out (MSTAT1 <02> = 1 and MEAR <12> = 0)

Syndrome  Bit   RAM
0         BP 0  E53
0         BP 1  E53
0         BP 2  E53
0         BP 3  E53
11        00    E12
12        01    E12
13        02    E12
14        03    E25
15        04    E25
16        05    E25
22        06    E33
23        07    E43
24        08    E12
25        09    E25
26        10    E33
32        11    E33
33        12    E33
34        13    E43
35        14    E43
36        15    E43
42        16    E114
43        17    E114
44        18    E114
45        19    E127
46        20    E127
54        21    E127
55        22    E139
56        23    E152
52        24    E114
53        25    E127
61        26    E139
62        27    E139
63        28    E139
64        29    E152
65        30    E152
66        31    E152

Table 4 RAM Call out (MSTAT1 <02> = 1 and MEAR <12> = 1)

Syndrome  Bit   RAM
0         BP 0  E46
0         BP 1  E46
0         BP 2  E46
0         BP 3  E46
11        00    E6
12        01    E6
13        02    E6
14        03    E19
15        04    E19
16        05    E19
22        06    E29
23        07    E37
24        08    E6
25        09    E19
26        10    E29
32        11    E29
33        12    E29
34        13    E37
35        14    E37
36        15    E37
42        16    E108
43        17    E108
44        18    E108
45        19    E121
46        20    E121
54        21    E121
55        22    E133
56        23    E145
52        24    E108
53        25    E121
61        26    E133
62        27    E133
63        28    E133
64        29    E145
65        30    E145
66        31    E145

EHM ACTION: The EHM will set Process Abort, EHMSTS <17>, if a non Single Bit Error occurred and the error was not detected, handled, and cleared by the Box requesting the data.

VMS ACTION: Standard (See Introduction). For Address Parity Errors VMS executes a System Fatal Bugcheck. For Bad Data Errors see Array Errors.

M-10 MBox Cache W Bit Parity Error

OVERVIEW: Because the processor uses two Writeback Data Caches (i.e. a Cache can be updated without simultaneously updating an Array) it sets a W (Written) Bit in the Cache Tag of the selected Cache to flag when the corresponding Array Octaword is stale. This W Bit is stored with its own odd parity bit. Except that it causes an MBox Interrupt, the MBox treats a W Bit PE exactly as if it were a set W Bit. That is, a Writeback will be performed if the selected Cache Block (Octaword) is about to be overwritten with data belonging at a different physical address, thus ensuring that the most current data are always preserved. (If the W Bit had never actually been set, then the Array Octaword will be overwritten with data identical to that already stored there: the Writeback will be a NOP.)

ERROR SIGNATURE:

MSTAT2 <05>    = W Bit PE
MSTAT2 <01>    = Cache Select
MEAR   <28:02> = Error Address

PROBABLE CAUSE (Cache 0 Select):

Module       Probability   RAMS
L0205/MAP    High          E77

PROBABLE CAUSE (Cache 1 Select):

Module       Probability   RAMS
L0205/MAP    High          E81

EHM ACTION: The EHM sweeps the faulty Cache Block (this is done to clear the error because the MBox may not have performed a Writeback followed by the Cache Invalidate), rolls back the instructions, builds a stack frame, and vectors to the Machine Check Handler via SCBB+4.

VMS ACTION: Standard (See Introduction)

M-11 MBox Detected ABus Bad Data Code

OVERVIEW: A two bit ABus field, ABus LEN STAT <1:0> H, is used to indicate Length during Command/Address cycles and Status during Mask/Data cycles. The encodings used for STATUS are:

00 Good Data
11 Bad Data

The MBox always transmits Good Status for speed reasons, although if an SBIA detects the Bad Data code accompanying DMA Read Data from the MBox it will convert the Bad Data code into an SBI RDS to accompany that data back to the SBI Nexus.

For this error we must consider the reverse direction. That is, when an Adapter sends CP Read Data (not DMA Write Data) and Bad Data Status to the MBox. At present this occurs only when an SBIA converts an SBI RDS accompanying CP Read Data into the ABus Bad Data code to continue to accompany that data to the MBox. It ought to be kept in mind that a DW780 can produce an RDS for a read access to any Unibus device capable of asserting the Unibus PB Line (this includes devices with parity protected registers).

A Bad Data code status bit is latched in the ABS MCA (which will actually latch the bit for any non zero Status value). Logic external to the MCA on the MCC module also detects the error and reports it to the ERR MCA which will request an MBox IPL 1D interrupt.
It will also disable the MDBus drivers and thus provide all zero's Read Data (bad byte parity) to the requesting CPU port. As a result, this error may be reported by an EBox microtrap due to an IBox Error (EBCS <13>) or an EBox EDP PE (EBCS <09>); otherwise the error will be reported by MBox IPL 1D interrupt.

ERROR SIGNATURE:

MSTAT2 <14>    = ABus Bad Data Code
EBCS   <14>    = MBox Interrupt
EBCS   <01>    = IO Read (Note 1)
EHMSTS <17>    = Process Abort (Note 2)
MSTAT1 <29:26> = "E", CP Read Cycle
MSTAT1 <31:30> = CPU Port
MSTAT2 <17:16> = Selected Adapter
MEDR           = ABus Longword (the Read Data)

NOTE
1. Can only set for RDS on reads to I/O space with PA <29> equal to a 1. PA <29> equals a 0 for reads to SBI memory.
2. If EHMSTS <17> equals a 1, EBCS <01> may not have been captured by the error. See EHM ACTION.
3. MEAR and MSTAT2 <20:16> (PAMM Code) are not valid for this error.

PROBABLE CAUSE:

Module             Probability
SBI NEXUS          High
L0202/SBS          Low
L0203/SBA          Low
L0220/L0230/MCC    Low

EHM ACTION: If the error is reported via MBox interrupt or Error Address Full Trap, EHM sets EHMSTS <17> (Process Abort) as an Abort Flag for VMS, informing it that, because the usual EBox microtrap did not occur to prevent consuming the all zero's data, the current instruction probably cannot be retried. Regardless of reporting method, EHM rolls back the instructions, builds a stack frame, and vectors to the Machine Check Handler via SCBB+4.

VMS ACTION: VMS builds a full Machine Check Report, puts it in a buffer, queues the buffer to be appended to the System Event File (ERRLOG.SYS), and increments a counter: if three errors occur within 100 milliseconds, VMS executes a System Fatal Bugcheck. Else, if EHMSTS <17> or EBCS <01> is set, VMS aborts the current process; otherwise it REI's. Note that if the processor mode is Kernel or Exec, which is most likely (unless external memory is in use), Process Abort becomes System Fatal Bugcheck.
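The order of the VMS checks described for this error can be summarized in a short sketch. It is only an illustration of the decision sequence stated in the VMS ACTION text, not VMS code: the function name, the timestamp list, and the mode strings are hypothetical, and treating "within 100 milliseconds" as an inclusive window ending at the current error is an assumption.

```python
from enum import Enum


class Outcome(Enum):
    REI = "REI"
    PROCESS_ABORT = "process abort"
    FATAL_BUGCHECK = "system fatal bugcheck"


EHMSTS_PROCESS_ABORT = 1 << 17  # EHMSTS <17>
EBCS_IO_READ = 1 << 1           # EBCS <01>


def vms_action_bad_data_code(error_times_ms, ehmsts, ebcs, mode):
    """Model the M-11 VMS decision order (hypothetical helper).

    error_times_ms: times of every occurrence so far, current error last.
    mode: "kernel", "exec", "super" or "user" (illustrative strings).
    """
    now = error_times_ms[-1]
    # Assumption: the 100 ms window is inclusive and ends at the current error.
    recent = [t for t in error_times_ms if now - t <= 100]
    if len(recent) >= 3:
        return Outcome.FATAL_BUGCHECK
    if (ehmsts & EHMSTS_PROCESS_ABORT) or (ebcs & EBCS_IO_READ):
        # In Kernel or Exec mode a Process Abort becomes a System Fatal Bugcheck.
        if mode in ("kernel", "exec"):
            return Outcome.FATAL_BUGCHECK
        return Outcome.PROCESS_ABORT
    return Outcome.REI
```

For example, a single error in user mode with neither EHMSTS <17> nor EBCS <01> set returns `Outcome.REI`, while the same error with EHMSTS <17> set in kernel mode returns `Outcome.FATAL_BUGCHECK`.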
E-00 EBox WBus Parity Error

OVERVIEW: This error occurs when the EBox Data Path (EDP) drives data onto the WBus, and the WReg longword parity generated on EDP does not match the WBus parity generated on EBE. When EDP is driving the WBus, the byte parity from EBE is sent to EDP where it is checked against the WReg longword parity.

If a WBus Parity Error occurs all copies of the GPR/SPs will have been corrupted with bad data. This will cause the EBox Error Handling Microcode to loop at location UPC 24, and the Console will detect a Keep Alive Fail Condition.

The EBox WBus Parity Check is done on the EDP module, and the error signal WBUS ERR is sent to the EBD module where it's latched in the EBCS Register as WBus Parity Error.

NOTE
Because this type of an error will result in a Keep Alive Fail Condition it will never cause the EHM to generate a Machine Check Stack Frame.

ERROR SIGNATURE:

EBCS <08> = WBus Parity Error

PROBABLE CAUSE:

Module       Probability   Components
L0209/EDP    High          (See Table 1)
L0219/EBE    Medium        (See Table 2)
L0206/IDP    Medium        WBus Driver/Receiver
L0212/FBA    Medium        WBus Driver/Receiver
L0211/EBD    Very Low      EBCS Register

Table 1 EDP Component Callout

Byte  Bits      ALU          PDP   Misc
3     <31:24>   E102, E101   E3    E13
2     <23:16>   E87, E86     E3    E13
1     <15:08>   E70, E69     E3    E81
0     <07:00>   E53, E52     E3    E81

Table 2 EBE Component Callout

Byte  WBus Latches  Parity Generators
3 2 1 0 E169, E169, E169, E155, E155, E43, E43, E127, E43, E127, E29, E127, E29 E29, E155, E1 E85, E1 E1 E157, E59 E115, E59 E45, E17, E31 E31

EHM ACTION: The EBox microtraps through vector 8, and loops at UPC 24. The Console detects a Keep Alive Fail Condition, builds a Snap File, sends a message to the console terminal indicating WBus Parity Error, and re-boots the CPU. No Machine Check stack frame is generated.
VMS ACTION: After VMS is re-booted the Snap File is transferred to the VMS side of the system, renamed to ERRSNAP.LOG;n, and written in the SYS$SYSROOT:[SYSERR] directory. In addition, a Processor Error Halt (Entry Type 16) is appended to the System Event File (ERRLOG.SYS).

E-01 EBox Result Parity Error (EDP Misc Error)

OVERVIEW: This error occurs when ALU result data is passed through the VMQ MUX, but NOT loaded into the VMQ Register, and the byte parity generated at the output of the VMQ MUX does not match the byte parity of the Condition Code ALU.

NOTE
Two EBox Result Errors are very similar: VMQ and MISC error. The difference is that the VMQ is loaded during a VMQ error, while the VMQ is not loaded during a MISC error.

The EBox Result Check is done on the EDP module PDP MCA, and the error signal RESULT PAR ERR is sent to the EBD module where it is latched in the EBCS Register as EBox Data Path PE. The EBox Abort flag is raised and latched into the EBCS Register on the EBD module. In the EDPSR Register, Result Parity Error and EDP Misc error bits are generated and latched on the EDP module PDP MCA.

ERROR SIGNATURE:

EBCS  <09> = EBox Data Path Parity Error
EBCS  <04> = EBox Abort Flag (Note 1)
EDPSR <05> = Result Parity Error (Note 2)
EDPSR <08> = EDP Misc Error

Note
1. The EBox Abort Flag will set if IRD LST CYC (uBEN field) is set when the error occurs. This flag indicates the error was detected too late to inhibit the PC from being updated and thus prevents an instruction retry.
2. EDPSR <15:12> VMQ Byte in Error has no meaning for this error.

PROBABLE CAUSE:

Module       Probability   Components
L0209/EDP    High          (See Table 1)
L0211/EBD    Very Low      EBCS Register

Table 1 EDP Component Callout

Byte  Bits      ALU          PDP   Misc
3     <31:24>   E102, E101   E3    E13
2     <23:16>   E87, E86     E3    E13
1     <15:08>   E70, E69     E3    E81
0     <07:00>   E53, E52     E3    E81

EHM ACTION: Standard (See Introduction)
EHM VECTOR:
CSL ACTION: None
VMS ACTION: Standard (See Introduction)

E-02 EBox Result Parity Error (VMQ)

OVERVIEW: This error occurs when the VMQ register is loaded and the result data and parity generated at the output of the VMQMUX does not match the data and byte parity generated at the output of the Condition Code ALU.

NOTE
Two EBox Result Errors are very similar: VMQ and MISC error. The difference is that the VMQ is loaded during a VMQ error, while the VMQ is not loaded during a MISC error.

The EBox Result Check is done on the EDP module PDP MCA, and the error signal RESULT PAR ERR is sent to the EBD module where it is latched in the EBCS Register as EBox Data Path PE. The EBox Abort flag is raised and latched into the EBCS Register on the EBD module. In the EDPSR Register, Result Parity Error and EDP Misc error bits are generated and latched on the EDP module PDP MCA.

ERROR SIGNATURE:

EBCS  <9>     = EBox Data Path PE
EBCS  <4>     = EBox Abort Flag (Note 1)
EDPSR <5>     = Result Check
EDPSR <15:12> = VMQ Byte in Error

Note
1. The EBox Abort Flag will set if IRD LST CYC (uBEN field) is set when the error occurs. This flag indicates the error was detected too late to inhibit the PC from being updated and thus prevents an instruction retry.
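When reading the E-02 signature by hand it can help to pull the listed bits out of the two registers mechanically. The sketch below does only that. The function name and the returned keys are hypothetical, and the interpretation of EDPSR <15:12> as one flag per byte with bit 12 corresponding to byte 0 is an assumption that the text above does not state.

```python
EBCS_DATA_PATH_PE = 1 << 9   # EBCS <9>  EBox Data Path PE
EBCS_ABORT_FLAG = 1 << 4     # EBCS <4>  EBox Abort Flag
EDPSR_RESULT_CHECK = 1 << 5  # EDPSR <5> Result Check


def decode_vmq_result_error(ebcs, edpsr):
    """Extract the E-02 signature fields from raw EBCS and EDPSR values."""
    byte_field = (edpsr >> 12) & 0xF  # EDPSR <15:12> VMQ Byte in Error
    return {
        "ebox_data_path_pe": bool(ebcs & EBCS_DATA_PATH_PE),
        # Per Note 1, a set Abort Flag means the instruction cannot be retried.
        "retry_prevented": bool(ebcs & EBCS_ABORT_FLAG),
        "result_check": bool(edpsr & EDPSR_RESULT_CHECK),
        "vmq_byte_field": byte_field,
        # Assumption: one flag per byte, EDPSR <12> = byte 0.
        "vmq_bytes_in_error": [b for b in range(4) if byte_field & (1 << b)],
    }
```

The raw four-bit field is returned alongside the per-byte list so that the value can still be read directly if the per-byte assumption does not hold.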
PROBABLE CAUSE:

Module       Probability   Components
L0209/EDP    High          (See Table 1)
L0211/EBD    Very Low      EBCS Register

Table 1 EDP Component Callout

Byte  Bits      ALU          PDP   Misc
3     <31:24>   E102, E101   E3    E13
2     <23:16>   E87, E86     E3    E13
1     <15:08>   E70, E69     E3    E81
0     <07:00>   E53, E52     E3    E81

EHM ACTION: Standard (See Introduction)
EHM VECTOR: 8
CSL ACTION: None
VMS ACTION: Standard (See Introduction)

E-03 EBox Result Parity Errors (WReg)

OVERVIEW: This error occurs when the input parity to the WReg Mux does not match the output WReg Parity. There are three sources for WReg Mux input data and parity. There is only one error signature for all three error types. The following is a description of each operation:

1. WReg Shift Operation - This error occurs during an EBox Shifter operation. The Shifter receives 64 bits of input data, one longword from the AMux and one longword from the BMux. However, only 32 bits are passed on to the WReg with longword parity. The parity check is done between the parity of the eight bytes of input data, and the longword parity of the WReg, plus the parity of the unused data in the Shifter.

2. Formatter Operation - This error occurs during an EBox Format operation. The input data to the formatter is the BMux. A check is made between the byte parity at the BMux output and the WReg longword parity. The only format operations that are not parity checked are F FORMAT PACK+ and F FORMAT PACK-.

3. WReg Post ALU Shift Operation - This error occurs when the WReg Mux is passing ALU result data to the WReg. The byte parity generated at the output of the Condition Code ALU is compared against the WReg longword parity. If the WReg Mux performs a shift on the data, the bits shifted in or out are considered in the check.

The EBox Result Check is done on the EDP module PDP MCA, and the error signal RESULT PAR ERR is sent to the EBD module where it is latched in the EBCS Register as EBox Data Path PE. The EBox Abort flag is raised and latched into the EBCS Register on the EBD module. In the EDPSR Register, Result Parity Error and EDP Misc error bits are generated and latched on the EDP module PDP MCA.

ERROR SIGNATURE:

EBCS  <09> = EBox Data Path Parity Error
EBCS  <04> = EBox Abort Flag (Note 1)
EDPSR <05> = Result Parity Error
EDPSR <11> = WReg Parity Error

Note
1. The EBox Abort Flag will set if IRD LST CYC (uBEN field) is set when the error occurs. This flag indicates the error was detected too late to inhibit the PC from being updated and thus prevents an instruction retry.

PROBABLE CAUSE:

Module       Probability   Components
L0209/EDP    High          (See Table 1)
L0211/EBD    Very Low      EBCS Register

Table 1 EDP Component Callout

Byte  Bits      ALU          SHF                  PDP   Misc
3     <31:24>   E102, E101   E104, E88, E31, E5   E3    E13
2     <23:16>   E87, E86     E104, E88, E31, E5   E3    E13
1     <15:08>   E70, E69     E104, E88, E31, E5   E3    E81
0     <07:00>   E53, E52     E104, E88, E31, E5   E3    E81

EHM ACTION: Standard (See Introduction)
EHM VECTOR: 8
CSL ACTION: None
VMS ACTION: Standard (See Introduction)

E-04 EBox Result Parity Error (VMQ Shift Operation)

OVERVIEW: This error occurs when the data and parity stored in VMQSAV does not match the new parity generated at the output of the VMQMUX. The VMQSAV parity is generated by the VMQMUX on a previous cycle. Data from the VMQSAV is routed back into the VMQMUX for a shift operation (left 1 or right 2).

The EBox Result Check is done on the EDP module PDP MCA, and the error signal RESULT PAR ERR is sent to the EBD module where it is latched in the EBCS Register as EBox Data Path PE.
The EBox Abort flag is raised and latched into the EBCS Register on the EBD module. In the EDPSR Register, Result Parity Error and EDP Misc error bits are generated and latched on the EDP module PDP MCA.

ERROR SIGNATURE:

EBCS  <09> = EBox Data Path Parity Error
EBCS  <04> = EBox Abort Flag (Note 1)
EDPSR <05> = Result Parity Error (Note 2)

Note
1. The EBox Abort Flag will set if IRD LST CYC (uBEN field) is set when the error occurs. This flag indicates the error was detected too late to inhibit the PC from being updated and thus prevents an instruction retry.
2. EDPSR <15:12> VMQ Byte in Error has no meaning for this error.

PROBABLE CAUSE:

Module       Probability   Components
L0209/EDP    High          (See Table 1)
L0211/EBD    Very Low      EBCS Register

Table 1  EDP Component Callout

Byte   Bits       ALU           PDP   Misc
3      <31:24>    E102, E101    E3    E13
2      <23:16>    E87, E86      E3    E13
1      <15:08>    E70, E69      E3    E81
0      <07:00>    E53, E52      E3    E81

EHM ACTION:   Standard (See Introduction)
EHM VECTOR:   8
CSL ACTION:   None
VMS ACTION:   Standard (See Introduction)

E-05 EBox Operand Parity Error (VMQSAV)

OVERVIEW: This error occurs when the AMux is passing data from the VMQSAV Register to the ALU, and the byte parity stored with VMQSAV does not match the byte parity at the output of the AMux. The parity bits stored with VMQSAV are generated at the output of the VMQ MUX.

NOTE
A parity error in VMQSAV is indicated by a lack of any operand error bits set in EDPSR <7:0>.

The EBox Operand Check is done on the EDP Module PDP MCA, and the error signal OPR PAR ERR is sent to the EBD Module where it is latched in the EBCS Register as EBox Data Path PE. The EDPSR Register bits (Operand PE and AMux Byte in Error) are latched on the EDP Module PDP MCA.
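The NOTE above amounts to a small decision rule, which the following Python sketch illustrates. It is an illustration only, not part of any DEC tool: the function name is hypothetical, the EDPSR value is assumed to have been read from the stack frame as a 32-bit integer, and the bit positions are those listed in the ERROR SIGNATURE entries of the operand error sections of this chapter (<03> Operand PE, <07> B WBus PE, <06> B OPBus, <02> A RAM PE, <01> A WBus PE, <00> B RAM PE).

```python
# Hypothetical helper: name the operand source(s) flagged in EDPSR.

OPERAND_PE = 1 << 3          # EDPSR <03> Operand Parity Error (AMux or BMux)
VMQSAV_MASK = 0b11100111     # EDPSR <07:05> and <02:00>; all zero means VMQSAV

# Operand source bits as listed in the ERROR SIGNATURE entries.
SOURCE_BITS = {
    7: "B WBus",
    6: "B OPBus",
    2: "A RAM",
    1: "A WBus",
    0: "B RAM",
}


def operand_error_sources(edpsr):
    """Return the operand sources flagged in EDPSR.

    Returns None when EDPSR <03> is clear (no operand parity error latched).
    """
    if not edpsr & OPERAND_PE:
        return None
    if edpsr & VMQSAV_MASK == 0:
        # Per the NOTE: no other error bit in <7:0> points at VMQSAV.
        return ["VMQSAV"]
    return [name for bit, name in SOURCE_BITS.items() if edpsr & (1 << bit)]
```

For example, an EDPSR whose low byte is 08 (hex) is classified as a VMQSAV error, while 88 (hex) is reported as a B WBus error.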
ERROR SIGNATURE:

EBCS  <09>                = EBox Data Path Parity Error
EDPSR <03>                = Operand Parity Error (AMux or BMux)
EDPSR <07:05> and <02:00> = "0" Indicates VMQSAV was Operand Source
EDPSR <27:24>             = AMux Byte in Error

PROBABLE CAUSE:

Module       Probability   Components
L0209/EDP    High          (See Table 1)
L0211/EBD    Very Low      EBCS Register

Table 1  EDP Component Callout

Byte   Bits       ALU           PDP   Misc
3      <31:24>    E102, E101    E3    E13
2      <23:16>    E87, E86      E3    E13
1      <15:08>    E70, E69      E3    E81
0      <07:00>    E53, E52      E3    E81

EHM ACTION:   Standard (See Introduction)
EHM VECTOR:   8
CSL ACTION:   None
VMS ACTION:   Standard (See Introduction)

E-06 EBox Operand Error (B WBus)

OVERVIEW: This error occurs when the BMux is passing WBus data to the ALU (WBus Match), and the WBus byte parity to the BMux does not match the parity bits generated at the output of the BMux. WBus byte parity is generated by the EBox on the EBE Module. The BMux parity is generated and checked on the EDP Module.

WBus Match occurs in cases where GPR/SP data is read as operand data, and the previous cycle issued a write to the same location. When the WBus Match occurs, a bypass function takes place where the old (stale) data is not read from the SP RAMs; instead, data is selected from the WBus.

When a WBus Match occurs, some data may come from the WBus and some from the GPR/SP. It depends on the context of the data being written by the previous cycle and the context of the data being read. Therefore, it is possible to have a WBus Error and GPR/SP Error in the same cycle.

The EBox Operand Check is done on the EDP Module PDP MCA, and the error signal OPR PAR ERR is sent to the EBD Module where it is latched in the EBCS Register as EBox Data Path PE. The EDPSR Register bits (Operand PE, WBus PE, and BMux Byte in Error) are latched on the EDP Module PDP MCA.
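The WBus Match bypass described above can be pictured with a short Python sketch. It is not a model of the actual EDP or EBE logic: the function name, the byte-list representation, and the use of per-byte masks to stand in for the "context" of the previous write and of the current read are all assumptions made for illustration.

```python
def select_operand_bytes(ram_bytes, wbus_bytes, written_mask, read_mask):
    """Choose the source of each operand byte during a WBus Match.

    ram_bytes, wbus_bytes: four byte values each, byte 0 first.
    written_mask: bit i set if the previous cycle wrote byte i (assumed).
    read_mask:    bit i set if the current read consumes byte i (assumed).
    Returns a list of (byte_number, source, value) for the bytes read.
    """
    selected = []
    for i in range(4):
        if not (read_mask >> i) & 1:
            continue  # byte is not consumed by this read
        if (written_mask >> i) & 1:
            # Stale RAM byte is bypassed; the fresh value comes off the WBus.
            selected.append((i, "WBus", wbus_bytes[i]))
        else:
            selected.append((i, "RAM", ram_bytes[i]))
    return selected


# A byte write followed by a longword read of the same register:
# byte 0 comes from the WBus, bytes 1-3 from the RAMs.
mixed = select_operand_bytes([0x11, 0x22, 0x33, 0x44],
                             [0xAA, 0x00, 0x00, 0x00],
                             written_mask=0b0001, read_mask=0b1111)
```

Because both sources feed the same operand in this case, a parity fault in either path can be flagged in one cycle, which is consistent with the statement that a WBus Error and a GPR/SP Error may appear together.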
| ERROR SIGNATURE: <09> EBCS EDPSR <03> EDPSR <07> EDPSR <31:28> PROBABLE = EBox Data Path Parity Error = Operand Parity Error (AMux or BMux) = B WBus PE (BMux) BMux Byte in Error CAUSE: Module Probability Components L0209/EDP L0219/EBE L0211/EBD High Medium Very Low (See Table 1) (See Table 2) EBCS Register Table 1 EDP Component Callout Byte Bits ALU SHF 3 2 <31:24> <23:16> <15:08> <07:00> E102, E101 EB86 E87, E104, E104, E104, E104, 1 0 E70, ES3, EG69 ES52 2-57 E88, E88, E88, E88, E31, E31, E31, E31, ES5 ES5 ES ES5 PDP Misc E3 E3 E3 E3 E13 E1l3 ES81 ES1 MANUAL MACHINE CHECK STACK FRAME ANALYSIS Table 2 EBE Component Callout Byte WBus Latches W WO <) AN RS N O OO S A S S AN VN PN Parity RO D G S N R WA NN N E155, EHM ACTION: VECTOR: O A . A N S W El Standard (See Introduction) Standard (See Introduction) EHM CSL A - AN -— - ‘ Omr=NW - ACTION: VMS ACTION: L X _ F & R & K Generators % K B B 2 K _E 8 2 5 5 J MANUAL MACHINE CHECK STACK FRAME ANALYSIS E-07 EBox OPBus Parity Error (EMD Data) This error occurs when the EBox BMux is passing OPBus data OVERVIEW: to the EBox ALUs, and the input OPBus longword parity from the IBox EBox BMux. the does not match the parity generated on the output of OPBus. the feed that There are several sources in the IBox The EMD supplies OPBus data to the EBox when data from memory. the EBox requests the EMD parity is not checked in the IBox, but parity for EMD data is passed from the MBox through the 1IBox to the EBox. The parity for EMD data checked in the EBox is created by collapsing MD Bus odd byte parity into OPBus odd longword parity. to the 1IBox IBD The Data flows from the MBox MCD module MDP MCAs, 1IOP MCAs, and then to the EBox ALU and PDP module IBF MCAs to the MCAs. | NOTE If MBox Interrupt EBCS <14> is set, the failure 1is a result of the MBox sending bad data. Therefore this Instead check for one of the error should be ignored. . 
following errors:

MSTAT2 <14> - ABus Bad Data Code
MSTAT1 <03> - Cache Data Parity Error
MDECC  <22> - Bad Data Error
MDECC  <20> - Data Double Bit Error
MDECC  <19> - Data Address Parity Error

ERROR SIGNATURE:

EBCS  <09>    = EDP Parity Error
EDPSR <03>    = Operand Parity Error (AMux PE or BMux PE)
EDPSR <06>    = B OPBus
IBESR <09:08> = "1" (uOPSEL) EMD
EBCS  <14>    = "0" No MBox Interrupt

PROBABLE CAUSE:

Module       Probability   Components
L0208/IBD    High          IBF, IOP
L0209/EDP    Medium        ALU, PDP
L0204/MCD    Low           MCD
L0212/FBA    Very Low      SOP, FXP, GXP
L0211/EBD    Very Low      EBCS Register

EHM ACTION:   Standard (See Introduction)
EHM VECTOR:   8
CSL ACTION:   None
VMS ACTION:   Standard (See Introduction)

E-08 EBox OPBus Parity Error (String Data)

OVERVIEW: This error occurs when the EBox BMux is passing OPBus data to the EBox ALUs, and the input OPBus longword parity from the IBox does not match the parity generated on the output of the EBox BMux. There are several sources in the IBox that feed the OPBus.

The IBuffer supplies OPBus data to the EBox when the IBox is in string mode and requesting the data from memory. String data parity is not checked in the IBox, but the parity for string data is passed from the MBox through the IBox to the EBox. The parity for the string data checked in the EBox is created by collapsing MD Bus odd byte parity into OPBus longword odd parity. The IBox uses only the parity bits of the bytes that are actually requested by the EBox to create the OPBus longword parity. The remaining bytes contribute parity bits with a value of "0".

The data flows from the MBox MCD module MDP MCAs, to the IBox IBD module IBF MCAs to the IOP MCAs, and then to the EBox ALU and PDP MCAs.

NOTE
If MBox Interrupt EBCS <14> is set, the failure is a result of the MBox sending bad data. Therefore this error should be ignored.
Instead check for one of the following errors:

MSTAT2 <14> - ABus Bad Data Code
MSTAT1 <03> - Cache Data Parity Error
MDECC  <22> - Bad Data Error
MDECC  <20> - Data Double Bit Error
MDECC  <19> - Data Address Parity Error

ERROR SIGNATURE:

EBCS  <09>    = EDP Parity Error
EDPSR <03>    = Operand Parity Error (AMux or BMux)
EDPSR <06>    = B OPBus
IBESR <09:08> = "2" (uOPSEL) IBuffer
EBCS  <14>    = "0" No MBox Interrupt

PROBABLE CAUSE:

Module       Probability   Components
L0208/IBD    High          IBF, IOP
L0209/EDP    Medium        ALU, PDP
L0204/MCD    Low           MCD
L0212/FBA    Very Low      SOP, FXP, GXP
L0211/EBD    Very Low      EBCS Register

EHM ACTION:   Standard (See Introduction)
EHM VECTOR:   8
CSL ACTION:   None
VMS ACTION:   Standard (See Introduction)

E-09 EBox OPBus Parity Error (IMD Data)

OVERVIEW: This error occurs when the EBox BMux is passing OPBus data to the EBox ALUs, and the input OPBus longword parity from the IBox does not match the parity generated on the output of the EBox BMux. There are several sources in the IBox that feed the OPBus.

The IMD supplies OPBus data to the EBox when the IBox requests the data from memory. IMD operand parity is not checked in the IBox, but parity for IMD data is passed from the MBox through the IBox to the EBox. The parity for IMD data checked in the EBox is created by collapsing MDBus odd byte parity into OPBus longword odd parity.

The data flows from the MBox MCD module MDP MCAs, to the IBox IBD module IBF MCAs to the IOP MCAs, and then to the EBox ALU and PDP MCAs.

NOTE
1. Indirect address data in IMD takes a different path and is parity checked by the IBMux parity check logic.
2. If MBox Interrupt EBCS <14> is set, the failure is a result of the MBox sending bad data. Therefore this error should be ignored.
Instead check for one of the following errors:

MSTAT2 <14> - ABus Bad Data Code
MSTAT1 <03> - Cache Data Parity Error
MDECC  <22> - Bad Data Error
MDECC  <20> - Data Double Bit Error
MDECC  <19> - Data Address Parity Error

ERROR SIGNATURE:

EBCS  <09>    = EDP Parity Error
EDPSR <03>    = Operand Parity Error (AMux PE or BMux PE)
IBESR <10>    = "1" Source IMD
EDPSR <06>    = B OPBus
IBESR <09:08> = "3" (uOPSEL) IMD or ID, see bit 10
EBCS  <14>    = "0" No MBox Interrupt

PROBABLE CAUSE:

Module       Probability   Components
L0208/IBD    High          IBF, IOP
L0209/EDP    Medium        ALU, PDP
L0204/MCD    Low           MCD
L0212/FBA    Very Low      SOP, FXP, GXP
L0211/EBD    Very Low      EBCS Register

EHM ACTION:   Standard (See Introduction)
EHM VECTOR:   8
CSL ACTION:   None
VMS ACTION:   Standard (See Introduction)

E-10 EBox OPBus Parity Error (ID Data)

OVERVIEW: This error occurs when the EBox BMux is passing OPBus data to the EBox ALUs, and the input OPBus longword parity from the IBox does not match the parity generated on the output of the EBox BMux. There are several sources in the IBox that feed the OPBus.

Parity for operands supplied to the EBox from the IBox GPRs and Instruction Buffer goes to the EBox via the ID Latch to the OPBus. The parity supplied to the EBox for this data is created by collapsing IAMux or IBMux odd byte parity into OPBus longword odd parity.

Data flows from the IAMux or IBMux Latch on the IDP Module IAD MCA, to the VA Latch and ID Latch on the IDP Module IVA and UPK MCA, and then over the OPBus to the EBox EDP Module ALU and PDP MCAs.
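The phrase "collapsing odd byte parity into longword odd parity" can be made concrete with a short Python sketch. It shows only the arithmetic relationship and makes no claim about how the IBox gates implement it. The function names are hypothetical, and "odd parity" is assumed to mean that the data bits plus the parity bit contain an odd number of ones. The E-08 remark that unrequested bytes contribute a parity value of "0" is not modeled here.

```python
def odd_parity_bit(value, width):
    """Parity bit that makes the count of ones (data + parity) odd."""
    ones = bin(value & ((1 << width) - 1)).count("1")
    return (ones + 1) & 1


def collapse_byte_parity(byte_parity_bits):
    """Collapse odd per-byte parity bits into one odd parity bit
    covering all of the bytes together."""
    collapsed = 1
    for p in byte_parity_bits:
        # p ^ 1 recovers (number of ones in that byte) mod 2.
        collapsed ^= p ^ 1
    return collapsed


def longword_parity_from_bytes(longword):
    """Derive odd longword parity using only the four byte parity bits."""
    byte_bits = [odd_parity_bit((longword >> (8 * i)) & 0xFF, 8)
                 for i in range(4)]
    return collapse_byte_parity(byte_bits)
```

Under these assumptions the collapsed bit always equals the odd parity computed directly over the 32 data bits, so a single flipped data or parity bit anywhere upstream shows up as a mismatch at the BMux check.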
ERROR SIGNATURE:

EBCS  <09>    = EDP Parity Error
EDPSR <03>    = Operand Parity Error (AMux PE or BMux PE)
EDPSR <06>    = B OPBus
IBESR <10>    = "0" Source ID
IBESR <09:08> = "3" (uOPSEL) IMD or ID, see bit 10

PROBABLE CAUSE:

Module       Probability   Components
L0206/IDP    High          IAD, IVA
L0209/EDP    Medium        ALU, PDP
L0206/IDP    Low           UPK, DPP
L0212/FBA    Very Low      SOP, FXP, GXP
L0211/EBD    Very Low      EBCS Register

EHM ACTION:   Standard (See Introduction)
EHM VECTOR:   8
CSL ACTION:   None
VMS ACTION:   Standard (See Introduction)

E-11 EBox Operand Parity Error (A RAM)

OVERVIEW: This error occurs when the AMux is passing GPRA/SP data to the ALU, and the parity generated at the AMux output does not match the GPRA/SP byte parity stored in the RAMs.

All data and parity written into the GPR/SP comes from the WBus. The EBox generates all WBus parity, even when the IBox and FBox are driving data on the WBus. When the EBox drives the WBus, it compares its internal parity to the parity on the WBus; if they do not match, the EBox asserts WBus Parity Error.

The EBox Operand Check is done on the EDP Module PDP MCA, and the error signal OPR PAR ERR is sent to the EBD Module where it is latched in the EBCS Register as EBox Data Path PE. The EDPSR Register bits (Operand PE, A RAM PE, and AMux Byte in Error) are latched on the EDP module PDP MCA.
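As a rough picture of the operand check described above, the Python sketch below compares the parity recomputed from each operand byte with the parity bit stored alongside it and returns a per-byte error mask. Several points are assumptions rather than statements from this manual: the function name, the use of odd parity for GPR/SP bytes, and the idea that bit i of the returned mask corresponds to byte i in the same way as the four-bit AMux Byte in Error field.

```python
def bytes_in_error(longword, stored_parity):
    """Return a 4-bit mask of bytes whose recomputed parity disagrees
    with the stored parity.

    longword:      32-bit operand as read from the RAMs.
    stored_parity: 4-bit value, bit i = parity bit stored with byte i
                   (odd parity assumed).
    """
    mask = 0
    for i in range(4):
        byte = (longword >> (8 * i)) & 0xFF
        expected = (bin(byte).count("1") + 1) & 1  # odd parity bit
        if expected != (stored_parity >> i) & 1:
            mask |= 1 << i
    return mask
```

A non-zero mask corresponds to the operand parity error case; the position of the set bit is what would steer the reader to a row of a byte-organized callout table.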
ERROR SIGNATURE: EHMSTS <25> EBCS <09> EDPSR EDPSR EDPSR PROBABLE EBox <03> <02> <27:24> SP B to A = EBox Data Path Parity Error = Operand Parity Error = A RAM PE (AMux PE) = AMux Byte in Error (AMux or BMux) CAUSE: Module Probability Components L0209/EDP L0219/EBE L0211/EBD High Medium Very Low (See Table 1 (See Table 2) EBCS Register Table 1 EDP Component Callout Byte Bits RAMs 3 2 1 <31:24> <23:16> <15:08> 0 <07:00> Table 2 E501, E503, E903 E903 E616, E615, E614, E613, PDP Misc E101 EB86 E3 "E3 E13 E13 E903 E102, E87, E70, E69 E3 E81 E903 E53, ES52 E3 E81 EBE Component Callout Byte WBus 3 'E169, 2 1 0 E500, E502, ALU El169, E169, E155, Latches E155, E127, E127, E127, Parity E43, E29 E43, E43, E29, E29, E1l55, El ES85, El 2-66 El Generators E157, E59 E115, E45, E17, ES9 E31 E31 MANUAL MACHINE CHECK STACK FRAME ANALYSIS EHM ACTION: EHM VECTOR: Standard (See Introduction) CSL ACTION: None VMS ACTION: Standard (See Introduction) 8 2-67 MANUAL E-12 MACHINE EBox CHECK Operand Error OVERVIEW: This ALU Match), (WBus the parity is generated generated WBus Match and the WBus Match data 1is ANALYSIS (A WBus) occurs when the AMux is passing WBus data to the and the WBus byte parity to the AMux does not match generated at the output of the AMux. WBus byte parity by the checked occurs in previous EBox on a read on the cases cycle occurs, not FRAME error bits and STACK the EBE module. EDP module. The AMux parity is where GPR/SP data is read as operand data, a write to the same location. When the function takes place where the old (stale) SP RAMs, instead data is selected from the issued bypass from the WBus. When a WBus Match occurs, some data may come from the WBus and some from the GPR/SP. It Depends on the context of the data being written by the previous cycle and the context of the data being read. Therefore, it is possible to have a WBus Error and GPR/SP Error in the same The cycle. 
EBox Operand error signal OPR Check is PAR ERR done is on the sent to EDP the Module PDP MCA, and the where it's latched The EDPSR Register Dbits in Error) are generated and EBD Module in the EBCS Register as EBox Data Path PE. (Operand PE, A WBus PE, and AMux Byte ‘latched on the EDP module PDP MCA. ERROR SIGNATURE: EBCS <09> = EBox EDPSR <03> = Operand EDPSR <01> = A WBus EDPSR <27:24> PROBABLE CAUSE: AMux Data Path Parity PE Byte Parity Error Error (AMux (AMux) in Probability Components L0O209/EDP LO0219/EBE LO0211/EBD High Medium Very Low (See Table 1) (See Table 2) EBCS Register Byte 3 2 1 0 1 EDP Component Bits <31:24> <23:16> <15:08> <07:00> BMux) Error Module Table or | Callout ALU E102, E87, E70, ES3, SHF E101 E86 E69 E52 - E104, E104, E104, E104, 2-68 E88, E88, E88, E88, E31, E31, E31, E31, ES5 ES ES ES PDP Misc E3 E3 E3 E3 E13 E13 E81 E81 MANUAL MACHINE CHECK STACK FRAME ANALYSIS Table 2 EBE Component Callout Byte Parity WBus Latches E155, E127, = - E43, > W - TM | N ~J E155, T— (o)) O - <) E169, - O~ Ww A E29, E29 E29, E85, El El EHM ACTION: Standard (See Introduction) CSL ACTION: None VMS ACTION: Standard (See Introduction) EHM VECTOR: 8 | 2-69 W G N TN S Generators N S W S W A A R A S R S MANUAL MACHINE E-13 CHECK EBox Operand STACK Parity FRAME ANALYSIS Error (B RAM) OVERVIEW: This error occurs when the BMux is passing GPRB/SP data ALU and the parity generated at the BMux output does not match GPRB/SP byte parity stored in the RAMs. the All data and parity written into the generates all WBus parity, EBox driving data on its internal the WBus. parity if they do not match Check is PAR ERR PDP EBox asserts WBus The EBox Operand signal the GPR/SP comes from the WBus. The when the IBox and FBox are EBox drives the WBus, it compares even to the parity on the WBus, the error When to the OPR Parity Error. done on the is to sent EDP the Module EBD Module MCA, where it's and the latched in the EBCS Register as EBox Data Path PE. 
The EDPSR Register bits (Operand PE, B RAM PE, and BMux Byte in Error) are latched on the EDP module PDP MCA. ERROR SIGNATURE: EHMSTS <26> EBCS <09> EDPSR <03> EDPSR <00> EDPSR <31:28> PROBABLE = EBox SP A to EBox Data Path Parity Error = OPERAND Parity Error (AMux or = B RAM PE (BMux PE) = BMux Byte in Error Probability L0209/EDP L0219 /EBE L0211/EBD Medium 1 Byte 3 2 1 0 Byte R . 3 2 1 0 2 Components High Very Low EDP Component (See Table 1) (See Table EBCS 2) Register Callout Bits RAMs <31:24> <23:16> <15:08> E715, E713, E815, E714, E712, E814, E813, E907 E907 E907 E812, E102, E87, E70, E907 E53, <07:00> Table BMux) CAUSE: Module Table B EBE Component WBus ALU Latches E101 E86 E69 E3 E3 E3 E52 E13 E13 E81 E3 E81 Parity wmmmmm“wmm E155, E127, E127, E127, Misc Callout mmmmmmmmmm E169, E169, E169, E155, PDP mwmmmmwuum E43, E43, E43, E29, E29. E29, E155, El m- ES85, El El Generators mmflmmmmmm ummmmmmmu E157, E115, E45, E17, E59 ES9 E31 E31 s B MANUAL MACHINE CHECK STACK FRAME ANALYSIS EHM ACTION: EHM VECTOR: Standard (See Introduction) CSL ACTION: None VMS ACTION: Standard (See Introduction) 8 | 2-71 — MANUAL MACHINE E-14 CHECK STACK FRAME EBox Micro Stack Parity ANALYSIS Error OVERVIEW: This error occurs when the EBox Micro~Stack is (Popped) and the Micro-Stack location contains incorrect parity. Micro-Stack parity is generated on the microsequenc er (MIC MCA) data is pushed onto the Micro-Stack. The EBox Micro stack Parity Error module and sent Register. ERROR EBCS to the EBD signal is module where on latched SIGNATURE: <10> PROBABLE = EBox Ustack Parity Error CAUSE: Module Probability Components L0216/CSB. 
             High          RAMs - E160, E166, E172, E179
L0216/CSB    Medium        Checker
L0211/EBD    Very Low      EBCS Register

EHM ACTION:   Standard (See Introduction)
EHM VECTOR:   8
CSL ACTION:   None
VMS ACTION:   Standard (See Introduction)

E-15 EBox Control Store Parity Error

OVERVIEW: This error occurs when a microword is read out of the Control Store RAMs into the data latches. EBox Control Store is spread across two modules: CSA and CSB. There are two parity checks, one for the CSA module and one for the CSB module. The error signals are or'ed together on CSB to produce EBox CS PE. EBox CS parity error is then sent to the EBD module where it is latched in the EBCS Register.

EBox Control Store Parity Errors are handled differently than Control Store Parity Errors for the other boxes. The EBox will stall, inhibiting clocks to the data latches and thus freezing the data and address in error. The EBox will then interrupt the console using the CPU ERROR CODE lines <2:0> from the EBE module. The console will receive the interrupt and determine that it was caused by an EBox Control Store Parity Error. The console will then read the bad microword and address over the SDB and try ECC correction on the microword. If the correction fails, the console will re-initialize the CPU and notify VMS. If the correction succeeds (single bit error), then the console will write the corrected word back into the EBox CS RAMs. Then it will read the corrected word out of the RAMs and re-check the parity to verify that the RAM location is correct.

ERROR SIGNATURE:

EBCS <11>    = EBox Control Store Parity Error
CSES <31>    = Correctable Error
               0 = Error corrected
               1 = Unable to correct error
CSES <28:16> = Control Store Address
CSES <15:08> = Syndrome
CSES <02:00> = "1" EBox Control Store Parity Error

NOTE
If ECC Correction fails the Console will re-initialize the CPU and notify VMS.
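The CSES fields listed in the error signature can be unpacked with a few shifts and masks. The Python sketch below is a reading aid under stated assumptions: the function and dictionary names are hypothetical, the CSES value is assumed to be available as a 32-bit integer, and the error codes "2" and "3" are taken from the I-01 and I-02 signatures elsewhere in this chapter.

```python
# CSES <02:00> codes as given in the E-15, I-01 and I-02 signatures.
CSES_ERROR_CODES = {
    1: "EBox Control Store Parity Error",
    2: "ICS Parity Error",
    3: "IDRAM Parity Error",
}


def decode_cses(cses):
    """Split a 32-bit CSES value into the fields named in the signature."""
    code = cses & 0x7                           # CSES <02:00>
    return {
        "unable_to_correct": bool((cses >> 31) & 1),  # CSES <31>
        "cs_address": (cses >> 16) & 0x1FFF,          # CSES <28:16>, 13 bits
        "syndrome": (cses >> 8) & 0xFF,               # CSES <15:08>
        "error_code": code,
        "error_name": CSES_ERROR_CODES.get(code, "code not listed here"),
    }
```

The syndrome field is the value used to index the Syn column of the control store RAM callout tables, and the low-order address bits select the RAM column within a row.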
PROBABLE CAUSE: Module Probability L0216/CSB L0215/CSA L0209/EDP L0206/1IDP L0210/EBC L0207/ICA L0213/FBM L0211/EBD High High Medium w Low Low Low Low Very Low Very Low L0217/L0231/CLK L0219/EBE | Very Low Components , (See Table 1) RAMs (See Table 2) RAMs Receivers Receiver Receivers Receivers Receivers EBCS Reg/Receivers Receiver CPU Error Code 2-73 CSA/CSB PE CSA CSA CSA CSA CSA CSA CSA CSB CSB CSB CSB MANUAL MACHINE CHECK STACK FRAME ANALYSIS Table 1 Syn Phy CSB Control ULD Sig Store RAM Call Name Out (See 1) Note 1/9 2/A 3/B 4/C 5/D 6/E 7/F E106 E106 Elll E11l1 E106 E111 E128 E128 E106 E105 E123 E134 E134 E100 E99 El1l6 E116 E116 E123 E123 03 04 E100 E100 E100 E1l1l1 E110 E116 E115 E123 E122 E128 E128 E127 E134 E134 E123 8 9 05 06 07 08 E99 E99 E110 E110 E110 E115 E115 E115 09 E109 E109 El114 E114 E122 E122 E122 El21 E127 E127 A E105 E105 E105 E104 E104 B 10 11 E109 E109 E49 E49 E49 E98 E98 D 12 E 13 F 14 10 11 12 13 14 15 16 17 18 19 15 16 17 18 19 20 21 22 23 24 USRC1 1A 1B 25 26 USRC1 USRC1 E104 E104 ES54 E54 ES59 E54 ES55 ES5 E55 E60 E60 E55 E123 E123 El21 E126 E126 E132 E132 E114 E114 E43 E43 E43 El21 El21 E38 E38 E38 E126 E126 E31 E31 E31 E132 E132 E49 E43 E50 E44 E50 E44 ES50 E50 E38 E39 E39 E44 E31 E32 E32 E32 E32 USRC1 7 DATA E61 PAR E56 ES51 USRC1 6 E61 E51 USRC1 E56 E56 1C 27 USRC1 1E 28 29 UDEST 1F 20 21 30 31 32 22 23 33 34 24 25 35 36 37 UMCF E69 E61 E61 E62 E69 E62 E69 ES57 E62 ES57 0 UMCF 1 2 UOPSE L UOPSEL USRC2 UMCF O 1 bt USRC2 UDEST UDEST 38 39 E54 USRC1 USRC2 E56 E57 E44 E51 ES51 ES2 E45 E45 E45 E45 E46 E52 E52 E46 E46 E39 E39 E27 E40 E27 E33 E33 E34 E27 E28 E34 E34 E28 E40 E41 E41 E41 E41 E37 E37 E57 E52 E46 ES53 ES53 E48 E48 E42 E65 E65 E91 E58 E96 E96 E96 E102 E102 E48 E48 E107 E42 E91 E91 ES53 ES3 E102 E107 E107 Ell2 E112 E119 E119 E30 E30 E124 El24 E124 E91 E92 E96 E97 E102 E103 E103 E107 E108 E108 E108 E108 E1l12 E113 E113 E119 E120 E125 E120 E125 E120 E120 E125 E125 E92 E97 E92 .E97 E97 E92 E103 E103 E113 E113 E37 E37 E119 NOTE 1. 
The numbers correspond to above the the bits Estate <18:16> 2-74 in (E ###) CSES. E26 E26 E26 E33 E62 El112 E25 E26 E33 ESS8 ES58 E42 E25 E25 E40 E69 E42 E25 E40 E65 E65 ES58 E123 E127 N E60 E60 1D 26 27 28 E98 E98 ES59 ES59 ES59 > C E99 W 6 7 W =~ N 3 4 5 00 01 02 NO SO 1 2 numbers E27 E28 E34 E28 E30 E24 E30 E24 E124 E24 E24 E130 E130 E130 E130 E131 E131 E131 E131 L MANUAL MACHINE CHECK STACK RN SR R A UJUMP UJUMP 08 09 UJUMP UJUMP 10 UJUMP 10 UJUMP 11 UJUMP 12 usuB 00 UsSuB 01 Module 05 06 UJUMP 07 08 09 USRC1 USRC1 27 28 29 USRC1 00 01 02 03 04 30 31 32 33 34 UMCF 1 UMCF 2 UOPSEL 0 ‘UOPSEL USRC2 USRC2 UDEST O 1 E139 E139 E138 E138 P02 P15 P14 P15 PO1 E20 E21 E21 E21 E21 P02 P14 P15 P02 PO1 A54 E22 B0O6 B56 B10O P02 AS56 AQl A52 ‘A70 A27 Al2 AO03 C75 C85 A58 A59 A68 A62 A64 A63 UDEST USRC1 USRC2 PO1 P14 E138 E138 E20 E20 E20 10 st UMCF P15 P02 PO1 bt UDEST E139 - 25 26 P01 Pl4 P15 P02 P14 et ot et USRC1 7 UPAR 0 USRC1 USRC1 USRC1 E140 E140 E140 E140 E139 et ot ot ot 20 21 22 23 24 Pin > U1 O UBEN UBEN UBEN UBEN UBEN Chip W= N W 16 17 18 19 Slot S > O 15 R UJUMP 00 UJUMP 01 UJUMP 02 UJUMP 03 UJUMP 04 05 06 07 11 12 13 14 A CN - KON C -~ N O TEHOOW O o o B B W N 03 04 > O 0~ 00 01 02 U A o B B B Signal Name S Phy CSB Control Store RAM Terminating Module NO Syn 1A N Table FRAME ANALYSIS NOTE All terminators should measure 56 ohms to -2 volts. 
MACHINE CHECK STACK FRAME CSA Control Store RAMs Call 0/8 1/9 2/A 3/B 4/C 5/D 6/E 7/F E69 E69 E62 ES56 E50 E62 ES56 E43 E43 E37 Table 2 Syn Phy ULD Sig he 29 2A 2B 2C 2D 42 43 44 2E 2F 45 46 30 31 47 48 49 32 L R Name L L 1O =N QN Wb WU 40 41 W 33 34 35 36 37 38 39 3A 3B 3C 3D 3E 3F 40 41 42 43 44 45 46 1 UVMQK 0 UDEST 2 UDEST 1 47 48 49 UALU 4A 4B UMARK 4C 4D 4E 4F 50 0 UBMX UVMQK 4 3 2 UALU UALU 0 UALU 1 UALU O=~=NWOoO - UCCK UCCK UCCK UCCK UDT 1 UDT O ANALYSIS Out (See Note 1) ) > MANUAL UFBOX 1 UFBOX 0 USRC2 5 E82 E75 E82 E75 E82 E82 E83 E75 E76 E83 E76 E83 E76 E83 E84 E84 E76 E77 E77 E75 E69 E69 E70 E70 E70 E70 E62 E56 E62 E56 E63 ES7 ESO ES0 ES0 ES51 E63 E63 E57 "E51 E57 E51 E57 ES51 E52 E52 E37 E43 E37 E43 E44 E37 E38 E44 E44 E38 E38 E44 E38 E39 E39 E71 E63 E64 E64 E64 ES58 ES52 E45 E64 E65 E65 E65 E58 E59 ES59 E59 E39 E78 E71 E71 E72 E72 E72 E52 E53 ES53 ES3 E45 E46 E46 E46 E39 E40 E40 E40 E85 E78 E72 E65 E86 E86 E59 E79 E53 E73 E46 E66 E60 E79 E54 E73 E66 E47 E86 E86 E60 E40 E41 E79 E79 E54 E73 E73 E66 E47 E60 E60 E54 E54 E47 E84 E84 E85 E85 E85 E77 E77 E78 E78 E71 E32 E33 E32 E33 E32 E32 E87 E33 E80 E27 E27 E27 E27 E74 E87 E87 E80 E80 E87 E119 E119 E66 E28 - ES8 ES8 E67 E34 E34 E34 E34 E61 E74 E74 E67 E67 E80 E125 E125 E74 E132 E132 E119 E119 E118 E125 E125 E124 E118 E118 E45 E45 E47 E41 E41 E41 E29 E35 E36 E29 E35 E36 E29 E29 E55 E35 E35 E48 E36 E36 E42 E61 E61 E55 ES55 E48 E48 E67 E138 E138 E42 E42 E61 ES55 El44 E144 E151 E151 E48 E157 E157 E42 El64 El64 E132 E132 E131 E138 E138 E137 E124 E124 E1l44 El44 E143 E131 E131 E151 E151 E150 E137 E137 E143 E143 E157 E157 E156 El64 E164 E163 E150 E150 E156 E156 E163 E163 E118 E117 E117 E117 E117 E124 E123 E123 E123 E123 E131 E130 E130 E130 E130 E137 E136 E136 E136 E136 E143 El142 E142 E142 E142 E150 E149 E149 E149 E149 E156 E155 E155 E155 E155 El63 El162 E162 E162 E162 E116 El1l6 E116 E122 E122 E122 E129 E129 E129 E141 E122 E121 E129 E128 E148 E148 E148 E154 E154 E116 E115 E135 E135 E135 E135 E134 El61 El61 El61 
E33 E28 E28 E28 E1l41 E141 E1l41 E140 E148 E147 E154 E154 E153 El61 E160 MANUAL MACHINE CHECK STACK Table 2 CSA Control Syn Phy 56 57 58 42 41 44 56 55 USRC2 USRC2 UDEST 59 5A 85 86 87 88 89 5B 90 54 ULITCTL 1 5C 91 Store RAMs Call Out ULD Signal 53 Name 4 3 0 ULIT 1 ULIT O ULITCTL O (cont.) FRAME ANALYSIS (See Note 1) 0/8 1/9 2/A 3/B 4/C 5/D 6/E 7/F E115 E115 E115 E114 El114 E121 El121 E121 E120 E120 E128 E128 El128 E127 E127 El1l34 E134 El134 E133 E133 E140 E140 E140 E139 El139 E147 El47 El1l47 El46 El46 E153 E153 E153 E152 E152 E1l60 E1l60 El60 E159 E159 El114 E120 E127 E133 E139 E146 E152 E159 E114 E120 E127 E133 E139 El46 NOTE l. The numbers above the Estate (E ###) correspond to the bits <18:16> in CSES. Table 2A Syn Phy ULD Signal 29 2A 2B 2C 2D 40 41 42 43 44 60 59 58 57 25 ULIT ULIT ULIT ULIT UAMX 2E 2F 30 31 32 45 46 47 48 49 52 51 50 68 67 USCK USCK USCK UMCF UMCF 33 34 35 36 37 50 51 52 53 54 66 83 91 90 89 38 CSA Control Name Store RAM Terminating Module Slot 5 4 3 2 EDP EDP EDP EDP EDP 10 10 10 10 10 A89 AQ4 A78 A88 B64 2 1 0 5 4 EDP EDP EDP CSA CSA 10 10 10 3 A76 A84 AB86 E20 E20 UMCF 3 UPAR SPARE 0 SPARE 1 SPARE 2 CSA CSA CSA CSA CSA 3 3 3 3 E20 E20 Al2 A06 AQ8 3 Chip 39 56 55 88 81 UMISC SPARE 3 3 CSA EDP 10 3 A38 3A 3B 3C 57 58 59 80 79 78 UMISC UMISC UMISC 2 1 0 EDP EDP EDP 10 10 10 B60 ASS5 A46 3D 60 87 SPARE 4 CSA 3 A0l 3E 3F 40 41 61 62 63 64 86 85 84 29 SPARE 5 CC SYNC NODEST USMI CSA ICA ICA EBD 3 13 13 6 2-77 B66 A04 B78 A0S A40 Module Pin POl P15 P02 Pl4 numbers E152 E159 MANUAL MACHINE Phy CSA Control ULD - 42 43 44 45 46 65 66 67 68 69 47 70 48 49 4A 4B 71 72 73 74 26 28 27 46 45 24 23 22 69 21 G N ORI O SN FRAME Store Signal RN Name NN ANALYSIS RAM Terminating Module Module Slot 1 UALU 4 UALU 3 UALU 2 A50 A44 B82 B38 B45 B30 UMARK UALU 1 UALU 4C 4D 4E 4F 50 75 76 77 78 79 20 75 74 UCCK 73 72 UCCK UCCK 51 52 53 54 55 80 81 77 76 UDT UDT 82 83 84 71 UFBOX 70 43 UFBOX 56 57 58 59 5A 85 86 87 88 89 42 41 44 56 55 5B 5C 90 91 54 53 B32 B18 
B20 A20 B22 O UCCK 1 O USRC2 UDEST ULIT ULIT Cé69 C80 B69 B64 B54 O Wb USRC2 USRC2 Pin B91 A57 UDEST 2 UDEST oy, B58 B68 0 UVMQK Chip (cont.) WSNE WRS 10 NWO Syn 2A STACK UV O Table CHECK B12 A73 A48 BO3 A80 1 O ULITCT L1 ULITCTL O A91 A94 NOTE All EHM ACTION: VECTOR: EHM terminators Standard 8 CSL ACTION: Standard hardware int errupts should measure (See 56 ohms ACTION: Standard (See Introduction) the (See -2 volts. Introduction) In addition, since console directly, SDB) generate CSPE Reset on the CSB module. microcode to re-start the EHM at vector 8. VMS to Introduction) the console will This will cause the - EBox (via the the EBox MANUAL MACHINE CHECK STACK FRAME ANALYSIS E-16 EBox MCF RAM Parity Error OVERVIEW: This error occurs when even parity 1is detected at the output of the EBox Memory Control Field (MCF) RAMs on the EBC module. These RAMs are accessed every time a new EBox microword is read. All unused locations are loaded with even parity, so addressing an unused location will result in a parity error. The MCF RAM parity check is done on the signal is sent to the EBD module EBC where module, it's and latched the in the error EBCS Register. ERROR SIGNATURE: EBCS EBCS <12> <04> = = EBox MCF RAM Parity Error EBox Abort Flag (Note 1) Note 1. PROBABLE The EBox Abort Flag will set if IRD LST CYC (uBEN field) is set when the error occurs. This flag indicates the error was detected too late to inhibit the PC from being updated and thus, prevents an instruction retry. CAUSE: Module Probability Components L0210/EBC L0211/EBD Medium Very Low Latches/Checkers EBCS Register EHM ACTION: EHM VECTOR: Standard 8 CSL ACTION: None VMS ACTION: Standard (See Introduction) (See Introduction) 2-79 MANUAL I-01 MACHINE CHECK IBox Control OVERVIEW: control across STACK Store IBox Control store its data 51-bit FRAME ANALYSIS Parity Error Store parity latches. The is checked IBox field. 
An IBox Control Store Parity Error will both the Control Store Data Latches and at Control inhibit the UPC the output Store has of Fhe odd parity further clocking Save register. At of the same time it will cause a microtrap to the EHM. The EHM will begin building a Machine Check Stack Frame, determine that it was called to handle an IBox CS PE, and set EBCS <27>. This in turn will interrupt the Console by asserting CPU ERROR <2:0> lines from the EBE module. The Console will correct the parity error and return control to the EHM,. The EHM will continue building the Stack Frame and call the VMS Machine Data CS Check flows Handler. from Parity the ICA Module CS RAMs, to the CS Data Latches, to the and then to the IBox IDP, IBD, and ICB Modules. Parity Error signal (ICS PE) goes from the ICA Checkers The Control Store Module to the EBE Module where it is latched in IBESR <21>. ERROR SIGNATURE: EBCS <K13> = IBox Error IBESR <21> CSES = ICS <K31> Parity = Correctable 0 1 Error Error = Error corrected = Unable to correct error CSES <28:16> Control CSES <«15:08> CSES Syndrome <02:00> "2" PROBABLE CAUSE: ICS Store Address Parity Error Modules Probability Components L0207/ICA L0207/1ICA High Medium Medium RAMs (see Table 1) CS Data Latches/Parity Checker Receivers L0206/1IDP L0214/ICB L0208/IBD LO0219/EBE Low Low Very Low Receivers Receivers IBESR Register 2-80 e MANUAL MACHINE CHECK STACK FRAME ANALYSIS Table 1 IBox Control Store RAb -C&ll@%?f Syndrome N W A R W R Phy W SR ULD Signal Name R S A SR G R N R O RAM - . 
R S 2 00 01 24 25 GPR SEL O GPR SEL 1 El6 E16 6 7 8 9 A 05 06 07 08 09 17 21 22 23 18 AMUX SEL 2 CTX CTL O CTX CTL 1 CTX CTL 2 BMUX SEL O E13 E13 E13 El4 E1l5 B C D E F 10 11 12 13 14 19 20 37 38 39 BMUX SEL 1 BMUX SEL 2 UMISC 0 UMISC 1 UMISC 2 E15 E15 E15 E1l4 E1l4 10 11 12 13 14 15 16 17 18 19 40 10 11 12 13 UMISC IFORK IFORK IFORK UTRAP O 1 2 O El4 E36 E36 E36 E35 15 16 17 18 19 20 21 22 23 24 14 08 09 41 42 UTRAP CTL O UBEN CTL O UBEN CTL 1 CYCLE ID O CYCLE ID 1 E35 E35 E35 E36 E12 1A 1B 1C 1D 1E 25 26 27 28 29 29 30 33 34 35 OPV CTL O OPV CTL 1 UMCF O UMCF 1 UMCF 2 1F 20 21 30 31 32 36 43 45 E1l1 E1ll EA40 22 23 33 34 27 28 UMCF 3 UNSTALL DIAG UNSTALL UCODE WREQ REG MODE 24 25 26 27 28 35 36 37 38 39 00 01 02 03 04 NA NA NA NA NA E40 E39 E39 E39 E39 29 2A 2B 2C 2D 40 41 42 43 44 05 06 07 31 32 NA 5 NA 6 NA 7 UNPACK CTL 0 UNPACK CTL 1 1 3 4 5 02 03 04 26 15 16 GPR SEL 2 AMUX SEL 0 AMUX SEL 1 3 CTL CTL CTL CTL E1l6 E16 E13 - El12 E12 E12 E1l1l Ell Q 1 2 3 . 4 2-81 E40 E40 E37 E37 E38 E38 E38 MANUAL MACHINE Table 1 IBox STACK Control FRAME ANALYSIS Store RAM Callout Syndrome Phy ULD Signal A e 0 R 46 44 50 47 48 49 IBOX MARK INH IBF SHIFT ICS OPAR UMISC2 0 UMISC2 1 UMISC2 2 G R N SRR S WS 2E 2F 30 31 32 33 EHM ACTION: EHM CHECK VECTOR: e BB 45 46 47 48 49 50 LR R R Name R T e— RAM A E38 E37 E37 E17 E17 E17 Standard (See Introduction) 10, 18, 1E, or 1F CSL ACTION: Standard (See Introduction) VMS Standard (See Introduction) ACTION: 2-82 (cont.) MANUAL MACHINE CHECK STACK FRAME ANALYSIS I-02 IBox IDRAM Parity Error OVERVIEW: The IBox DRAM has odd parity across its 20-bit field. When a DRAM parity error occurs the failing Address and Data will be held in the ERR/LD ADRS register, and a microtrap to the EHM will occur. The EHM will begin building a Machine Check Stack Frame, determine that it was called to handle an IBox DRAM PE, and set EBCS <28>. This in turn will interrupt the console by asserting CPU ERROR <2:0> 1lines from the EBE Module. 
The Console will correct the parity error and return control to the EHM. The EHM will continue building the Stack Frame and call the VMS Machine Check Handler. Data flows from the IBD Module DRAMs to the DRAM Data Latches, then to two IOPA MCAs (E7, E20) which do the parity checking. parity error signal from IOP MCA is fed through the DBS MCA, to ICB Module, IBESR and then to the EBox EBE Module where it is <22>. ERROR SIGNATURE: EBCS <13>' = IBox Error CSES CSES CSES <28:16> <15:08> <02:00> PROBABLE CAUSE: = IDRAM Parity Error = Correctable Error 0 = Error corrected wouwn IBESR <22> CSES <K31> 1l = Unable to correct error Control Store Address Syndrome "3" IDRAM Parity Error Modules Probability Components L0208/1IBD L0208/1IBD High Medium RAMs (see Table 1) DRAM Data Latches/Checkers(IOPA) L0219/EBE Table Very Low 1 IBox IBESR Register DRAM Callout Syndrome Bit Signal Name Dram AR W R e VA N A O 00 91 E82 ES3 D A 1 2 3 TR A A DO0 DO1 N N S RO R NAT ADRS NAT ADRS N O - 4 5 D03 D04 NAT ADRS 03 NAT ADRS 04 NAT ADRS 02 ES5 6 7 8 9 D05 D06 D07 DO8 D09 NAT NAT NAT NAT NAT E54 E81 E52 E88 E79 A D02 R ADRS 05 FPA OPAR LAST BDEST NXT E80 E56 and The the latched in MANUAL MACHINE Table 1 CHECK IBox Syndrome Bit STACK DRAM FRAME Callout Signal Name ANALYSIS (cont.) Dram B D10 NAT SUSPEND C E78 D11 NAT CTL 0 E74 D D12 NAT CTL 1 E75 E D13 F D14 NAT 10 D15 NAT TYPE 0 11 12 13 14 D16 D17 D18 D19 NAT REF 0 REF NAT CTX NAT CTX NAT CTX EHM ACTION: Standard EHM VECTOR: 10, 18, (See 1lE, or 1 0 1 2 ES51 E77 E95 E96 E89 Introduction) 1F Standard (See Introduction) VMS Standard (See , Introduction) - | | E76 CSL ACTION: ACTION: | ES50 1 NAT TYPE ~, ~ " - | | ~ ~ | TM MANUAL MACHINE CHECK STACK FRAME ANALYSIS I-03 IBox IAMux WBus PARITY OVERVIEW: This error occurs when a WBus Match occurs in the IBox and a WBus byte parity error is detected at the output of the IBox AMux. Parity is only checked on the bytes consumed. 
A WBus Match will occur when the GPR data required for an operation is in the process of being updated (written) by the previous operation. Instead of waiting for the previous operation to complete the GPR write cycle, the data is taken directly off the WBus.

The EBox generates all WBus and GPR parity bits on the EBE Module. IAMux EC <1:0> indicates the most significant byte in error. The IAMux Parity Error signal (IAMux PE), IAMux WBus DATA, and IAMux EC <1:0> go from the IDP module to the EBE module where they are latched in IBESR.

ERROR SIGNATURE:

EBCS  <13>     = IBox Error
IBESR <30:29>  = IAMux EC <1:0> (Byte in Error)
IBESR <28>     = "1" WBus Data
IBESR <23>     = IAMux Parity Error

PROBABLE CAUSE:

Modules      Probability   Components
L0206/IDP    High          IAD, DPPA (see Table 1)
L0219/EBE    Medium        WBus Parity Generators
L0219/EBE    Very low      IBESR Register

Table 1 IAD MCA Callout

IAMux EC <1:0>   Data Bit   Byte   Slice   IAD MCA
0                0-3        0      0       E1
0                4-7        0      1       E2
1                8-11       1      2       E1
1                12-15      1      3       E2
2                16-19      2      4       E5
2                20-23      2      5       E7
3                24-27      3      6       E8
3                28-31      3      7       E8

EHM ACTION: Standard (See Introduction)
EHM VECTOR: 10, 18, 1E, or 1F
CSL ACTION: Standard (See Introduction)
VMS ACTION: Standard (See Introduction)

I-04 IBox IAMux GPR Parity Error

OVERVIEW: This error occurs when the IBox AMux is passing GPR data to the Data Path Adder, and the parity generated at the output of the IAMux does not match the GPR Byte parity stored in the RAMs. Parity is only checked on the bytes consumed. Parity is not checked when a GPR is used as an index register [Rx].

The EBox generates all WBus and GPR parity bits on the EBE Module. IAMux EC <1:0> indicates the most significant byte in error. The IAMux Parity Error signal (IAMux PE) and IAMux EC <1:0> go from the IDP module to the EBE module where they are latched in IBESR.
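The IBESR bit assignments given in the ERROR SIGNATURE entries for the IBox errors (I-01 through I-07) can be gathered into a small decoder. The following is a minimal sketch, not part of the manual: the names `IBESR_FLAGS`, `field`, and `decode_ibesr` are hypothetical, and the byte, data-bit, and slice mapping assumes the regular layout shown in the IAD MCA Callout table (two 4-bit slices per byte), with IAMux EC <1:0> held in IBESR <30:29>.

```python
# Illustrative sketch only. Bit positions are taken from the ERROR SIGNATURE
# entries of the IBox sections; all identifiers here are hypothetical.

IBESR_FLAGS = {
    21: "ICS Parity Error",
    22: "IDRAM Parity Error",
    23: "IAMux Parity Error",
    24: "RLog Parity Error",
    25: "IBuf Parity Error",
    26: "IBMux Parity Error",
}


def field(value, hi, lo):
    """Return value<hi:lo> as an unsigned integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_ibesr(ibesr):
    """Summarize the IBox error bits found in an IBESR longword."""
    report = {
        "errors": [name for bit, name in sorted(IBESR_FLAGS.items())
                   if field(ibesr, bit, bit)]
    }
    if field(ibesr, 23, 23):
        # IAMux parity error: <28> selects WBus (1) or GPR (0) data,
        # <30:29> is IAMux EC <1:0>, the most significant byte in error.
        ec = field(ibesr, 30, 29)
        report["source"] = "WBus" if field(ibesr, 28, 28) else "GPR"
        report["byte"] = ec
        report["data_bits"] = (8 * ec, 8 * ec + 7)   # assumed 8 bits per byte
        report["slices"] = (2 * ec, 2 * ec + 1)      # assumed two slices per byte
    return report


if __name__ == "__main__":
    # IAMux PE on WBus data, most significant byte in error = 2
    print(decode_ibesr((1 << 23) | (1 << 28) | (2 << 29)))
```

A result naming byte 2 points at data bits 16-23 and slices 4 and 5, which is the pair of rows to look up in the callout table.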
ERROR SIGNATURE: EBCS IBESR <K13> <28> = IBox Error = "0" GPR Data IBESR <23> = "]1" IAMux Parity Error IBESR <30:29> IAMux EC <1:0> (Byte in Error) PROBABLE CAUSE: Modules Probability Components L0206/1IDP L0206/1IDP L0219/EBE LO219/EBE High Medium Low Very low RAMs (see Table 1) IAD, DPPA WBus Parity Generator IBESR Register Table IAMux EC <1:0> 0 0 1 1 2 2 3 3 -~ 1 IBox GPR RAM Callout Data IAD Bit Byte 0-3 4-7 8-11 12-15 16-19 20-23 24-27 28-31 Slice 0 0 1 1 0 1 2 3 4 5 6 7 2 2 3 3 RAM MCA El16 E19 E29 E32 El E2 E1l E2 E20 E20 E24 E24 E65 ES58 E69 ES8 E74 E71 E75 E71 EHM ACTION: Standard (See Introduction) CSL ACTION: Standard VMS ACTION: Standard (See Introduction) EHM VECTOR: 10, 18, 1lE, or 1F (See Introduction) ES E7 E8 E8 from MANUAL MACHINE I-05 CHECK STACK FRAME ANALYSIS IBox RLog Parity Error OVERVIEW: RLog parity is checked during an RLog unwind operation. The RLog is written during autoincrement, autodecrement, autoincrement deferred, EXC only, and for the first specifier of all instructions regardless of the addressing mode. Data flows from the ICB Module RLog RAMs, to the RLog Latches, to the parity checkers, and to the IDP Module DPPB and SPA MCAs. The RLog Parity Error signal (RLog PE) goes from the ICB Module to the EBE Module where it is latched into IBESR. ERROR SIGNATURE: EBCS <K13> EBCS IBESR <K03> = State Modified <24> = RLog Parity Error PROBABLE = IBox Error CAUSE: Modules Probability Components L0214/ICB L0214/ICB L0206/IDP L0219/EBE High Medium Low Very low RAMs (E162, E163, E164) RLog Latches, Generator, DPPB, SPA IBESR Register EHM ACTION: The EHM set Machine State Modified EBCS EHM VECTOR: 1F CSL ACTION: None VMS ACTION: Standard processes the error in the normal manner. 
(See Introduction) 2-88 Standard Checker <03>, (See and then Introduction) MANUAL MACHINE CHECK STACK FRAME ANALYSIS I-06 IBox IBuffer Parity Error OVERVIEW: Specifier IBuffer Parity is checked on the Opcode Byte 0, on the Byte 1, and on the bytes <6,4,3,2> when selected by the RMode Finder during optimization. | Opcode <B0>, or Specifier <B1l> Parity is checked when the byte is If a error is detected in IBuf bytes <B1:B0>, the setting of valid. the buffer valid flags are inhibited, and Opcode Valid into the Opcode | | Buffer is negated. | RMode Parity Error detection 1is enabled only when a successful instruction optimization is performed. The byte selected by the RMode Finder is checked, and if a parity error is detected, the setting of the buffer valid flags are inhibited. Data flows from the MBox MCD Module MDP MCAs, to the IBox IBD Module and then to the IOP MCAs for parity checking. The IBuf IBF MCAs, to Parity error signal from the IOP MCA is fed through the DBS MCA, the ICB Module, and then to the EBox EBE Module where it is latched in IBESR. | NOTE a 1is If MBox Interrupt EBCS <14> is set, the failure Therefore this the MBox sending bad data. result of Instead check for one of the error should be ignored. 
following errors: MSTAT2 <14> - ABus Bad Data Code MSTAT1 <03> - Cache Data Parity Error <22> MDECC - Bad Data Error MDECC <20> - Data Double Bit Error MDECC <19> - Data Address Parity Error ERROR SIGNATURE: EBCS = IBox IBESR <25> = <«13> IBuf EBCS "0" <14> PROBABLE = Error Parity Error No MBox Interrupt CAUSE: Modules Probability Components L0208/IBD High IBF L0208/IBD Medium IOPB, L0204/MCD Low MDP L0219/EBE Very Low IOPC IBESR Register 2-89 MANUAL MACHINE CHECK EHM ACTION: Standard EHM 10, VECTOR: 18, STACK (See 1E, FRAME ANALYSIS Introduction) or 1F CSL ACTION: Sténdard (See Introduction) VMS ACTION: Standard (See Introduction) )'MMM% 2-90 MANUAL MACHINE CHE CK STACK I-07 IBox IBMux Parity FRAME ANALYSIS Error OVERVIEW: IBMux odd parity is checked on Ins truction Stream IBUF Operand bytes <5:1>, and on Indirect Addres The parity that ses from the IMD is checked at the . output of the IBM from the MD Byus ux Latch originate s Data from ® Data flows from the MBox MCD Mod ule MDP MCAs, to IBF the IBox MCAs and 'DBS 1IBD Module MCAs, then to the IDP Module IAD MCA Parity Error si S. The IBMux gnal (IBMux PE) goes from the IDP Module where it Module to the EBE is latched in IBE SR. | NOTE If MBox Interrup t result of the error should be EBCS MBox <14> is sending ignored. following errors : set, bad the data. 
Instead check MSTAT2 <14> - ABus Bad Data Code MSTAT] <03> - Cache Da ta Parity Error MDECC <22> - Bad MDECC <20> - Data Double MDECC <19> - Data Address Data Error Bit Error Parity ERROR SIGNATUR E: EBCS IBESR EBCS <13> <26> <14> = = = IBox Error IBMux Parity "g" No MBox Error Interrupt PROBABLE CAUSE: Modules L0208/IBD Probability High IBF, L0206/1IDP Medium LO0219/EBE Very Low L0204/MCD Components IAD, Low MDP DBS DPPB IBESR Register EHM ACTION: Standard (See Introdu ction) CSL ACTION: Standard (See Introduction) VMS Standard Introduction) ACTION: (See 2-91 Error failure Therefore for one is of g this the MANUAL MACHINE CHECK STACK FRAME ANALYSIS F-01 FBox Self Test Error not nostic when it 1iscuting ag di or mic t tes f sel a s run x FBo OVERVIEW: The (I.€es the CPU is exe busy executing FBOX instructions. ctions (e.g.. MOV, TSTB). non-floating point accelerator instru XERlR g <02(FB fla or Err t Tes f gel the ed, ect det is or wil Err > t Tes f l set. When the self Test Error flag sets FBXfERRTest. If a Selwil I1f<18>) Selerwise the FBA the failed s ule uleMod xtheMod FBo two whi>ch 1isof the te <02 ica Oth ind . led fai FBM set then ERR FBX Module failed. FBoOX ed ll ca nal sig a tes era gen it or err an s ect EBox requests the result of an FBOX operation(FB1tR whenbleanm. FBoxWwhdet en x the | FBA module Pro blem goes from the on Problem. FBoOX Procau ati loc to p tra rowill detect FBo mod mic OX EBo a ses it re whe , ule EBD MCA) to the Service routine. 2, the entry vector for the FBox set : ther FBox Problem was whe ine erm det l wil e tin rou ion e ept vic The FBox Ser error or because of some other FBox exc dware by 2ero). If the FBOX detected an error e of(i.ae., har becaus ide l set FBox Service Request (EHSR <28>) and Div condition e wil tin Rou e vic FBox Ser call the EHM. 
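The FBox error sections (F-01 through F-05) all use the same reporting path: EHMSTS <28> flags an FBox Service Request, and only then is FBXERR meaningful. A minimal sketch of that gating is shown below. It is an illustration rather than anything taken from the manual: the function and table names are hypothetical, and FBXERR <02> (the FBA/FBM module indicator for a Self Test Error) is deliberately left out of the decode.

```python
# Illustrative sketch only. Bit positions come from the ERROR SIGNATURE
# entries of sections F-01 through F-05; identifiers are hypothetical.

EHMSTS_FBOX_SERVICE_REQUEST = 1 << 28

FBXERR_FLAGS = {
    0: "FBox Problem",
    17: "GPR Parity Error",
    18: "Self Test Error",
    19: "FDRAM Parity Error",
    20: "FBA CS Parity Error",
    21: "FBM CS Parity Error",
}


def decode_fbxerr(ehmsts, fbxerr):
    """Return the FBXERR flags that are set, or None if FBXERR is not valid.

    FBXERR is only valid when EHMSTS <28> (FBox Service Request) is set.
    """
    if not (ehmsts & EHMSTS_FBOX_SERVICE_REQUEST):
        return None
    return [name for bit, name in sorted(FBXERR_FLAGS.items())
            if (fbxerr >> bit) & 1]


if __name__ == "__main__":
    print(decode_fbxerr(1 << 28, (1 << 0) | (1 << 18)))
```

Returning `None` rather than an empty list keeps "register not valid" distinct from "register valid, no error bits set".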
ERROR SIGNATURE (FBA Module):

EHMSTS <28>  = FBox Service Request (Note 1)
FBXERR <00>  = FBox Problem
FBXERR <18>  = Self Test Error
FBXERR <02>  = FBA Module

PROBABLE CAUSE (FBA Module):

Module       Probability
L0212/FBA    High
L0213/FBM    Low

ERROR SIGNATURE (FBM Module):

EHMSTS <28>  = FBox Service Request (Note 1)
FBXERR <00>  = FBox Problem
FBXERR <18>  = Self Test Error
FBXERR <02>  = FBM Module

PROBABLE CAUSE (FBM Module):

Module       Probability
L0213/FBM    High
L0212/FBA    Low

Note
1. FBXERR is only valid if EHMSTS <28> is set.

EHM ACTION: Standard (See Introduction)
EHM VECTOR: 2
CSL ACTION: None
VMS ACTION: Standard (See Introduction)

F-02 FBox GPR Parity Error

OVERVIEW: The Scratchpad RAMs in the FBox use odd parity. They can be written from the WBus (SOP MCA) or from the Fraction Adder (FAD MCA). The parity bits from the WBus and Fraction Adder are selected for input to the RAMs on the RRC MCA. The output parity from the RAMs is checked on the FBR MCA. The GPR PE error signal is latched in FBox Register 1 on the FBR MCA.

When an FBox detects an error it generates a signal called FBox Problem. When the EBox requests the result of an FBox operation it will detect FBox Problem. FBox Problem goes from the FBA module (FBR MCA) to the EBD module, where it causes an EBox micro-trap to location 2, the entry vector for the FBox Service routine.

The FBox Service routine will determine whether FBox Problem was set because of a hardware error or because of some other FBox exception condition (i.e., Divide by Zero). If the FBox detected an error the FBox Service Routine will set FBox Service Request (EHSR <28>) and call the EHM.

ERROR SIGNATURE:

EHMSTS <28>  = FBox Service Request (Note 1)
FBXERR <00>  = FBox Problem
FBXERR <17>  = GPR Parity Error

Note
1. FBXERR is only valid if EHMSTS <28> is set.

PROBABLE CAUSE:

Modules      Probability   Component
L0212/FBA    High          RAMs
L0212/FBA    Medium        RRC, FAD, SOP, FBR
L0219/EBE    Low           WBus Parity Generator

EHM ACTION: The EHM reloads the FBox Scratch Pads and then processes the error in the standard manner. (See Introduction)
EHM VECTOR: 2
CSL ACTION: None
VMS ACTION: Standard (See Introduction)

F-03 FBox FDRAM Parity Error

OVERVIEW: The FDRAM in the FBox consists of 8 bits <7:0>. The FDRAMs are located on the FBM module. Odd parity checking on FDRAM data is done on the MCB MCA. The FDRAM parity error signal from MCB MCA is latched in FBox Register 2 on the FBA module FBR MCA.

When an FBox detects an error it generates a signal called FBox Problem. When the EBox requests the result of an FBox operation it will detect FBox Problem. FBox Problem goes from the FBA module (FBR MCA) to the EBD module, where it causes an EBox micro-trap to location 2, the entry vector for the FBox Service routine.

The FBox Service routine will determine whether FBox Problem was set because of a hardware error or because of some other FBox exception condition (i.e., Divide by Zero). If the FBox detected an error the FBox Service Routine will set FBox Service Request (EHSR <28>) and call the EHM.

The EHM will begin building a Machine Check Stack Frame, determine that it was called to handle an FBox DRAM PE and set EBCS <29>. This in turn will cause a Console interrupt by asserting the CPU ERROR <2:0> lines from the EBE module. The Console will reload the FDRAM and return control to the EHM. The EHM will continue building the Stack Frame and then call the VMS Machine Check Handler.

ERROR SIGNATURE:

EHMSTS <28>     = FBox Service Request (Note 1)
FBXERR <00>     = FBox Problem
FBXERR <19>     = FDRAM PE
CSES   <31>     = Correctable Error
                  0 = Error corrected
                  1 = Unable to correct error
CSES   <02:00>  = "4" FDRAM Parity Correction

Note
1.
PROBABLE FBXERR is only valid if EHMSTS <28> is CAUSE: Module Probability Component L0213/FBM High Medium Low RAMs MCB L0213/FBM L0212/FBA FBR (See Table 1) set MANUAL MACHINE Table 1 DRAM Callout " AR B Bit A S 0 1 CHECK Signal Name TN W R N N A L A STACK - - E39 E38 2 FORK ADDR 2 E39 3 FORK ADDR 3 E38 4 5 6 7 FORK ADDR 4 FORMAT 0 FORMAT 1 E39 E38 FDRAM E38 PARITY ANALYSIS RAM A FORK ADDR 0 FORK ADDR 1 FRAME E39 EHM ACTION: Standard (See Introduction) CSL ACTION: None VMS Standard EHM VECTOR: ACTION: 2 | (See Introduction) 2-96 MANUAL MACHINE CHECK STACK FKRAME ANALYSIS F-04 FBox FBA CS Parity Error OVERVIEW: Control Store Parity in the FBox is checked by the MCA that actually uses the microcode bits. The partial parity outputs from these MCAs are routed through to the ACC MCA which makes the final check for odd parity. 1If a parity error exists the failing address is held on the MSQ MCA, so the console can perform an ECC correction process on the location. The FBA CS PE error signal from ACC MCA is sent to the FBA Module (FBR MCA) where it is latched in FBox Register L When an FBox detects an error it generates a signal called FBox Problem. When the EBox requests the result of an FBox operation it will detect FBox Problem. FBox Problem goes from the FBA module (FBR MCA) to the EBD module, where it causes a EBox micro-trap to location 2, the entry vector for the FBox Service routine. The FBox Service routine will determine whether FBox Problem was set because of a hardware error or because of some other FBox exception condition (i.e., Divide by Zero). If the FBox detected an error the FBox Service call the EHM. The EHM will Routine will set FBox Service begin building a Machine Check Request Stack (EHSR <28>) Frame, and determine that it was called to handle an FBA CS PE and set EBCS <30>. This in turn will cause a Console interrupt by asserting the CPU ERROR <K2:0> from the EBE Module. The Console will correct the parity error and return control to the EHM. 
The EHM will continue building the Stack Frame and then call the VMS Machine Check Handler. ERROR SIGNATURE: onou <31> w CSES CSES CSES <28> <20> <28:16> <15:08> <02:00> wounu EHMSTS FBXERR CSES FBox Service Request FBA CS Parity Error (Note 1) Correctable Error 0 = Error corrected 1l = Unable to correct error Control Store Address - Syndrome ~"5" (FBA CS Parity Error) Note 1. FBXERR 2. In all cases where ECC correction fails re-initialize the CPU and notify VMS. PROBABLE _ ':, is only valid if EHMSTS <28> is set the Console CAUSE: Module Probability L0212/FBA HIGH L0212/FBA LOW Components RAMs (See Table 1) ACB, ACC, 2-97 ACL, ALN,FBR, MSQ, RRC will MANUAL MACHINE Table 1 FBox Syndrome W W G o - CHECK (FBA) STACK FRAME ANALYSIS Control Store RAM Callout Signal Name B - UROTK O UAUXK 1 UFADK 1 UFARAK 0 UEALU 2 E17 E68 E68 E74 E48 U, UJUMP 4 UEALU 4 UHMX 2 E23 E57 E65 USOPK 1 UBEN 0 E65 E48 UJUMP 7 UHMX 1 UFADROTK 0 UHMX 0 UJUMP O UEALU 0 El2 E63 E28 El4 E37 E7 UJUMP 5 E25 E60 ES58 ES58 E12 ACL E28 E63 E17 E17 USIGNIF UKHMX UFADROTK UFARAK . UWRT 1 1 SPAD E37 RRC RRC UROTK 1 UEALU 1 UBSIDE 1 UAUXK O E60 ACB E28 E54 E18 FBR ACB E7 E28 E56 ALN E1l4 UFADK E56 ACL E63 E10 E10 E18 E54 E49 RRC E17 MSQ "E49 ACL E20 E20 E13 MSQ E9 FBR E30 El ACC O USEL RA O USEL RA 2 UBSIDE O UEALU 3 UBEN 1 UFADROTK UJUMP 6 UJUMP 1 USEL RA UDW 2 UDRTY O USHFTX UBEN 2 UBEN 2 1 PARITY UJUMP 2-98 E63 E69 E66 E66 E25 UFADSIG | E49 ES7 E69 3 RRC ACB FBR MSQ RRC MSQ ACC El E2 ‘MSQ MSQ 5 MANUAL MACHINE CHECK STACK FRAME ANALYSIS Table 1 FBox (FBA) Control Store RAM Callout Syndrome - 2C 2D 2E 2F 30 Phy ULD 43 08 44 45 46 47 EHM ACTION: Standard EHM VECTOR: 2 CSL ACTION: None VMS ACTION: Standard 03 19 25 43 Signal Name RAM FA22 UJUMP 8 FA23 FA18 FA20 FA19 UJUMP UROTK USOPK UDRTY (See Introduction) (See Introduction) (cont.) 
3 2 O 1 El13 E30 E9 E8 E2 MCA MSQ MSQ ACB ACC ACC E37 E37 E28 E3 E3 MANUAL MACHINE F-05 FBox OVERVIEW: actually CHECK STACK FBM Control Control uses Store Store the FRAME ANALYSIS Parity Parity in micro-code Error the FBox bits. The is checked by partial parity the MCA that outputs from these MCAs are fed through to the MPZ MCA which makes the final check for odd parity. 1If a parity error exists the failing address is held on the MSQ MCA, so the console can perform an ECC correction process on the 1location. The FBM CS PE error signal from MPZ MCA is sent to the FBA Module FBR MCA where it is latched in FBox Register 3. When an FBox detects an error it generates a signal called FBox Problem. When the EBox requests the result of an FBox operation it will detect FBox Problem. FBox Problem goes from the FBA module (FBR MCA) to the EBD module, where it causes a EBox micro-trap to location 2, the entry vector for the FBox Service routine. The FBox Service routine will determine whether FBox Problem was set because of a hardware error or because of some other FBox exception condition FBox (i.e., Service call the Divide Routine EHM. by Zero). will If set - the FBox FBox detected Service Request an (EHSR error the <28>) and The EHM will begin building a Machine Check Stack Frame, determine that it was called to handle an FBM CS PE and set EBCS <30>. This in turn will cause a Console interrupt by asserting the CPU ERROR <2:0> from the EBE Module. The Console will correct the parity error and return control to Frame and ERROR SIGNATURE: <21> CSES <31> CSES <28:16> CSES <15:08> <02:00> CSES nu FBXERR wu <28> <00> u EHMSTS FBXERR the call o then EHM. the VMS The FBox Service FBox Problem FBM EHM Machine will continue Check building the Stack Handler. Request (Note Control Store Correctable Error 1) Parity Error 0 = Error corrected 1l = Unable to correct error Control Store Address (Note 2) Syndrome "6" FBM Control Store Parity EHMSTS <28> Error Note 1. FBXERR is only 2. 
In all cases where ECC correction fails re-initialize the CPU and notify VMS. valid if 2-100 is set the Console will MANUAL MACHINE CHECK STACK FRAN E ANALYSIS PROBABLE CAUSE: ‘Module W - W W . T IR S S L0213/FBM L0213/FBM Table I I R W W FBox T O W S RAMs MAX, Low 1 S (FBM) N R W RN N U A N N A S SN A RO AN A (See Table 1) MCL, MPR, MPZ, Control Store Signal ORI o RO R R W G I S R SR N S Phy ULD RAM 1 2 00 01 20 25 FM15 MPLR SEL 2 FM15 CRY CTL O E13 E18 MPR MCL E9 E42 3 4 5 02 03 04 15 16 34 FM14 MHLD SEL FM14 MHLD SEL FM14 MSMX CTL 2 3 1 E18 El6 E15 MPR MPR MAX E9 E9 E63 6 7 8 9 A 05 06 07 08 09 17 14 13 22 18 FM14 FM14 FM14 FM15 FM14 MCAND SEL MHLD SEL 1 MHLD SEL 0O QOBMUX SEL MPLR ENAB El4 E15 El6 El4 E13 MPR MPR MPR MPR MPR E9 E9 E9 E9 E9 LDACC B 10 24 FM15 E22 MPZ E20 11 12 13 23 21 19 FM15 LDACC 0 FM14 COUNTER CLR FM15 MPLR SEL 1 E22 E12 E1l2 MPZ MPR MPR E9 F 14 26 FM15 CRY E26 MCL 10 11 12 13 14 15 16 17 18 19 27 30 31 32 35 FM15 FM16 FM16 FM16 FM16 CRY CTL 2 MRMX SEL O MRMX SEL 1 MRMX SEL 2 E28 E59 E48 E48 LDFRMA E28 MAX E63 15 20 33 FM1l6é MSMX CTL 0 E59 MAX E63 16 17 18 21 22 23 39 28 29 FM16 FM16 FM16 PARITY EXAC LD 0 EXAC LD 1 "E26 E23 E23 MPZ MAX MAX 19 24 10 FM17 BEN E50 MSQ E20 E63 E63 E37 1A 1B 1C 1D 1E 25 26 27 28 29 01 03 12 00 36 FM18 JUMP 1 FM17 JUMP 3 FM17 BEN 3 FM17 JUMP 0 FM17 MUL SYNC E57 E57 E44 E53 E49 MSQ MSQ MSQ MSQ MPZ E37 E37 E37 E37 E20 1F 20 21 22 30 31 32 33 09 11 04 05 FM17 FM17 FM18 FM18 BEN O BEN 2 JUMP 4 JUMP 5 E44 E49 E53 E50 MSQ MSQ MSQ MSQ E37 E37 E37 E37 23 24 25 26 27 34 35 36 37 38 02 08 07 - 38 06 FM17 FB18 FM18 FM18 FM18 JUMP 2 JUMP 8 JUMP 7 SPARE 1 JUMP 6 ESS8 E58 E51 E40 ES51 MSQ MSQ MSQ MPZ MSQ E37 E37 E37 E20 E37 2-101 1 1 A - MCA C D E CTL A (12-Mar-85) Syndrome 1 A MSQ RAM Callout Name A E20 E9 E42 MCL EA42 MAX,MCL,E63,E42 MAX,MCL,E63,E42 MAX,MCL,E63,E42 MANUAL MACHINE 28 CHECK STACK 39 37 FRAME FM18 ANALYSIS SPARE 0 E40 - MPZ2 E20 e, 2-102 MANUAL MACHINE CHECK STACK FRAME ANALYSIS EHM ACTION: EHM VECTOR: Standard 
(See Introduction)
CSL ACTION: None
VMS ACTION: Standard (See Introduction)

CHAPTER 3
SBIA MANUAL STACK FRAME ANALYSIS

OVERVIEW

When the VMS SBI Error Handler is called to process either an SBI Alert, an SBI Fault, or an SBI Error, it will generate an SBIA Error Record (Entry Code 13). The VMS Machine Check Handler will also generate an SBIA Error Record for certain MBox Fatal Error Machine Checks. See the VMS MCHK Flow Chart (2 of 5) for the specific conditions under which this will happen.

The VMS Error Handler will append the SBIA Error record to the System Event File ERRLOG.SYS. Thus, the record can be translated using either ANALYZE/ERROR or SPEAR (See Chapter 5, Example 4) and analyzed.

The purpose of this Chapter is to help you analyze and identify the most probable cause of SBIA Error Entries. To do this you will need to translate the entry and extract three registers:

ERRSUM - Referred to as IOA ES in the Translated Entry
SBIERR - Referred to as SBIA ER in the Translated Entry
SBISTS - Referred to as SBIA FS in the Translated Entry

Match the contents of each register with the corresponding registers shown below. Underneath each Error Status bit in the registers below is an alphanumeric index (e.g., A-08). Use the index and Table 1 to identify the page in this Chapter that describes an error scenario that would produce an error record similar to the one you are analyzing. Review the scenario. It describes the most common conditions under which that type of error will occur. It also suggests the most probable cause of the error.

ERRSUM (IOA ES)

(Register diagram: ERRSUM, Error Summary Register, 2008 0008, 2208 0008)
NOTE: BITS <31:24> AND <22:19> AND <17:16> READ ONLY AS ZEROS

SBIERR (SBIA ER)

(Register diagram: SBIERR, SBI Error Register, 2008 0034, 2208 0034)
NOTE: BITS <31:16> READ ONLY (UNDEFINED)
      BITS <15:13>, <9>, AND <07:00> READ ONLY AS ZEROS

Table 1 Index of SBIA and SBI Error Scenarios

Index   Error Condition                                                 Page
A-01    SBIA Detected ABus Address/Data PE on CPU Reference             3-4
A-02    SBIA Detected ABus Control PE on CPU Reference                  3-6
A-03    SBIA Detected Address Error on CPU Reference                    3-8
A-04    SBIA Detected State Machine PE                                  3-10
A-05    SBIA Detected ABus Data PE on DMA Read Data                     3-12
A-06    SBIA Detected ABus Control PE on DMA Read Data (Mask/Status)    3-13
A-07    SBIA Detected DMA Errors                                        3-14
A-08    SBIA DMA Interlock Timeout                                      3-16
A-09    SBIA Detected SBI Timeout Error on CPU Reference                3-17
A-10    SBIA Detected SBI Error Confirmation on CPU Reference           3-20
A-11    SBI Parity Fault                                                3-22
A-12    SBI Write Sequence Fault                                        3-24
A-13    SBI Unexpected Read Data Fault                                  3-26
A-14    SBI Interlock Sequence Fault                                    3-28
A-15    SBI Multiple Transmitter Fault                                  3-30
A-01 SBIA Detected ABus Address/Data PE on CPU Reference

OVERVIEW: When the MBox passes a CPU Read Command/Address or a CPU Write Command/Address and Write Data over the ABus to be loaded into the CPU Buffer portion of the DC022 Register File on the SBA Module of the selected SBIA, it sends odd parity to protect the Address during the cycle when the Command/Address is on the ABus, and it sends odd parity (if the Command is a Write) to protect the Data during the following cycle when the Write Data is on the ABus. The Address and the Write Data timeshare the same ABus A/D lines. Parity for the Address is generated in the MBox on the MAP module. Parity for Write Data is produced in the MBox on the MCD Module, which takes the four original byte parity bits which had arrived from the EBox via the WBus accompanying the CPU Write Data and collapses them into one longword parity bit.

The SBIA first unloads the Address and parity bit from the DC022's onto the File Info Bus from where they are loaded into the File Data Latch on the SBS Module for the check (the Address is then passed on to the Command/Address Latch). It then unloads the Write Data (if any) and performs another check in exactly the same way (the Write Data is then passed on to the Write Data Latch). A parity error in either case results in the SBIA aborting the CPU Command and returning ABus CPU BUF ERROR H along with ABus CPU BUF DONE H to the MBox which, in turn, initiates an MBox Fatal Error micro trap in the EBox.

The SBIA latches either parity error in the same status bit, ERRSUM <22>. The two errors are distinguished from each other by ERRSUM <19>, which sets only for errors detected on the Command/Address. This distinction must be made to determine which MBox source (MAP or MCD) produced the data/parity. It also latches the ABus Address in the SBI Timeout Address Register.

Note that Write Data Parity Errors may have originated in the CPU and propagated through the MBox, latching MBox error status in MSTAT1 <7:4> which will indicate a CPU Write Parity Error. If so, the frozen MBox status will be unable to record MSTAT2 <2>, CP IO BUF ERR, although an MBox Fatal Error micro trap will still occur in the EBox, latching EBCS <15>, MBox FE. MSTAT2 <7>, Multiple Error, will also set, and MSTAT2 <20:16> will indicate the I/O Adapter.

ERROR SIGNATURE:

SBIA ERRSUM <22>     = CPU A/D PTY ERR
SBIA ERRSUM <19>     = ERR on CPU C/A (Note 1)
SBIA ERRSUM <23>     = CPU BUF Error Lock (Note 2)
SBIA TOADR  <27:00>  = ABus Address (a longword address)
MSTAT2      <2>      = CP IO BUF ERR
EBCS        <15>     = MBox FE

Notes
1. If set, the A/D PE was on the Address, not the Write Data.
2. Locks ERRSUM <31:26> and the SBIA TOADR Register.

NOTE
MEAR, MSTAT2 <20:16> (PAMM Code), and MSTAT1 <17:16> (Selected Adapter) obtained by EHM are not valid. However, before logging the error VMS overwrites MSTAT2 <17:16> with the correct Adapter code.

PROBABLE CAUSE:

Module            Probability
L0205/MAP         High (if ERRSUM <19> is set)
L0204/MCD         High (if ERRSUM <19> is reset)
L0202/SBS         High
L0203/SBA         High
ABus/Terminator   Low

EHM ACTION: The EHM, entered as the result of an MBox Fatal Error microtrap, rolls back the instructions, builds a stack frame, and vectors to the Machine Check Handler via SCBB+4. See EHM flows for more detail.

VMS ACTION: After appending the SBIA ERRSUM Register to the Machine Check Stack Frame, setting EHMSTS <6> to note that it has done so, and logging the error, VMS checks the trapped instruction against a table of instructions capable of performing multiple I/O reads. If it gets a match, because some I/O Registers can be read only once (hence the instruction is not safe to retry), or if the instruction performs a read/modify/write and the error occurs on the write, VMS will abort the current image.
Otherwise, because the failed reference was aborted by the SBIA before SBI or SBIA internal space could be affected, it REI's. Note that, if the processor mode at the time of the error was Kernel or Exec, as it currently always should be, Image Abort becomes System Fatal Bugcheck.

A-02 SBIA Detected ABus Control PE on CPU Reference

OVERVIEW: When the MBox passes a CPU Read Command/Address or a CPU Write Command/Address and Write Data over the ABus to be loaded into the CPU Buffer portion of the DC022 Register File on the SBA Module of the selected SBIA, it sends odd Control Parity to protect the Command/Length during the cycle when the Command/Address is on the ABus, and it sends odd Control Parity (if the Command is a Write) to protect the Mask/Status sent during the following cycle when the Write Data is on the ABus. The Command/Length and the Mask/Status timeshare the same ABus Control lines. Control Parity is generated in the MBox by the ABS MCA on the MCC module.

The SBIA first unloads the Command/Length and parity bit from the DC022's onto the File Info Bus from where they are loaded into the File Data Latch on the SBS Module for the check (the Command/Length is then passed on to the Command/Address Latch). It then unloads the Mask/Status (if any) and performs another check in exactly the same way (the Mask/Status is then passed on to the Write Data Latch). A parity error in either case results in the SBIA aborting the CPU Command and returning ABus CPU BUF ERROR H along with ABus CPU BUF DONE H to the MBox, which initiates an MBox Fatal Error micro trap in the EBox.

The SBIA latches either parity error in the same status bit, ERRSUM <21>. The two errors are distinguished from each other by ERRSUM <19>, which sets only for errors detected on the Command/Address.
It also latches the ABus Command and Length in ERRSUM <31:26>.

ERROR SIGNATURE:

SBIA ERRSUM <21>     = CPU CNTRL PTY ERR
SBIA ERRSUM <23>     = CPU BUF Error Lock (Note 1)
SBIA ERRSUM <31:28>  = ABus Command Code
SBIA ERRSUM <27:26>  = ABus Command/Length
SBIA ERRSUM <19>     = ERR ON CPU C/A (Note 2)
MSTAT2      <02>     = CP IO BUF ERR
EBCS        <15>     = MBox FE

Notes
1. Locks ERRSUM <31:26> and the SBIA TOADR Register.
2. If set, the CNTRL PE was on the Command/Length, not the Mask/Status.

NOTE
MEAR, MSTAT2 <20:16> (PAMM Code), and MSTAT1 <17:16> (Selected Adapter) obtained by EHM are not valid. However, before logging the error VMS overwrites MSTAT2 <17:16> with the correct Adapter Code.

PROBABLE CAUSE:

Module            Probability
L0220/L0230/MCC   High
L0202/SBS         High
L0203/SBA         High
ABus/Terminator   Low

EHM ACTION: The EHM, entered as the result of an MBox Fatal Error microtrap, rolls back the instructions, builds a stack frame, and vectors to the Machine Check Handler via SCBB+4. See EHM flows for more detail.

VMS ACTION: After appending the SBIA ERRSUM Register to the Machine Check Stack Frame, setting EHMSTS <6> to note that it has done so, and logging the error, VMS checks the trapped instruction against a table of instructions capable of performing multiple I/O reads. If it gets a match, because some I/O Registers can be read only once (hence the instruction is not safe to retry), or if the instruction performs a read/modify/write and the error occurs on the write, VMS will abort the current image. Otherwise, because the failed reference was aborted by the SBIA before SBI or SBIA internal space could be affected, it REI's. Note that, if the processor mode at the time of the error was Kernel or Exec, as it currently always should be, Image Abort becomes System Fatal Bugcheck.
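The ERRSUM bits named in the ERROR SIGNATURE entries for the CPU-reference errors can be decoded mechanically once the register has been extracted from the translated entry. The sketch below is a hedged illustration, not a DEC tool: `decode_errsum`, `field`, and the result keys are hypothetical names, and the bit assignments are limited to those stated in sections A-01 through A-04 (<23>, <22>, <21>, <20>, <19>, <18>, <31:28>, <27:26>).

```python
# Illustrative sketch only. Bit assignments come from the ERROR SIGNATURE
# entries for A-01 through A-04; function and key names are hypothetical.

ERRSUM_FLAGS = {
    23: "CPU BUF Error Lock",
    22: "CPU A/D PTY ERR",
    21: "CPU CNTRL PTY ERR",
    20: "Adrs Err",
    19: "ERR ON CPU C/A",
    18: "STATE MACH PTY ERR",
}


def field(value, hi, lo):
    """Return value<hi:lo> as an unsigned integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_errsum(errsum):
    """Summarize the CPU-reference error bits of an ERRSUM longword."""
    flags = [name for bit, name in sorted(ERRSUM_FLAGS.items(), reverse=True)
             if field(errsum, bit, bit)]
    report = {"flags": flags}
    if field(errsum, 23, 23):
        # CPU BUF Error Lock freezes ERRSUM <31:26>, so these fields
        # describe the failing reference.
        report["command_code"] = field(errsum, 31, 28)
        report["command_length"] = field(errsum, 27, 26)
    on_command_address = bool(field(errsum, 19, 19))
    if field(errsum, 22, 22):
        # A-01: <19> separates an Address error (parity from MAP) from a
        # Write Data error (parity from MCD).
        report["ad_parity_on"] = "Address" if on_command_address else "Write Data"
        report["mbox_source"] = "MAP" if on_command_address else "MCD"
    if field(errsum, 21, 21):
        # A-02: <19> separates Command/Length from Mask/Status.
        report["control_parity_on"] = (
            "Command/Length" if on_command_address else "Mask/Status")
    return report


if __name__ == "__main__":
    sample = (0xA << 28) | (1 << 26) | (1 << 23) | (1 << 22) | (1 << 19)
    print(decode_errsum(sample))
```

The decoder only reads the Command and Length fields when the lock bit is set, since the notes state that ERRSUM <23> is what freezes ERRSUM <31:26>.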
A-03 SBIA Detected Address Error on CPU Reference

OVERVIEW: When a CPU Read or Write request arrives at the MBox, the physical address being referenced is used to address the PAMM (Physical Address Memory Map RAMs located on the MAP Module). If the resulting PAMM Code selects an ABus Adapter, the MBox takes the CPU Command/Address and any Write Data and sends them over the ABus to be loaded into the CPU Buffer portion of the DC022 Register File on the SBA Module of the selected SBIA.

The SBIA unloads the Command/Address from the DC022's onto the File Info Bus, from where it is loaded into the File Data Latch on the SBS Module for a parity check. The Command/Address is then passed on to the Command/Address Latch where the address is decoded. If the address is an Adapter Local Address (not for the SBI) but is referencing a nonexistent adapter register, then the SBIA aborts the CPU Command and returns ABus CPU BUF ERROR H along with ABus CPU BUF DONE H to the MBox, which initiates an MBox Fatal Error microtrap in the EBox.

Note that the SBIA will also detect an Address Error on CPU initiated SBI transactions when Control/Status <30>, Enable SBI Cycles Out, is reset. Control/Status <30> is normally set by the Console during boot.

The SBIA latches Address Error in ERRSUM <20>. It also latches the ABus Address in its TOADR Register.

ERROR SIGNATURE:

    SBIA ERRSUM <20>     =  Adrs Err
    SBIA ERRSUM <23>     =  CPU BUF Error Lock (Note 1)
    SBIA ERRSUM <19>     =  ERR ON CPU C/A
    SBIA TOADR <27:00>   =  the ABus Address (A Longword Address)
    MSTAT2 <02>          =  CP IO BUF ERR
    EBCS <15>            =  MBox FE

Note

1. Locks ERRSUM <31:26> and the SBIA TOADR Register.

NOTE
MEAR, MSTAT2 <20:16> (PAMM Code), and MSTAT1 <17:16> (Selected Adapter) obtained by EHM are not valid. However, before logging the error, VMS overwrites MSTAT2 <17:16> with the correct Adapter Code.

PROBABLE CAUSE:

    Module              Probability
    Software            High
    L0202/SBS           Medium
    L0205/MAP           Low
    L0203/SBA           Low

EHM ACTION: The EHM, entered as the result of an MBox Fatal Error microtrap, rolls back the instructions, builds a stack frame, and vectors to the Machine Check Handler via SCBB+4. See EHM flows for more detail.

VMS ACTION: After setting EHMSTS <05> to note that a full SBIA log is to follow, VMS logs the error and aborts the current image. Note that, if the processor mode at the time of the error was Kernel or Exec, as it currently always should be, Image Abort becomes System Fatal Bugcheck.

A-04 SBIA Detected State Machine PE

OVERVIEW: A parity protected State Machine on the SBA module oversees all CP to I/O references which transmit to the SBI. It steps through the "states" of the SBI protocol: ARB for the SBI (and start the SBI Timeout Counter), transmit Command (and perhaps Write Data), await ACK(s), await Read Data (and restart Timeout Counter), ACK Read Data, and return to Idle. In addition to the usual SBI Reads and Writes, it is invoked for Interrupt Summary Read and Quad Clear, with all four functions stepping through some different "states".

If a State Machine parity error is detected during an active CP to I/O reference, the reference is aborted (an SBI Fault may result), and a CP IO BUFF ERR is reported to the MBox. But a parity error can also occur just as the reference finishes or while the State Machine is in the Idle State. To ensure that these errors are also reported, an SBIA interrupt is requested. (This request may occur in parallel with the MBox Fatal Error microtrap which was caused by the CP IO BUFF ERR.)

ERROR SIGNATURE:

    SBIA ERRSUM <18>     =  STATE MACH PTY ERR
    MSTAT2 <02>          =  CP IO BUF ERR (Note 1)
    EBCS <15>            =  MBox FE (Note 1)

Note

1. If the error is detected during a CP to I/O reference.

NOTE
MEAR, MSTAT2 <20:16> (PAMM Code), and MSTAT1 <17:16> (Selected Adapter) obtained by EHM are not valid. However, before logging the error, VMS overwrites MSTAT2 <17:16> with the correct Adapter Code.

PROBABLE CAUSE:

    Module              Probability
    L0203/SBA           High
    L0202/SBS           Low

EHM ACTION: Unless the error occurs during a CP to I/O reference (which causes the usual Machine Check Stack Frame to be built because of the CP IO BUFF ERROR), the EHM is not entered. Instead, the error is reported by an adapter interrupt.

VMS ACTION: For Machine Checks, after appending the SBIA Error Summary Register to the Machine Check stack frame and setting EHMSTS <6> to note that it has done so, VMS logs the error and aborts the current image. Note that Image Abort, if the processor mode at the time of the error was Kernel or Exec, as it currently always should be, becomes System Fatal Bugcheck. For errors reported by adapter interrupt, after making a full log of SBIA status Registers, VMS takes a System Fatal Bugcheck.

A-05 SBIA Detected ABus Data PE on DMA Read Data

OVERVIEW: The MBox provides odd parity for each longword of DMA Read Data that it loads over the ABus into the DC022 Register File on the SBA module. For Cache Reads this longword parity is produced by XORing the four Cache byte parity bits. For Array Reads longword parity is generated by the ECC MCA. All such parity production occurs on the MCD module. For Cache Data Parity Errors and ECC Uncorrectable Array Errors the MBox provides an all zeros longword and a zero parity bit. A signal from the MCC module controls driving the longword and parity onto the ABus.

The SBIA unloads the Control Field and the Parity Bit from the DC022's onto the File Info Bus; from there the Control Field and Parity Bit are loaded into the File Data Latch on the SBS module for the check.
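The parity convention in the overview above can be illustrated with a short sketch. It assumes the usual meaning of odd parity (the 32 data bits plus the parity bit together contain an odd number of ones); the function names are hypothetical. Under that assumption, the all zeros longword with a zero parity bit that the MBox supplies for bad Cache or Array data fails the check, so the error is carried forward to the SBIA.

```python
def odd_parity_bit(longword):
    """Return the parity bit that gives a 32-bit longword odd parity.

    Assumes "odd parity" means data bits plus parity bit hold an odd
    number of ones; the name is illustrative only.
    """
    ones = bin(longword & 0xFFFFFFFF).count("1")
    return 0 if ones % 2 == 1 else 1


def odd_parity_ok(longword, parity_bit):
    """Check a longword and its accompanying parity bit for odd parity."""
    ones = bin(longword & 0xFFFFFFFF).count("1") + (parity_bit & 1)
    return ones % 2 == 1


if __name__ == "__main__":
    data = 0x00000003
    print(odd_parity_ok(data, odd_parity_bit(data)))  # good data with generated parity
    print(odd_parity_ok(0x00000000, 0))               # all zeros longword, zero parity bit
```

The second call models the substitute longword described above: zero data with a zero parity bit has an even count of ones, so a checker using this convention reports a parity error.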
When a parity error is detected, the SBIA generates good parity and sends the bad data with good parity along with Read Data Substitute (RDS) to the reading Nexus. The SBIA will post an interrupt request and thus notify the CPU of the parity error.

Depending on which one of the four DMA buffers in the DC022 Register File is being used, the parity error is latched as one of four status bits in the SBIA Error Summary Register. This latching locks the entire error field associated with the active buffer in the Error Summary Register. It also locks the DMAx Command/Address and DMAx ID Registers (where x corresponds to the active buffer). This preserves the SBI Command/Address associated with the error along with the SBI ID of the Nexus which issued that Command/Address.

ERROR SIGNATURE:

    SBIA ERRSUM <14,10,6,2>  =  SBIA Detect A/D
                                (Which bit is set depends on the DC022 Buffer being used.)

PROBABLE CAUSE:

    Module              Probability
    L0204/MCD           High
    L0203/SBA           High
    L0202/SBS           High
    ABus/Terminator     Low
    L0220/L0230/MCC     Low

EHM ACTION: Except for Cache Data Parity Errors and ECC Uncorrectable Array Errors (which cause a Machine Check), the EHM is not entered. Instead, the error is reported by an IPL 1C adapter interrupt.

VMS ACTION: VMS logs the error, clears the adapter status, and continues. Retries are left to the Device Driver.

A-06 SBIA Detected ABus Control PE on DMA Read Data (Mask/Status)

OVERVIEW: In response to a DMA Read the MBox always sends an ABus Control Field consisting of an all zeros Status Sub-field, an all zeros Mask Sub-field, and Odd Parity along with each longword. The Control Field and the longword are loaded into the DC022 Register File on the SBA Module. Although the SBIA does not use the Control Field, it does check the field for Odd Parity. The ABS MCA on the MCC module generates the Control Parity.
The SBIA unloads the Control Field and Parity Bit from the DC022's onto the File Info Bus; from there the Control Field and Parity Bit are loaded into the File Data Latch on the SBS module for the check. When a parity error is detected, the SBIA generates good parity and sends the bad data with good parity along with Read Data Substitute (RDS) to the reading Nexus. The SBIA will post an interrupt request and thus notify the CPU of the parity error.

Depending on which one of the four DMA buffers in the DC022 Register File is being used, the parity error is latched as one of four status bits in the SBIA Error Summary Register. This latching locks the entire error field associated with the active buffer in the Error Summary Register. It also locks the DMAx Command/Address and DMAx ID Registers (where x corresponds to the active buffer). This preserves the SBI Command/Address associated with the error along with the SBI ID of the Nexus which issued that Command/Address.

ERROR SIGNATURE:

    SBIA ERRSUM <13,9,5,1>  =  SBIA Detect Cntrl
                               (Which bit is set depends on the DC022 Buffer being used.)

PROBABLE CAUSE:

    Module              Probability
    L0220/L0230/MCC     High
    L0203/SBA           High
    L0202/SBS           High
    ABus/Terminator     Low

EHM ACTION: The EHM is not entered. Instead, the error is reported by an adapter interrupt.

VMS ACTION: VMS logs the error, clears the adapter status, and continues. Retries are left to the Device Driver.

A-07 SBIA Detected DMA Errors (Command/Length, Address and NXM)

OVERVIEW: The SBIA latches one of four status bits (which bit depends on the active DC022 Buffer) in its Error Summary Register and requests an interrupt whenever the MBox replies with both MCC DMA ERROR H and MCC DMA DONE [N] H in response to one of the SBIA's DMA requests. The MBox does this when it detects one of the following conditions.

1. An ABus Control Parity Error on the "DMA Command/Length"
2. An ABus Address/Data Parity Error on the "DMA Address"
3. A "NXM" on the DMA Address

The MBox aborts the bad DMA command, sets status dependent on the error type, and requests an IPL 1D interrupt.

The latched status bit in the SBIA locks the entire error field in the Error Summary Register associated with the active buffer. It also locks the DMAx Command/Address and DMAx ID Registers (where x corresponds to the active buffer). This preserves the SBI Command/Address which got the error along with the SBI ID of the Nexus which issued that Command/Address.

ERROR SIGNATURE:

    MSTAT1 <20>              =  ABus Cntl PE (A Command/Length PE) (Note 1)
    MSTAT1 <19>              =  ABus Address PE (Note 1)
    MSTAT2 <3>               =  NXM (Note 1)
    MSTAT1 <18>              =  ABus C/A Cycle
    MSTAT1 <29:26>           =  "8" ABus Cycle or "4" ABus Array Write Cycle
    SBIA ERRSUM <12,8,4,0>   =  MBox Detect (Note 2)

Notes

1. Three possible errors.
2. Which bit is set depends on the DC022 Buffer being used.

PROBABLE CAUSE:

    Module              Probability
    L0202/SBS           High
    L0203/SBA           High
    L0205/MAP           High (If Address Parity Error)
    L0220/L0230/MCC     Low (If Command/Length Parity Error)
    ABus/Terminator     High

EHM ACTION: The EHM, which is entered as a result of the MBox interrupt, rolls back the instructions, builds a stack frame, and vectors to the Machine Check Handler via SCBB+4. See EHM flows for more detail.

VMS ACTION: VMS logs the error, clears adapter status (clearing the adapter IPL 1C interrupt), and takes a System Fatal Bugcheck.

A-08 SBIA DMA Interlock Timeout

OVERVIEW: A DMA SBI Interlocked Read directed to VAX 8600/8650 internal memory and accepted (ACK'ed) by an SBIA must be followed by an SBI Interlocked Write within 512 SBI cycles. If this does not happen, the SBIA will detect an Interlock Timeout, drop the ABus Lock Wire if it is driving it (i.e. if the Interlocked Read was accepted by the MBox), latch status, and request an interrupt.
The latched status bit locks the entire error field in the Error Summary Register associated with the DMAI buffer (i.e. Error Summary <3:0>). It also locks the DMAI Command/Address and DMAI ID Registers to preserve the SBI Command/Address which got the error along with the SBI ID of the Nexus which issued that Command/Address.

This timeout can occur, for example, if the MBox does not return MCC ABUS DMA DONE [N] H to the adapter signalling that it has completed the Interlock Read. The requesting Nexus will detect an SBI Read Data Timeout and won't issue the Interlock Write.

ERROR SIGNATURE:

    SBIA ERRSUM <3>  =  DMA INTLK TMOUT

PROBABLE CAUSE:

    Module              Probability
    SBI Nexus           High
    L0202/SBS           Medium
    L0203/SBA           Low
    L0220/L0230/MCC     Low

EHM ACTION: The EHM is not entered. Instead, the error is reported by IPL 1C adapter interrupt.

VMS ACTION: VMS logs the error and continues. Retries are left to the I/O Driver.

A-09 SBIA Detected SBI Timeout Error on CPU Reference

OVERVIEW: When a CPU Read or Write request arrives at the MBox, the physical address being referenced is used to address the PAMM (Physical Address Memory Map RAMs located on the MAP Module). If the resulting PAMM Code selects an ABus Adapter, the MBox takes the CPU Command/Address and any Write Data and sends them over the ABus to be loaded into the CPU Buffer portion of the DC022 Register File on the SBA Module of the selected SBIA.

The SBIA unloads the Command/Address from the DC022's onto the File Info Bus, from where it is loaded into the File Data Latch on the SBS Module for a parity check. The Command/Address is then passed on to the Command/Address Latch where the Address is decoded. If the decoded Address is not a local address (directed to an SBIA Register) or is a local address which allows, when accessed, an SBI function such as Interrupt Summary Read or Quad Clear to be performed, the SBIA State Machine is activated to perform the SBI protocol.
The State Machine starts the 512 SBI cycle Timeout Counter and begins arbitrating for the SBI. If it cannot get it before the timer expires, it aborts the SBI function, latches status, and returns ABus CPU BUF ERROR H along with ABus CPU BUF DONE H to the MBox. If it is able to get the bus (the SBIA is usually assigned a high TR Level (low priority) for these CPU requests), it passes the ABus Command/Address (after translation into SBI format) along with the Mask during the first transmit cycle and any Write Data during a second transmit cycle onto the SBI. (For Quad Clear the SBI Command/Address is derived from the ABus Write Data while the SBI Mask derives from the ABus Mask; there will also be a second Write Data transmit cycle. For Interrupt Summary Read the SBIA itself produces the SBI Command: forced 0's.)

The State Machine then waits for an SBI "ACK" (SBI CNF <1:0> = 01) from the addressed Nexus occurring two SBI cycles after each transmit cycle, signalling acceptance of the Command/Address and any Write Data. If it gets either a No Response (CNF <1:0> = 00) or a Busy (CNF <1:0> = 10) Confirmation to any transmit, it then re-arbitrates for the SBI and transmits the whole thing all over again. If this occurs often enough for the Timer to expire (it is still running), the State Machine will abort the function, latch status, and return ABus CPU BUF ERROR H along with ABus CPU BUF DONE H to the MBox.

If the function being performed is a Read, then after the Command/Address has been ACK'ed the State Machine will restart the Timer and await the Read Data. If the Data does not arrive before the Timer expires, the State Machine will abort, latch status, and return ABus CPU BUF ERROR H along with ABus CPU BUF DONE H to the MBox, initiating an MBox Fatal Error microtrap in the EBox.

Note that if the CPU attempts to access a nonexistent SBI Nexus, the SBIA will detect this error. Therefore software can cause this error.
The SBIA latches an SBI Timeout in SBIERR <12> and the Timeout type in SBIERR <11:10>. It also latches the ABus Address (which, except in the case of Quad Clear, should indicate the SBI address referenced) in its TOADR Register.

ERROR SIGNATURE:

    SBIA SBIERR <12>     =  CPU Timeout
    SBIA SBIERR <11:10>  =  CPU Timeout Status (Note 1)
    SBIA ERRSUM <23>     =  CPU BUF Error Lock (Note 2)
    SBIA TOADR <27:00>   =  the ABus Address (A Longword Address)
    MSTAT2 <02>          =  CP IO BUF ERR
    EBCS <15>            =  MBox FE

Notes

1. 00 = Device No Response on Last Access (includes couldn't get bus)
   01 = Device Busy on Last Access
   10 = Waiting for Read Data
2. Locks ERRSUM <31:26>, SBIERR <12,8> and the SBIA TOADR Register.

NOTE
MEAR, MSTAT2 <20:16> (PAMM Code), and MSTAT1 <17:16> (Selected Adapter) obtained by EHM are not valid. However, before logging the error, VMS overwrites MSTAT2 <17:16> with the correct Adapter Code.

PROBABLE CAUSE:

    Module              Probability
    Software            High
    L0202/SBS           Medium
    Nexus               Medium
    L0203/SBA           Low
    L0220/L0230/MCC     Low
    CPU                 Low

EHM ACTION: The EHM, entered as the result of an MBox Fatal Error microtrap, rolls back the instructions, builds a stack frame, and vectors to the Machine Check Handler via SCBB+4. See EHM flows for more detail.

VMS ACTION: After setting EHMSTS <05> to note that a full SBIA log is to follow, VMS logs the error and aborts the current image. (However, if VMS can determine that the error occurred on a read to a BRRVR in a DW780, it will REI instead.) Note that, if the processor mode at the time of the error was Kernel or Exec, as it currently always should be, Image Abort becomes System Fatal Bugcheck.
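The timeout status codes for this section lend themselves to a small lookup. The sketch below, in Python, assumes the bit positions SBIERR <12> and SBIERR <11:10> and the three code meanings listed in the notes for A-09; the code 11 is not listed there, so the sketch reports it as unlisted rather than guessing. The names decode_cpu_timeout and TIMEOUT_STATUS are hypothetical.

```python
# Meanings of SBIERR <11:10> as listed in the A-09 notes.
TIMEOUT_STATUS = {
    0b00: "Device No Response on Last Access (includes couldn't get bus)",
    0b01: "Device Busy on Last Access",
    0b10: "Waiting for Read Data",
}


def decode_cpu_timeout(sbierr):
    """Return the CPU timeout type from an SBIERR value, or None if
    SBIERR <12> (CPU Timeout) is clear. Illustrative helper only."""
    if not (sbierr >> 12) & 1:
        return None
    code = (sbierr >> 10) & 0b11
    # Code 11 is not described in the source notes.
    return TIMEOUT_STATUS.get(code, "code not listed in the A-09 notes")


if __name__ == "__main__":
    # Hypothetical SBIERR value: timeout latched while awaiting Read Data.
    print(decode_cpu_timeout((1 << 12) | (0b10 << 10)))
```

A "Device No Response" result on a reference to an address where no Nexus is installed points at software, which this section rates as the most probable cause.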
A-10 SBIA Detected SBI Error Confirmation on CPU Reference

OVERVIEW: When a CPU Read or Write request directed beyond the VAX 8600/8650 internal memory arrives at the MBox, the MBox assembles an ABus Command/Address and sends it along with any Write Data and the Write Data Mask over the ABus to be loaded into the CPU Buffer portion of the DC022 Register File on the SBA Module of the selected SBIA. The ABus Command is produced by the ABS MCA on the MCC module. The Write Data Mask is produced by the STAT MCA on the MCC Module from the context and address of the CPU request; the STAT MCA then passes this Mask to the ABS MCA for transmission to the ABus.

The SBIA unloads the Command/Address from the DC022's onto the File Info Bus, from where it is loaded into the File Data Latch on the SBS Module for a parity check. The Command/Address is then passed on to the Command/Address Latch where the Address is decoded. The Write Data and Mask are similarly unloaded and parity checked and are then passed onto the Write Data Latch.

If the decoded Address is not a local address (directed to an SBIA Register) or is a local address which allows, when accessed, an SBI function such as Quad Clear to be performed, the SBIA State Machine is activated to pass the ABus Command/Address/Mask (after translating the Command into SBI format) during the first transmit cycle and any Write Data during the second transmit cycle onto the SBI. (For Quad Clear operations the SBI Command/Address is derived from the ABus Write Data while the SBI Mask, transmitted during both the C/A and the following Write Data cycle, derives from the ABus Mask; there is also a second Write Data transmit cycle.)

The State Machine then waits for an SBI "ACK" (SBI CNF <1:0> = 01) from the addressed Nexus occurring two SBI cycles after the Command/Address/Mask was transmitted, signalling acceptance of the Command. However, if the Nexus, after receiving the Command and detecting good SBI parity, decodes it to be an illegal function, it will return an Error Confirmation (SBI CNF <1:0> = 11) instead of an "ACK". The State Machine will then abort the SBI function, latch status, and return ABus CPU BUF ERROR H along with ABus CPU BUF DONE H to the MBox, initiating an MBox Fatal Error microtrap in the EBox.

Note that SBI Nexuses return an Error Confirmation for other than a longword write to their internal Registers (i.e. the four bit SBI Mask transmitted with the Write Command was not all ones). A DW780 returns an Error Confirmation for any attempt at other than a word or byte access of UNIBUS Space. Therefore software can cause this error.

The SBIA latches an SBI Error Confirmation in SBIERR <08>. It also latches the ABus Address (which, except in the case of Quad Clear, should indicate the Nexus referenced) in its TOADR Register.

ERROR SIGNATURE:

    SBIA SBIERR <08>  =  CP SBI Error Conf
    SBIA ERRSUM <23>  =  CPU BUF Error Lock (Note 1)
    SBIA TOADR        =  the ABus Address (A Longword Address)
    MSTAT2 <02>       =  CP IO BUF ERR
    EBCS <15>         =  MBox FE

Note

1. Locks ERRSUM <31:26>, SBIERR <12,08> and the SBIA TOADR Register.

NOTE
MEAR, MSTAT2 <20:16> (PAMM Code), and MSTAT1 <17:16> (Selected Adapter) obtained by EHM are not valid. However, before logging the error, VMS overwrites MSTAT2 <17:16> with the correct Adapter Code.

PROBABLE CAUSE:

    Module              Probability
    Software            High
    L0202/SBS           Medium
    Nexus               Medium
    L0203/SBA           Low
    L0220/L0230/MCC     Low
    CPU                 Low

EHM ACTION: The EHM, entered as the result of an MBox Fatal Error microtrap, rolls back the instructions, builds a stack frame, and vectors to the Machine Check Handler via SCBB+4. See EHM flows for more detail.

VMS ACTION: After setting EHMSTS <05> to note that a full SBIA log is to follow, VMS logs the error and aborts the current image.
Note that, if the processor mode at the time of the error was Kernel or Exec, as it currently always should be, Image Abort becomes System Fatal Bugcheck.

A-11 SBI Parity Fault

OVERVIEW: There is one SBI field, B <31:00>, protected with even parity by the SBI P1 parity bit. There is also a group of three SBI fields, TAG <3:0>, ID <4:0>, and M <3:0>, protected with even parity by the SBI P0 parity bit. All Nexuses including the SBIA latch these fields and parity check them once every SBI cycle. SBI null cycles, in which no Nexus drives the bus, have all zeroes (good parity) on the bus by default.

When the SBIA detects bad parity (or the SBI Fault Wire is being driven), it latches status in the SBISTS Register, stops updating the SBI Silo (which contains a history of the 16 SBI cycles preceding the Fault), drives the SBI Fault Wire to ensure that the SBI Fault Status remains latched in either the SBISTS Register or in the CNFGR Registers located in the other Nexuses, and requests an interrupt.

If only one Nexus has its Parity Fault bit set, then the trouble is probably in a parity checker in that Nexus. If all Nexuses show a Parity Fault, then the trouble is probably in the Nexus which has Xmitr During Fault set (SBISTS <26> or CNFGR <26>). Also, the Silo (which will continue to record for a few more cycles after the Fault occurs) should contain the TR Level of the Nexus which was transmitting on the SBI during the Fault.
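The isolation rule in the paragraph above can be written down as a short procedure. This is a minimal sketch, assuming that the caller has already collected one status longword per Nexus (SBISTS for the SBIA, CNFGR for the others), that bit 31 is the Parity Fault bit as given in this section's error signature, and that bit 26 is Xmitr During Fault. The function name and the dictionary input are hypothetical. The source does not say what to conclude when some but not all Nexuses show the Fault, so the sketch returns an empty list in that case.

```python
PARITY_FAULT = 1 << 31        # SBISTS <31> / CNFGR <31>
XMITR_DURING_FAULT = 1 << 26  # SBISTS <26> / CNFGR <26>


def parity_fault_suspects(status_by_nexus):
    """Apply the A-11 rule of thumb to a {nexus_name: status_longword} map.

    Returns a list of suspect Nexus names. Illustrative helper only.
    """
    faulted = [name for name, value in status_by_nexus.items()
               if value & PARITY_FAULT]

    if len(faulted) == 1:
        # Only one Nexus saw bad parity: suspect its own parity checker.
        return faulted

    if faulted and len(faulted) == len(status_by_nexus):
        # Every Nexus saw bad parity: suspect whoever was transmitting.
        return [name for name, value in status_by_nexus.items()
                if value & XMITR_DURING_FAULT]

    # No Fault latched, or a mixed picture the text does not cover.
    return []


if __name__ == "__main__":
    # Hypothetical register snapshot, for illustration only.
    snapshot = {
        "SBIA": PARITY_FAULT,
        "DW780": PARITY_FAULT | XMITR_DURING_FAULT,
        "RH780": PARITY_FAULT,
    }
    print(parity_fault_suspects(snapshot))
```

The Silo contents remain the cross-check: the TR Level recorded there should agree with whichever Nexus the rule points to.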
ERROR SIGNATURE:

    SBIA SBISTS <31>      =  Parity Fault
    SBIA SBISTS <23>      =  P1 PTY ERR (Note 1)
    SBIA SBISTS <22>      =  P0 PTY ERR (Note 1)
    SBIA SBISTS <19>      =  Fault Latch (SBIA is driving the Fault Wire) (Note 2)
    SBIA SBISTS <17>      =  SBI Fault Signal (Live state of Fault Wire)
    SBIA SBISTS <16>      =  Fault Silo Lock (A Fault has locked the Silo)
    SBIA SBISTS <26>      =  Xmitr During Fault (Note 3)
    Nexuses CNFGR <31>    =  Parity Fault
    Nexuses CNFGR <26>    =  Xmitr During Fault
    SBIA SBI Silo <15:1>  =  Lowest order true bit is active TR Level (Note 4)
      "    "   "          =  15 more entries

Notes

1. Two different errors.
2. Also indicates valid Fault status.
3. Only set if the SBIA was transmitting.
4. The SBI Silo also records the SBI Tag, ID, and Mask fields. SBI Silo <31> indicates that within 16 cycles after the Silo Fault Lock was cleared another Fault occurred.

PROBABLE CAUSE:

    Module              Probability
    L0202/SBS           High (if transmitter)
    SBI Nexus           High (if transmitter)
    L0203/SBA           Low
    SBI/Terminator      Low

EHM ACTION: The EHM is not entered. Instead, the error is reported by an IPL 1C adapter interrupt.

VMS ACTION: VMS logs the error and continues. Retries are left up to the I/O Driver.

A-12 SBI Write Sequence Fault

OVERVIEW: SBI protocol states that a cycle with an SBI Tag equal to Command/Address which indicates an SBI Write Masked Command (Reg Bus <31:28> = 1101) or Interlock Write Masked Command (Reg Bus <31:28> = 0111) should be followed immediately by a cycle that has the Tag equal to Write Data. Likewise, a cycle which indicates an SBI Extended Write Masked Command/Address should be followed immediately by two cycles of Write Data. This is SBI protocol.
When an SBIA accepts a DMA Masked Write Command that is not followed immediately by the right number of Write Data cycles, it latches a Write Sequence Fault in the SBISTS Register, stops updating the SBI Silo (which contains a history of the 16 SBI cycles preceding the Fault), drives the SBI Fault Wire to ensure that the SBI Fault Status remains latched in either the SBISTS Register or in the CNFGR Registers located in the other Nexuses, and requests an interrupt. Other SBI Nexuses on the Bus will detect the same Write Sequence Fault on Masked Writes if the command is directed toward their (Nexus) address space.

When an SBI Nexus (other than the SBIA) detects this Fault it latches Write Sequence Fault in its CNFGR Register and drives the SBI Fault Wire for one cycle. Upon detecting this, the SBIA latches its SBISTS Register, stops updating the Silo, continuously drives the SBI Fault Wire to preserve Fault status in all Nexuses, and requests an interrupt.

The Silo (which will continue to record for a few more cycles after the Fault occurs) should contain the TR Level of the Nexus which was transmitting on the SBI during the Fault. Either the transmitter (see Xmitr During Fault Status) or the receiver (the Nexus which detected the Fault) should be the problem.

ERROR SIGNATURE:

    SBIA SBISTS <30>      =  Wr Seq Fault (Note 1)
    SBIA SBISTS <26>      =  Xmitr During Fault (Note 2)
    SBIA SBISTS <19>      =  Fault Latch (SBIA is driving the Fault Wire) (Note 3)
    SBIA SBISTS <17>      =  SBI Fault Signal (Live state of Fault Wire)
    SBIA SBISTS <16>      =  Fault Silo Lock (A Fault has locked the Silo)
    Nexuses CNFGR <30>    =  Wr Seq Fault
    Nexuses CNFGR <26>    =  Xmitr During Fault
    SBIA SBI Silo <15:1>  =  Lowest order true bit is active TR Level (Note 4)
      "    "   "          =  15 more entries

Notes

1. If the SBIA detected the Fault.
2. If the SBIA caused the Fault in another Nexus.
3. Also indicates valid Fault status.
4. The SBI Silo also records the SBI Tag, ID, and Mask fields. SBI Silo <31> indicates that within 16 cycles after the Silo Fault Lock was cleared another Fault occurred.

PROBABLE CAUSE:

    Module              Probability
    L0202/SBS           High
    SBI Nexus           High
    L0203/SBA           Low

EHM ACTION: The EHM is not entered. Instead, the error is reported by adapter IPL 1C interrupt.

VMS ACTION: VMS logs the error and continues. Retries are left to the I/O Driver.

A-13 SBI Unexpected Read Data Fault

OVERVIEW: When a Nexus decodes an SBI Tag indicating Read Data and the SBI ID field matches its own ID (all with good SBI parity) but it is not expecting Read Data (it may be mistaken in that belief), it detects an Unexpected Read Data Fault.

When an SBIA detects this Fault (the SBI ID that it compares for is the CPU's, 16) it latches Unexpected Read Data Fault in the SBISTS Register, stops updating the SBI Silo (which contains a history of the 16 SBI cycles preceding the Fault), drives the SBI Fault Wire to ensure that the SBI Fault Status remains latched in either the SBISTS Register or in the CNFGR Registers located in the other Nexuses, and requests an interrupt. Other SBI Nexuses on the Bus will detect the same Unexpected Read Data Fault on SBI Read Data if the command is directed toward their (Nexus) address space.

When an SBI Nexus (other than the SBIA) detects this Fault it latches Unexpected Read Data Fault in its CNFGR Register and drives the SBI Fault Wire for one cycle. Upon detecting this, the SBIA latches its SBISTS Register, stops updating the Silo, continuously drives the SBI Fault Wire to preserve Fault status in all Nexuses, and requests an interrupt.

The Silo (which will continue to record for a few more cycles after the Fault occurs) should contain the TR Level of the Nexus which was transmitting on the SBI during the Fault. The Nexus which has its Xmitr During Fault bit set is probably the problem. However, the detecting Nexus may have simply forgotten that it requested this Data. With luck, the Silo will have recorded that it had done so.

Note that if Read Data is returned after a Nexus is no longer expecting it, perhaps due to a Read Data Timeout, the Nexus will detect Unexpected Read Data Fault.

ERROR SIGNATURE:

    SBIA SBISTS <29>      =  Unexp Rd Dat Fault (Note 1)
    SBIA SBISTS <26>      =  Xmitr During Fault (Note 2)
    SBIA SBISTS <19>      =  Fault Latch (SBIA is driving the Fault Wire) (Note 3)
    SBIA SBISTS <17>      =  SBI Fault Signal (Live state of Fault Wire)
    SBIA SBISTS <16>      =  Fault Silo Lock (A Fault has locked the Silo)
    Nexuses CNFGR <29>    =  Unexp Rd Dat Fault
    Nexuses CNFGR <26>    =  Xmitr During Fault
    SBIA SBI Silo <15:1>  =  Lowest order true bit is active TR Level (Note 4)
      "    "   "          =  15 more entries

Notes

1. If the SBIA detected the Fault.
2. If the SBIA caused the Fault in another Nexus.
3. Also indicates valid Fault status.
4. The SBI Silo also records the SBI Tag, ID, and Mask fields. SBI Silo <31> indicates that within 16 cycles after the Silo Fault Lock was cleared another Fault occurred.

PROBABLE CAUSE:

    Module              Probability
    L0202/SBS           High (if the transmitter)
    SBI Nexus           High (if the transmitter)
    L0203/SBA           Low

EHM ACTION: The EHM is not entered. Instead, the error is reported by adapter interrupt.

VMS ACTION: VMS logs the error and continues. Retries are left to the I/O Driver.

A-14 SBI Interlock Sequence Fault

OVERVIEW: When the SBIA detects a valid SBI Interlock Write (i.e. one with good parity which is directed toward internal memory) that was not preceded by a valid SBI Interlock Read, it latches an Interlock Sequence Fault in the SBISTS Register, stops updating the SBI Silo (which contains a history of the 16 SBI cycles preceding the Fault), drives the SBI Fault Wire to ensure that the SBI Fault Status remains latched in either the SBISTS Register or in the CNFGR Registers located in the other Nexuses, and requests an interrupt. Other SBI Nexuses which are capable of the Interlock function will detect the same Interlock Sequence Fault on Interlock Writes if the command is directed toward their (Nexus) address space.

When an SBI Nexus (other than an SBIA) detects this Fault it latches Interlock Sequence Fault in its CNFGR Register and drives the SBI Fault Wire for one cycle. Upon detecting this, the SBIA latches its SBISTS Register, stops updating the Silo, continuously drives the SBI Fault Wire to preserve Fault status in all Nexuses, and requests an interrupt.

The Silo (which will continue to record for a few more cycles after the Fault occurs) should contain the TR Level of the Nexus which was transmitting on the SBI during the Fault. Either the transmitter (see Xmitr During Fault status) or the receiver (the Nexus which detected the Fault) should be the problem.

Note, however, that the SBIA can also detect this Fault following a DMA Interlock Timeout. The SBIA starts its Interlock Timer as soon as it ACK's an Interlock Read on the SBI. If the MBox (which must arbitrate the SBIA request, and may "hang up" doing so) and the SBIA (which has its own overhead) return the Read Data to the requesting Nexus just in time to prevent the Nexus from detecting a Read Data Timeout, then the Nexus (after an unknown amount of SBI arbitration time) may get off an Interlock Write after the SBIA has already timed out waiting for it (and is no longer expecting it).

ERROR SIGNATURE:

    SBIA SBISTS <28>      =  Intlk Seq Fault (Note 1)
    SBIA SBISTS <26>      =  Xmitr During Fault (Note 2)
    SBIA SBISTS <19>      =  Fault Latch (SBIA is driving the Fault Wire) (Note 3)
    SBIA SBISTS <17>      =  SBI Fault Signal (Live state of Fault Wire)
    SBIA SBISTS <16>      =  Fault Silo Lock (A Fault has locked the Silo)
    Nexuses CNFGR <28>    =  Intlk Seq Fault
    Nexuses CNFGR <26>    =  Xmitr During Fault
    SBIA SBI Silo <15:1>  =  Lowest order true bit is active TR Level (Note 4)
      "    "   "          =  15 more entries

Notes

1. If the SBIA detected the Fault.
2. If the SBIA caused the Fault in another Nexus.
3. Also indicates valid Fault status.
4. The SBI Silo also records the SBI Tag, ID, and Mask fields. SBI Silo <31> indicates that within 16 cycles after the Silo Fault Lock was cleared another Fault occurred.

PROBABLE CAUSE:

    Module              Probability
    L0202/SBS           High
    SBI Nexus           High
    L0203/SBA           Low

EHM ACTION: The EHM is not entered. Instead, the error is reported by adapter IPL 1C interrupt.

VMS ACTION: VMS logs the error and continues. Retries are left to the I/O Driver.

A-15 Multiple Transmitter Fault

OVERVIEW: Each cycle in which a Nexus transmits on the SBI, the SBI ID field latched by the Nexus is read back and compared against the ID that it transmitted (its own ID except when it is transmitting Read Data. If the Nexus is transmitting Read Data the ID will be that of the receiving Nexus). A mismatch (which may simply be a mis-compare) is detected as a Multiple Transmitter Fault. This Fault may or may not be accompanied by a Parity Fault.
When an SBIA detects this Fault, it latches a Multiple Transmitter Fault in the SBISTS Register, stops updating the SBI Silo (which contains a history of the 16 SBI cycles preceding the Fault), drives the SBI Fault Wire to ensure that the SBI Fault Status remains latched in either the SBISTS Register or in the CNFGR Registers located in the other Nexuses, and requests an interrupt.

The other SBI Nexuses can detect the same Multiple Transmitter Fault during their transmit cycles. When an SBI Nexus (other than the SBIA) detects this Fault it latches Multiple Transmitter Fault in its CNFGR Register and drives the SBI Fault Wire for one cycle. Upon detecting this the SBIA latches its SBISTS Register, stops updating the Silo, continuously drives the SBI Fault Wire to preserve Fault status in all Nexuses, and requests an interrupt.

The Silo (which will continue to record for a few more cycles after the Fault occurs) should contain the TR Level(s) of the Nexus(es) which was transmitting on the SBI during the Fault. If more than one Nexus has its Xmitr During Fault bit set, then the one with the highest (i.e. lowest priority) TR level is the problem (assuming that the Silo reveals that the higher priority Nexus was entitled to be on the bus at the time of the Fault). If only one Nexus has its Xmitr During Fault bit set, then that Nexus is the problem.

ERROR SIGNATURE:

SBIA SBISTS <27>      Multi Xmit Fault (Note 1)
SBIA SBISTS <26>      Xmitr During Fault (Note 2)
SBIA SBISTS <19>      Fault Latch (SBIA is driving the Fault Wire) (Note 3)
SBIA SBISTS <17>      SBI Fault Signal (Live state of Fault Wire)
SBIA SBISTS <16>      Fault Silo Lock (A Fault has locked the Silo)
Nexuses CNFGR <27>    Multi Xmit Fault
Nexuses CNFGR <26>    Xmitr During Fault
SBIA SBI Silo <15:1>  Lowest order true bit is active TR Level (Note 4)
                      15 more entries

Notes

1. If the SBIA detected the Fault.
2. If the SBIA was transmitting.
3. Also indicates valid Fault status.
4. The SBI Silo also records the SBI Tag, ID, and Mask fields. SBI Silo <31> indicates that within 16 cycles after the Silo Fault Lock was cleared another Fault occurred.

PROBABLE CAUSE:

Module       Probability
L0202/SBS    High
SBI Nexus    High
L0203/SBA    Low

EHM ACTION: The EHM is not entered. Instead, the error is reported by the adapter IPL 1C interrupt.

VMS ACTION: VMS logs the error and continues. Retries are left to the I/O Driver.

CHAPTER 4
CONSOLE MESSAGES

Console Messages (Version 9.0)

OVERVIEW

This section describes console messages for Version 9.0 of the console software. (To determine which version of the console software, EDOBA, is running, execute the SHOW VERSION console command.) It begins with a description of the console message format followed by a list of tables. The tables identify the program or routine running at the time the message was printed and the type of message (error, warning, etc). Finally the tables list, in alphabetical order, all known messages and what they mean. In many cases the message description will also include a statement explaining how you might respond to the message.

CONSOLE MESSAGE FORMAT - Console Messages consist of four parts. For example:

?CSM-F-ACCVIO ACV Condition

 CSM            Calling Routine
 F              Severity Code
 ACCVIO         Message ID
 ACV Condition  Message Text

Part 1: Calling Routine - begins with a question mark and includes the three character name of the routine that initiated the message. For example, in the above message the Console Support Microcode (CSM) initiated the message print out. The following is a list of Console routines that display messages.

o CSM - Console Support Microcode
o DCN - General Console
o DCP - Diagnostic Console
o ECR - Error Correction Routine
o EMM - Environmental Monitor Module
o HEX - Hexadecimal Debugger
o MCP - Macro Control Program

Part 2: Severity Code - consists of a letter (E, F, W, or I) enclosed in hyphens. The letter indicates the general severity of the condition.

E = Error - Indicates that the routine printing the message is responding to a device or hardware error of some sort.

F = Fatal - Indicates that the routine printing the message either detected an internal consistency error (e.g., a bad parameter passed to a routine) or a totally unexpected or unserviceable error (e.g., errors from RT).

W = Warning - Indicates that the routine printing the message was able to complete some, but not all, of the operation. You should check the result before proceeding.

I = Information - Indicates that the routine printing the message completed the operation successfully and that you should be aware of the information that follows the message header.

Part 3: Message ID - is a six letter mnemonic that identifies the message.

Part 4: Message Text - is the text of the message. It consists of a line of text that explains the reason for the message.

CONSOLE MESSAGE TABLES

The Console Messages have been defined and organized into the following tables.

Table 1   (CSM) Console Support Microcode Fatal Messages
Table 2   (DCN) General Console Error Messages
Table 3   (DCN) General Console Fatal Messages
Table 4   (DCN) General Console Information Messages
Table 5   (DCN) General Console Warning Messages
Table 6   (DCP) Diagnostic Console Error Messages
Table 7   (DCP) Diagnostic Console Fatal Messages
Table 8   (DCP) Diagnostic Console Information Messages
Table 9   (DCP) Diagnostic Console Warning Messages
Table 10  (ECR) Error Correction Routine Error Messages
Table 11  (EMM) Environmental Monitor Module Error Messages
Table 12  (EMM) Environmental Monitor Module Fatal Messages
Table 13  (HEX) Hexadecimal Debugger Warning Messages
Table 14  (MCP) Macro Control Program Error Messages
Table 15  (MCP) Macro Control Program Fatal Messages
Table 16  (MCP) Macro Control Program Information Messages
Table 17  (MCP) Macro Control Program Warning Messages
Table 18  MCPSNP.LST (KAF Snap Shot Routine) Messages

NOTE - Refer
to the Console Software Specification for further information and specific details on console operation.

Table 1  Console Support Microcode (CSM) Fatal Messages

Header          Message

?CSM-F-ACCVIO   ACV Condition - The Console does not have access to part or all of the data requested. By raising the PSL CUR MODE Field Encoding and issuing the command again it may be possible to get a successful response. Also try using the physical address with memory management turned off. If the problem persists there may be a fault in the Access RAM.

?CSM-F-BADIPR   Bad IPR Number - The IPR "Number" argument is currently unassigned.

?CSM-F-BDCKSM   Bad Packet Checksum - The computed checksum of the RBUF Packets does not match the Console checksum. This could be a Dual Port RAM, CBus or EBox Microcode problem. Try re-initializing the CPU. If that doesn't work run the MHC diagnostic.

?CSM-F-BDVADR   Bad Virtual Address - The virtual address either has no translation or a bad translation.

?CSM-F-CF64KB   Can't Find 64KB - CSM was unable to find a good 64KB section of memory. There is a good chance that the MBox/Array either has a serious problem or it has not been initialized properly.

?CSM-F-CFDRPB   Can't Find RPB - CSM was unable to find the Restart Parameter Block (RPB) in memory. Either that section of memory was overwritten or the BBU power to support the Array Refresh is switched off or the 10 minute (BBU) Array Refresh limit expired. In any case a cold re-start is necessary.

                Note: This is a normal response if, after a power up, the Restart Control Switch is in the Restart Boot or Restart Halt position.

?CSM-F-HERABT   Hardware Error Abort - Something went wrong during the execution of a CSM command. The Machine Check Stack Frame built by the EHM (ESC: <17:2F>) should provide information about the error. If the error persists run the Micro-Diagnostics.
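
The four-part message format described under CONSOLE MESSAGE FORMAT can be sketched as a small parser. This is a minimal illustration, assuming the header is always `?`, a three-letter routine name, a severity letter in hyphens, and a mnemonic of up to six letters or digits; the function name `parse_console_message` and the returned dictionary keys are hypothetical and are not part of the console software.

```python
import re

# Routine names and severity letters as listed in this chapter.
ROUTINES = {
    "CSM": "Console Support Microcode",
    "DCN": "General Console",
    "DCP": "Diagnostic Console",
    "ECR": "Error Correction Routine",
    "EMM": "Environmental Monitor Module",
    "HEX": "Hexadecimal Debugger",
    "MCP": "Macro Control Program",
}
SEVERITIES = {"E": "Error", "F": "Fatal", "W": "Warning", "I": "Information"}

# ?<routine>-<severity>-<message id> <message text>
_HEADER = re.compile(r"^\?([A-Z]{3})-([EFWI])-([A-Z0-9]{1,6})\s+(.*)$")


def parse_console_message(line):
    """Split one console message line into its four parts.

    Returns a dict, or None when the line does not look like a
    console message header.
    """
    match = _HEADER.match(line.strip())
    if match is None:
        return None
    routine, severity, message_id, text = match.groups()
    return {
        "routine": routine,
        "routine_name": ROUTINES.get(routine, "unknown"),
        "severity": severity,
        "severity_name": SEVERITIES[severity],
        "message_id": message_id,
        "text": text,
    }
```

A parser like this could be used to sort a captured console log by calling routine or severity before looking each message up in the tables that follow.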
Table 2  General Console (DCN) Error Messages

Header          Message

?DCN-E-BA11PF   BA11 Power Failure - The Console detected a BA11 power failure when it read the status of the RL02. Check power at the BA11.

?DCN-E-CSPERR   "RAM ID" Control Store Parity Error - While in Console I/O Mode a CS/DRAM Parity Error was detected in the RAM specified. Use the "VERIFY" Command (in Debug context) to determine the "good" and "bad" data and take the appropriate corrective action.

                If the Console is in Program I/O Mode the Control Store Parity Error is Uncorrectable. Following this the Console will print the reason that the Error is Uncorrectable.

?DCN-E-DVCERR   Read/Write Disk Error - A Read/Write error was detected while reading or writing a file on the RL02. Examine the RLV12 Control and Status Registers for specific details. If the error is unique to a specific file re-install the file from tape or, if necessary, replace the pack.

?DCN-E-FULERR   "device name" Device Full - The RL02 is full. This may be due to fragmented files. Some files may need to be deleted or the Pack may need to be rebuilt.

Table 3  General Console (DCN) Fatal Messages

Header          Message

?DCN-F-ACCERR   "filename" Illegal Access - The user does not have the privileges necessary to access the file specified. (e.g., the file may be Read only protected.)

?DCN-F-CHNERR   Read/Write I/O Channel Invalid - (Software Problem)

?DCN-F-EIARG    Invalid Argument on "aaaaaa" Call - (Software Problem)

?DCN-F-EOFERR   Read/Write End Of File - The Console software encountered an unexpected EOF while reading a file. There is something wrong with the file format. Restore the file from tape.

?DCN-F-INVPD    Invalid Parameter Detected - (Software Problem)

?DCN-F-TBLERR   Table Build Fails - This is a fatal Console software error. Re-boot the system.
?DCN-F-TIMEXP   Timer Either Expired or Not Set - (Software Problem)

?DCN-F-TIMMAX   Maximum Number of Timers Set - (Software Problem)

?DCN-F-TRPERR   Trap at PC xxxxxxx - This is a fatal Console software error. Re-boot the system.

Table 4  General Console (DCN) Information Messages

Header          Message

?DCN-I-CCABRT   ^C abort - The user typed Control C (^C) which interrupted the command in progress.

?DCN-I-CLKNLK   CPU Clock Frequency Unstable (Not Locked) - The System Clock is more than 60 nanoseconds out of phase with the 1MHZ Reference. This message will only occur for Rev C5 Clock Modules. Either change to a more stable clock frequency or use the command "SET CLOCK FREQ QUIET" to suppress the message.

?DCN-I-CSLFAL   "ram file" Already Loaded - The file specified has already been loaded in the Control Store or Dispatch RAM.

?DCN-I-RTYCON   Remote Terminal Connected - Indicates that the Remote Terminal is connected to the Console. This message will be echoed on both the Local and Remote terminals.

?DCN-I-RTYDIS   Remote Terminal Disconnected - Indicates that the Remote Terminal has disconnected from the Console.

?DCN-I-VENRNG   KA860.REV/KA865.REV not found or invalid - Either the KA860.REV/KA865.REV file does not exist or it was over written. In either case the Console will not be able to check for the proper microcode versions when it loads the CS/DRAMs. As this is not a Fatal Error it will not inhibit the Console from booting the system.

?DCN-I-VERINC   "ram file" Ver. Incorrect, Should Have Ver. - The CS/DRAM file just loaded is not compatible with the revision level of the system (as specified in the RL02 file: KA860.REV/KA865.REV). This message will also be displayed if you specify an incorrect file name. (E.g., typing the Console command "LOAD/ECS CSM040" will result in this message because CSM040 is a CSM Overlay and not considered part of the Main EBox Microcode load file. The Console will still, however, load the file but you should beware of erroneous responses.)

?DCN-I-VFYERC   "ram file" Verify Found "n" Microwords Bad - During CS/DRAM verification (using the "VERIFY" command) "n" microwords were found to have errors. Note: the maximum error count displayed will be 256. If the errors persist then run the MHC Diagnostic to isolate the fault and then take the appropriate corrective action. See Note 1.

?DCN-I-VFYERR   "ram file" Verification Fails - If the Console is in "Debug Mode" during CS/DRAM verification this message will be displayed followed by the Good/Bad Data for the first 256 errors. See Note 1.

Note

1. The "VERIFY" command must be executed in Debug context in order to display the "Good/Bad" Data.

   >>>> VERIFY<cr>   or   DC>>> VERIFY<cr>

   Otherwise only the total number of errors will be displayed.

Table 5  General Console (DCN) Warning Messages

Header          Message

?DCN-W-CLKRER   Command Invalid with CPU Clock Running - The CPU Clock must be stopped (i.e., STOP CPU) in order for the command to execute properly.

?DCN-W-COMABT   Command Procedure Aborted - The Abort Flag was set and an error was detected while executing a command file. For troubleshooting purposes you can bypass this condition (and force the console to execute the command file anyway) by using the "SET ABORT OFF" command. See ?DCN-W-COMCWE below.

?DCN-W-COMCWE   Command Procedure Completed with Errors - The Abort on Error switch was not set when an error was detected while executing the command file. The Console continued to execute the command file, but the results are unpredictable.

?DCN-W-COMERR   Nesting Depth Exceeded - The maximum number of nested command files (4) has been exceeded.

?DCN-W-CSLERR   "ram file" Load/Verify Failure (file contents bad) - The format of the "ram file" specified does not match the format of the CS/DRAM being loaded. Entering a command such as the following would cause this message: "LOAD/ECS MCF.BPN<cr>", because the MCF RAM data has a different format than the EBox CS.

?DCN-W-EIRID    Reg ID Undefined - The Register ID that the user supplied as a command argument does not exist. Use the "SHOW REGISTERS" command to display all defined registers and register IDs.

?DCN-W-EISID    SDB_ID Not Found in CAD Tables - The SDB_ID that the user supplied as a command argument does not exist in the CAD Tables. See Note 1.

?DCN-W-EUREG    Register Name Undefined - The Register Name that the user supplied as a command argument has not been defined. Use the "SHOW REGISTERS" command to display all defined registers and register IDs.

?DCN-W-EUSIG    Signal Name Not Found in CAD Tables - The SDB Signal Name that the user supplied as a command argument does not exist in the CAD Tables. See Note 1.

?DCN-W-EUSYM    Symbol Name Not Found in CAD Tables - The VS$ Symbol that the user supplied as a command argument does not exist in the CAD Tables. See Note 1.

?DCN-W-FILNAM   "aaaaaa" File Not Found - The file specified does not exist on the RL02.

?DCN-W-INVCLK   Command Invalid for this Version Clock Module

?DCN-W-INVRMT   Command Invalid from Remote Terminal - The last command entered cannot be executed from a remote terminal (e.g., changing the Baud Rate).

?DCN-W-INVXTL   Unassigned Crystal Mnemonic

?DCN-W-OPNERR   "device name" Too Many Files Opened - The maximum number of files that can be opened at the same time (4) has been exceeded.

?DCN-W-PARSER   Ambiguous Command - Two or more commands match the command abbreviation. Be more specific.

?DCN-W-PARSER   Invalid Command - Either the command was entered in the wrong context or the command does not exist. Check for proper context and spelling.

?DCN-W-USESTP   Use INIT/POWER Command to Initialize EMM

Note

1. Check the CDF860.DAT/CDF865.DAT file to be sure it contains the right CAD File names for the revision of your machine. Use the command:

   SHOW CONFIG/ASCII<cr>

Table 6  Diagnostic Console (DCP) Error Messages

Header          Message

?DCP-E-ALIVEE   DSM Alive Failure - This message is similar to the one below, however, it will only occur if PASSES=0. Generally it indicates that you are running the micro diagnostics out of sequence. Execute @TSTCPU.

?DCP-E-ALIVEE   DSM Alive Failure in Test xx - You should only get this message after you have issued a "START" command to a diagnostic. It means that the diagnostic should have finished running its current test, but has not. The microcode may be hung, or the test may have gotten into an infinite loop. All occurrences of this failure should be reported to Diagnostic Engineering if they occur during the execution of @TSTCPU.

?DCP-E-BADCHK   Bad Chksum - After 6 retries, the computed checksum still did not match the checksum sent with the DSM message packet. This could be a Dual Port RAM, CBus or EBox Microcode problem. Try re-initializing the CPU. If that doesn't work run the MHC diagnostic.

?DCP-E-CONDER   Invalid Conditional Statement, failed to isolate - See Note 1

?DCP-E-DSMVRS   Wrong Version of DSM Loaded

?DCP-E-EOTBLE   End of Set Data Table Space - The user has issued more than 16 (Max) "Set Data" commands.

?DCP-E-ILLDSM   Invalid DSM Function Code - See Note 1.

?DCP-E-INVDAT   Invalid SDB data, failed to isolate - See Note 1.

?DCP-E-INVDCB   Invalid .DCB isolation file - See Note 1.

?DCP-E-INVDCI   Invalid .DCI isolation file - See Note 1.

?DCP-E-INVDID   Invalid ID, failed to isolate - See Note 1.

?DCP-E-INVIND   Invalid Set Data Index - Indicates that either an Isolation Routine or the associated .COM file was mis-read or over written. Re-boot the system and try again. If that fails then try a different pack.
The pack you are using may have to be rebuilt.

?DCP-E-NOANSD   DSM-DC communication failure - When you see this message, the EBox microsequencer has stopped listening to the Console. This could happen because of a programming fault, because the hardware is not initialized properly or because the hardware is broken. Follow the procedure outlined under ?DCP-E-ALIVEE.

?DCP-F-STSHIN   Invalid Stash Data Type - This message should never occur. If it does it indicates that the DCI File is bad.

?DCP-E-UFLTDP   Unexpected Fault Detected at End of Pass - See Note 2

?DCP-E-UFLTDT   Unexpected Fault Detected, Fatal CPU Error - See Note 2

?DCP-E-UMICTP   Unexpected Micro Trap, Fatal CPU Error - See Note 2

?DCP-E-UMICTP   Unexpected Micro Trap in Test xx at Vector xx - You should get this message only after you have started a microdiagnostic. It means that there is something wrong in the hardware that is causing Microtraps in the EBOX that the current test has not requested or tried to force. If you see this message it means that a fault is in the machine that should have been caught by a previous diagnostic, or that the machine has not been initialized properly.

                What you should do:

                1.  Enable HARDCOPY if you have a hardcopy terminal available
                2.  Type "SHOW Switches"
                3.  Type "SHOW Data"
                4.  Type "Examine/WBUS 6"
                5.  Type "Examine/WBUS 7"
                6.  Type "Examine/WBUS 9"
                7.  Type "Examine/WBUS 11"
                8.  Type "Examine/WBUS 12"
                9.  Type "Examine/WBUS 13"
                10. Type "START" - This will cause the tests to run again. It will indicate whether the problem was a spurious one time event, or an initialization or setup problem in the test microcode.

?DCP-E-UNPEOP   Unexpected Pause at End of Pass - See Note 2

?DCP-E-UNPEOT   Unexpected Pause at End of Test - See Note 2

?DCP-E-UNXEOD   Unexpected End of Dispatch Table - See Note 2

?DCP-E-VERNDM   Version Numbers Do Not Match

Note

1. This is most likely a software problem (DSM, DC, or the isolation file). Check to make sure that the CPU revision level matches the revision level in KA860.REV/KA865.REV on the RL02.

2. These messages should never occur unless you are micro stepping a diagnostic. If they occur during micro stepping they should be considered as informational Messages. That is, the Console Software is keeping you informed about the actions of DSM.

Table 7  Diagnostic Console (DCP) Fatal Messages

Header          Message

?DCP-F-LDFAIL   DSM Load Failure - DC was either unable to load DSM or unable to start DSM. Re-initialize the CPU and DC. If the problem persists run the Micro-Hard-Core Diagnostic.

Table 8  Diagnostic Console (DCP) Information Messages

Header          Message

?DCP-I-BADMWD   Stop on uMark Bit - A micromark bit was detected while running diagnostics.

?DCP-I-FAPAUS   Fault Detected, Pausing... - The user set the /Fault:pause switch and the diagnostic detected a fault.

?DCP-I-PAUSEI   Pausing... - Either the user typed Control P (^P) or Fault is set to "NOABORT" or "PAUSE".

?DCP-I-SWDATA   Set Switch Data Command Not Invoked - The user must enter a "Set Data" command prior to entering a "Show Data" command.

Table 9  Diagnostic Console (DCP) Warning Messages

Header          Message

?DCP-W-CLKSTP   Clock Not Running - The CPU Clock is stopped and must be started for the command to execute properly.

?DCP-W-DIASTA   Diags Not Started - The user issued a "Continue" or "Step" command before starting the diagnostic.
?DCP-W-LMTRUN   Limit of Set Data ASCII Text, Truncate - The user has exceeded 30 (Max) ASCII characters in a string.

?DCP-W-FAILIS   Isolation Algorithm Failed to Isolate - There is a problem in the isolation file such that DC is unable to execute the isolation algorithm.

?DCP-W-ISOEOF   End of Isolation File Encountered - DC unexpectedly encountered an EOF in an Isolation File. There is something wrong with the Isolation File.

Table 10  Error Correction Routine (ECR) Error Messages

Header          Message

?ECR-E-INTERR   CSPE Interrupt, Code Invalid (0) - The Console was interrupted to correct a CS/DRAM Parity Error, but when the interrupt code was read it was zero which is unassigned. (LO07-CL09 and EBE3)

?ECR-E-MBTERR   "ram id" Multi-Bit-Error, Uncorrectable - The syndrome that was calculated for RAM parity error did not identify a single bit in error. Therefore, the Console software assumes that the RAM had a multiple bit error and prints this message. This will result in a KAF. See Note 1.

?ECR-E-MUNREC   MCS MBox Not Recoverable - MBox single bit CS Parity Errors are correctable but not recoverable (see Chapter 2 - MBox Control Store Correction). This will result in a KAF.

?ECR-E-NOECCD   "ram id" No ECC Data in Table - The ECC data needed to correct the CS/DRAM Parity Error specified was not in the corresponding ECC Table. This will occur if:

                o ^C was typed during the INIT routine before all the ECC Tables were loaded.
                o The ECC Tables are not refreshed on a Console reboot.
                o The CS/DRAM address that was read (via SDB) was nonexistent.

                In all cases this will result in a KAF. See Note 1.

?ECR-E-PCFAIL   "ram id" Error Correction Attempted and Failed - The Console was unable to correct the CS/DRAM Parity Error after 5 attempts. This will result in a KAF.

?ECR-E-SYNGTR   "ram id" Syndrome > RAM Size, Uncorrectable - The syndrome generated as a result of the CS/DRAM Parity Error specified indicates that a bit beyond the size of the RAM was at fault.
(e.g., bit 22 in the IBox DRAM which is only 20 bits wide.) This will result in a KAF. See Note 1.

?ECR-E-SYNZRO   "ram id" Syndrome = 0, Transient, Continuing - Indicates that the VAX8600/8650 detected a transient error associated with the CS/DRAM specified. That is, a parity error was detected and latched in the appropriate CS/DRAM logic, but when the data latch was read (or the CS RAM re-read) the data was OK. There is, however, a very low possibility that a real multiple bit error occurred such that it produced a zero syndrome. The console will report this as a transient error and no further action will be taken by the console.

?ECR-E-UPCERR   "ram id" Can't Read Box Address - The correction routine was unable to read the micro-address associated with the error. This will result in a KAF. Most likely this is either an SDB failure or a failure in the addressing logic associated with the CS/DRAM specified.

Note

1. The Console Software reports "Good Data/Bad Data, Syndrome, and Address" for uncorrectable CS/DRAM Parity Errors.
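
The syndrome outcomes listed in Table 10 can be summarized in a short sketch. It is a hedged illustration only: how the Console actually maps a syndrome to a failing bit is not given here, so the lookup table `syndrome_to_bit`, the function name `classify_syndrome`, the order of the checks, and the "CORRECTABLE" result label are all assumptions made for the example. Only the message IDs (SYNZRO, MBTERR, SYNGTR) come from the table.

```python
def classify_syndrome(syndrome, ram_width_bits, syndrome_to_bit):
    """Classify a CS/DRAM parity-error syndrome.

    syndrome_to_bit is a hypothetical mapping from each single-bit
    syndrome value to the bit position it identifies.
    """
    if syndrome == 0:
        # Data re-read as good: reported as transient, console continues.
        return "SYNZRO"
    bit = syndrome_to_bit.get(syndrome)
    if bit is None:
        # No single bit identified: assumed to be a multiple bit error.
        return "MBTERR"
    if bit >= ram_width_bits:
        # Identified bit lies beyond the width of the RAM.
        return "SYNGTR"
    return "CORRECTABLE"
```

With a 20-bit-wide RAM, a syndrome that the (assumed) table maps to bit 22 would be classified as SYNGTR, matching the IBox DRAM example given for that message.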
Table 11  Environmental Monitor Module (EMM) Error Messages

Header          Message

?EMM-E-EMMACK   No TRANSPORT ACK from EMM

?EMM-E-EMMACL   EMM_LAT AC_LO Failed to Deassert

?EMM-E-EMMAFA   AIR FLOW FAULT PENDING <SHUTDOWN IMMINENT>

?EMM-E-EMMAFA   AIR FLOW FAULT PENDING <CAN'T POWER UP>

?EMM-E-EMMAFD   REGULATOR_A_OK Failed to Deassert

?EMM-E-EMMAFF   AIR FLOW Fault Failed to Deassert

?EMM-E-EMMANO   MODULE A Not OK on EMM Power-up

?EMM-E-EMMAOK   REGULATOR_A is Not OK

?EMM-E-EMMBFD   REGULATOR_B_OK Failed to Deassert

?EMM-E-EMMBOK   REGULATOR_B is Not OK

?EMM-E-EMMBUF   EMM has No Protocol Buffer

?EMM-E-EMMCAC   CONSOLE_CPU AC LOW FAILED TO DEASSERT

?EMM-E-EMMCDC   CONSOLE_DC LOW FAILED TO DEASSERT

?EMM-E-EMMCFD   REGULATOR_C_OK Failed to Deassert

?EMM-E-EMMCOK   REGULATOR_C is Not OK

?EMM-E-EMMCOL   Data Errors (Collisions) on EMM Bus

?EMM-E-EMMDCL   EMM_LAT DC_LO Failed to Deassert

?EMM-E-EMMDED   Console/EMM Communication Temporarily Suspended - Communication with the EMM has been suspended for 30 seconds due to excessive communication errors with the EMM or the EMM link (these errors are reported prior to the message stating that communications is suspended). While EMM communication is suspended the ALERT led will flash double-time (1/4 sec. on, 1/4 sec. off). After 30 seconds the console tries to re-establish communication with the EMM, however, failure to do so will not again be reported (but the led will continue to flash until the link is re-established).

?EMM-E-EMMDFD   REGULATOR_D_OK Failed to Deassert

?EMM-E-EMMDOK   REGULATOR_D is Not OK

?EMM-E-EMMEFD   REGULATOR_E_OK Failed to Deassert
?EMM-E-EMMEOK   REGULATOR_E is Not OK

?EMM-E-EMMFFD   REGULATOR_F_OK Failed to Deassert

?EMM-E-EMMFOK   REGULATOR_F is Not OK

?EMM-E-EMMHFD   REGULATOR_H_OK Failed to Deassert

?EMM-E-EMMHOK   REGULATOR_H is Not OK

?EMM-E-EMMI55   EMM 5.5 Interrupt Broken - replace EMM

?EMM-E-EMMI65   EMM Encountered Unexpected 6.5 Interrupt

?EMM-E-EMMINV   Invalid Exception Code from EMM

?EMM-E-EMMJFD   REGULATOR_J_OK Failed to Deassert

?EMM-E-EMMJOK   REGULATOR_J is Not OK

?EMM-E-EMMKAC   MOD K AC_LO is Asserted

?EMM-E-EMMKOK   REGULATOR_K is Not OK

?EMM-E-EMMLAC   MOD L AC_LO is Asserted

?EMM-E-EMMLOK   REGULATOR_L is Not OK

?EMM-E-EMMNEG   EMM Rejected Command Request Due to Present Conditions

?EMM-E-EMMPER   EMM Encountered RAM Parity Error

?EMM-E-EMMRES   No Response from EMM

?EMM-E-EMMRST   EMM Encountered Restart to PC 0

?EMM-E-EMMRZA   RED ZONE FAULT PENDING <SHUTDOWN IMMINENT>

?EMM-E-EMMRZA   RED ZONE FAULT PENDING <CAN'T POWER UP>

?EMM-E-EMMTRP   EMM Encountered Unexpected Trap Interrupt

?EMM-E-EMMUNK   EMM Encountered Unexpected Instruction Trap

?EMM-E-EMMURC   Unknown Restart Code in RTDREG

Table 12  Environmental Monitor Module (EMM) Fatal Messages

Header          Message

?EMM-F-EMMMTL   EMM Protocol Message Too Long (Software Problem)

?EMM-F-EMMTIP   EMM Transmission Already in Progress (Software Problem)

?EMM-F-EMMXTO   Console-to-EMM Protocol Message Transmit Timeout - Either the switch controlling BBU power to the TOY is off or there is a problem with the TOY chip on the Console module.

Table 13  Hexadecimal Debugger (HEX) Warning Messages

Header          Message

?HEX-W-ADRFOR   Address Field Out of Range - The micro-address specified exceeds the size of the CS or DRAM. See Note 1.

?HEX-W-ANFERR   Microaddress Not Found in File - The micro-address specified does not exist in the RAM File. (e.g., CSM is not included as part of the EBox.BPN file.) See Note 1.
?HEX-W-CLKRUN   Command Invalid with CPU Clock Running - The CPU clock must be stopped ("STOP CPU") before the command can be executed properly. See Note 1.

?HEX-W-DATFOR   Data Field Out of Range - The data specified exceeds the size of the CS or DRAM.

?HEX-W-INCHNO   Invalid Channel Number - The SDB Control Channel specified does not exist. See Note 1.

?HEX-W-INVRPT   Invalid Repeat Function - The command cannot be executed with the "Repeat" function. See Note 1.

?HEX-W-INVSID   Invalid ID or Name - The SDB ID or the SDB Name last entered does not exist. Make sure the correct CDF files are loaded for the CPU revision level ("SHOW CDF860.DAT/ASCII" or "SHOW CDF865.DAT/ASCII"). See Note 1.

Note

1. Use the Console Software "HELP" command or refer to the Console Software Spec for more details on command usage.

Table 14  Macro Control Program (MCP) Error Messages

Header          Message

?MCP-E-C40ICE   CSM040 Hung During CPU Initialization - CSM was unable to properly initialize itself. This indicates that there is an interaction problem between the EBox and one of the other boxes. If the problem persists run MHC and the Microdiagnostics to isolate the fault. If all else fails try single stepping through the CSM Overlay.

?MCP-E-CBADCK   CSM Sent Bad Checksum - The computed checksum did not match the checksum sent with the last CSM message. This could be a Dual Port RAM, CBus or EBox Microcode problem. Try re-initializing the CPU. If that doesn't work run the MHC diagnostic.

?MCP-E-CSMLOP   CSMs Console Loop Not Running - The EBox is either hung or not started. To clear this condition either start the CPU (if ^P was typed) or use the INIT/CPU command if the EBox is hung. See Note 1.

?MCP-E-NOCSMR   No Acknowledgment from CSM - CSM failed to respond to a command request. If this occurred during a INIT/CPU command chances are CSM is hung. See Note 1.

?MCP-E-NODATA   CSM Sent No Data Packet - The command protocol includes a data packet, but CSM failed to send a Data Packet when trying to execute a command. See Note 1.

?MCP-E-NODATW   No CSM Data Packet Received - CSM is most likely hung trying to perform the request. See Note 1.

?MCP-E-NORPKT   No Non-data Response Packet from CSM - See Note 2.

?MCP-E-PK2CNS   CSM Pkt2, Cntl Data Not Set - See Note 2.

?MCP-E-UNEXPD   CSM Sent Unexpected Data Packet - CSM either sent an extra Data Packet or a Data Packet that was not part of the command protocol. See Note 2.

?MCP-E-UNKCSM   Unknown CSM Packet Code - CSM sent some kind of an unexpected Response. See Note 2.

?MCP-E-XCCHKS   X Command Cmd_Check_Sum Failure - The Checksum calculated by MCP did not match the Checksum associated with the command portion of the X Command. See Note 3.

?MCP-E-XDCHKS   X Command Data_Check_Sum Failure - The Checksum calculated by MCP did not match the Checksum associated with the data portion of the X Command. See Note 3.

?MCP-E-XRTIMO   X Command Receiver Time_out - Once the X Command has established a link the console expects a byte no less than once a second. This message indicates that the X Command has not sent a byte during the last second. See Note 3.

Note

1. To determine where CSM is Hung:

   a. Type "STOP CPU"
   b. Type "MIC" - This will cause the current Microsequencer PCs to be typed on the terminal.
   c. Type "space bar" 10 more times - This causes a whole sequence of Microsequencer PCs to be typed out. This helps us to find out what the CPU thinks it's doing.
   d. Type "return" - This gets you out of MIC mode.
   e. Type "UNHANG" - The EHM may have built a Machine Check Stack Frame for this error condition.
   f. Type "@STKFRM" - This command will dump ESC:17 through 2F.
   g. Refer to the Console Software Spec to determine which CSM Overlay was loaded.

2. These are CSM protocol problems. First try re-loading the EBox Microcode.
If that doesn't work, run the Microdiagnostics.

3. The X Command is used to down-line load files from a host system to the T11. It was intended for use during the engineering debug phase of the project. It is not intended to be used by the field.

Table 15 Macro Control Program (MCP) Fatal Messages

Header          Message

?MCP-F-ABSDED   ABus Dead - Indicates that ABus Dead was detected while in Program I/O Mode. An ABus Dead condition is treated just like a Power Fail condition.

?MCP-F-INVCSM   Invalid CSM Overlay Number (Software Problem)

?MCP-F-PWRFAI   AC Power Failure - Indicates that an AC Input Unit (H7170) detected a loss of AC Power and sent the signal AC LO to the EMM. The EMM in turn sends EMM3 CPU AC LO to the I/O Adapters (SBIAs) and the Console. The I/O Adapters in turn generate SBAQ SBI FAIL which notifies the Nexus of the loss of AC power. This allows the Nexus to execute an orderly power fail shut down. The Console, after receiving EMM3 CPU AC LO, sends CLO9 CPU PF INTER to the VAX8600/8650. Then both the Console and the VAX8600/8650 begin executing the power down sequence. After losing AC power the system has approximately 10 milliseconds before the DC power becomes unstable. This allows the system to sweep cache and save the state of the CPU (i.e., GPRs etc.). If battery back up is enabled the system will be able to preserve the state of the arrays for approximately 10 minutes. Thus the system will be able to recover from short term losses of AC power.

?MCP-F-PWRFAL   Power Fail - Indicates that a Power Fail condition was detected while in Program I/O Mode.
Table 16 Macro Control Program (MCP) Information Messages

Header          Message

?MCP-I-BBUINV   Battery Backup Unit Invalid

?MCP-I-CPSRUN   CPU is Still Running - The early versions of Console software would automatically halt the VAX8600/8650 when ^P was typed. On later versions this was changed. The Console software no longer halts the VAX8600/8650 when ^P is typed. Instead it prints this informational message. You must type "HALT CPU" to stop the VAX8600/VAX8650.

?MCP-I-HDECOL   Hardware Not Up to Proper ECO Level - According to the System ID Register (SID) and KA860.REV/KA865.REV the files on the RL02 are not compatible with the system. Check KA860.REV/KA865.REV against the system ID.

?MCP-I-MCLDST   Aborting Redundant Cold-Start Attempt - A Cold Restart has been attempted and failed. Second and subsequent automatic Cold Restarts are aborted.

?MCP-I-MWRMST   Aborting Redundant Warm-Start Attempt - A Warm Restart has been attempted and failed. A second Warm Restart attempt will be aborted and a Cold Restart will be attempted (if enabled via the System Control Panel).

?MCP-I-LARPWR   Lost Array Refresh Power (warm start not possible) - Either the BBU is switched off or the 10 minute time limit for the Battery Back Up Unit to supply power to the array refresh circuit expired. A warm re-start is no longer possible. A Cold Restart will be attempted if enabled via the System Control Panel.

?MCP-I-NOPAMM   Command Invalid, PAMM Not Init'd, Do INIT/PAMM

?MCP-I-RPBBSY   RPB Restart-in-Progress Flag Set - The warm restart attempt failed. If the Restart Control Switch is in the correct position (RESTART BOOT) a cold restart will be attempted.

?MCP-I-RPBINV   RPB Invalid/Not Found - The RPB in memory is not valid. The warm restart must be aborted. If the Restart Control Switch is in the correct position (RESTART BOOT) a cold restart will be attempted.
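
The restart messages in Table 16 imply a simple fallback order: a warm restart is tried first, and a cold restart is attempted only when the warm restart cannot proceed and cold restarts are allowed. The Python sketch below illustrates that order. It is not Console software. The names RestartState and next_restart_action, the single cold_restart_enabled flag (standing in for the System Control Panel / Restart Control Switch setting), and the order in which the warm-restart conditions are checked are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RestartState:
    """Hypothetical snapshot of the conditions named in Table 16."""
    array_refresh_power_lost: bool = False      # True  -> ?MCP-I-LARPWR
    rpb_valid: bool = True                      # False -> ?MCP-I-RPBINV
    rpb_restart_in_progress: bool = False       # True  -> ?MCP-I-RPBBSY
    warm_restart_already_failed: bool = False   # True  -> ?MCP-I-MWRMST
    cold_restart_already_failed: bool = False   # True  -> ?MCP-I-MCLDST
    cold_restart_enabled: bool = True           # assumed: switch allows a cold restart


def next_restart_action(state: RestartState) -> Tuple[str, Optional[str]]:
    """Return (action, message); action is "warm", "cold", or "none"."""
    # Find the first condition that rules out a warm restart.
    # The check order is an assumption; Table 16 does not state one.
    if state.array_refresh_power_lost:
        blocker = "?MCP-I-LARPWR"
    elif not state.rpb_valid:
        blocker = "?MCP-I-RPBINV"
    elif state.rpb_restart_in_progress:
        blocker = "?MCP-I-RPBBSY"
    elif state.warm_restart_already_failed:
        blocker = "?MCP-I-MWRMST"
    else:
        return ("warm", None)

    # A warm restart is not possible; fall back to a cold restart if allowed.
    if state.cold_restart_already_failed:
        return ("none", "?MCP-I-MCLDST")
    if state.cold_restart_enabled:
        return ("cold", blocker)
    return ("none", blocker)


print(next_restart_action(RestartState(rpb_valid=False)))
```

The function returns the message that explains why the warm restart was skipped, so the caller can see both the chosen action and the reason for it.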
Table 17 Macro Control Program (MCP) Warning Messages

Header          Message

?MCP-W-ADROOR   Address Out of Range - The address specified is outside the range for the Examine/Deposit Commands. See Note 1.

?MCP-W-CPHUNG   CPU is Hung - Indicates that the VAX8600/8650 has stopped running for some reason. Usually this message is either preceded by or followed by a message that explains the reason that the CPU is Hung. For example this message will almost always precede a Keep Alive Fail condition when the Snap File mechanism is enabled. The only exception is when the CPU directs the Console (via a CSM Code of 1B) to Halt the system. There are a number of cases, however, when a second message explaining the reason for the HUNG CPU will not be printed. For example, if the Snap File mechanism is disabled when a KAF condition occurs no additional messages will be printed. The same kind of situation would occur if a WBus Parity Error was detected while the Console was in Console I/O mode.

                One way to approach this problem is to stop the CPU and single step (MIC) the EBox. Look for:

                1. One of the following UPCs:
                   a. UPC 20 - Indicates a parity error in both the A and B RAMs
                   b. UPC 21 - Indicates a double error condition
                   c. UPC 24 - Indicates a WBus Parity Error
                2. A tight microcode loop. Generally the microcode is looping waiting for some event to occur. Look in the listings to determine what the event is. Then try to determine why the event did not occur.
                3. An EBox or IBox Stall condition. Examine the SDB ESTALL and ISTALL Registers. If a stall condition exists try to determine if it is the cause of the HUNG condition.

?MCP-W-DATOOR   Data Out of Range - The Data specified exceeds 32 bits. See Note 1.

?MCP-W-INVIPR   Invalid IPR Number - The IPR "Number" argument is currently unassigned. See Note 1.
?MCP-W-INVRRW   Read or Write Not Allowed - The command specified is not appropriate because the IPR is either Read or Write only. See Note 1.

?MCP-W-XINCOM   X Command in .COM File Invalid - X Commands cannot be executed in a command file. See Note 1.

Note 1. Use the Console Software "HELP" command or refer to the Console Software Spec for more details on command usage.

Table 18 MCPSNP Messages (KAF Snap Shot Routine)

The KAF Routine prints the following banner message followed by a reason.

   "Attempting to save machine state due to:"

DOUBLE ESCRATCH PARITY ERROR (KAF Reason Code: 18) - The Error Handling Microcode (EHM) determined that there was a parity error in both copies of the EBox Scratch Pad RAMs. Most likely this condition occurred when the EHM was attempting to correct a GPR parity error by copying the good GPR to the bad GPR. This is a non-recoverable error condition. The EHM responded to this error condition by looping at EBox UPC 20 which in turn resulted in this KAF.

This is a sticky problem. There is a good chance that the GPR parity error will once again be detected when the KAF routine uses CSM to read the EBox Scratch Pad RAMs. If that is the case then the EScratch Record of the SNAP file will contain the contents of the EScratch up to the point where the parity error occurred. The remainder of that section will contain all ones (FFF...). In addition to being unable to copy some or all of the EScratch, the KAF routine will be unable to copy the CPU IPRs, the PAMM, the top 64 longwords on the Interrupt Stack, and the SBIA/Nexus Registers.

Probable Cause:

   Module       Probability   RAMs
   L0209/EDP    High          E500, E501, E502, E503 (GPRs)
   L0219/EBE    Low

MACHINE DOUBLE ERROR (KAF Reason Code: 19) - The Error Handling Microcode (EHM) was in the process of handling an error when a second EBox related error was detected. This is a non-recoverable error condition. The EHM responded to this error condition by looping at EBox UPC 21 which, in turn, resulted in this KAF.

Approach: Contact the RDC and request that they analyze the SNAP file. Meanwhile, if possible use VSRBLD to translate the SNAP file. Otherwise type "SHOW SNAP1.DAT" and translate the SNAP file manually. The ESC should contain a partial Machine Check Stack Frame for the first error. However, beware that the EHM overwrote the following EScratch locations with status from the second error.

   ESC: 12 (Trap Vector)
   ESC: 15 (EBCS)
   ESC: 19 (VMQ)
   ESC: 2F (PSL)

WBUS PARITY ERROR - The EBox updated all copies of the GPRs with bad parity. This is a non-recoverable error condition. The EHM responded to this error condition by looping at EBox UPC 24 which in turn resulted in this KAF. The symptoms indicate that the parity calculated on the WBus data did not match the parity calculated on the WReg.

Most Probable Cause:

   Module       Probability
   L0209/EDP    High
   L0219/EBE    High
   L0206/IDP    Low
   L0212/FBA    Low
   L0223/FTM    Low (FBox Terminator)

CPU ERROR HALT - The KAF Condition was initiated by the Console Support Microcode (CSM). The specific reason for the KAF is contained in the Master Header Record Byte 13 (CSM Status Word Entry Code). See list below.

0 = CSM could not be forced to run by the console program after the KAF. The EBox microcode may have been corrupted; re-initialize the system. If that doesn't work run the Micro-diagnostics.

4 = Interrupt Stack not valid.
The CPU was processing an interrupt or an exception, but when it attempted to push "state" on the Interrupt Stack it discovered that the Interrupt Stack was mapped "NO ACCESS" or "NOT VALID". This generally indicates that the CPU got into a loop handling an interrupt or exception and as a result the interrupt stack overflowed to a page mapped "NO ACCESS" or "NOT VALID".

Translate the SNAP file. Also look at the contents of the [...]. If the CPU was in an interrupt loop you may get some idea of the source. Finally, look at [...] and the Machine Check section of the EScratch Record to determine if the CPU was handling an error.

5 = Non-EBox or "VMS ENTERED" double error. There are two ways this condition can occur:

   a. If the EHM was processing an error and a second (non-EBox) error was detected, the EHM will build a Stack Frame for the second error and then call CSM with a code of 5 in CSM.STATUS. In this case you will find a [...] will identify the port that [...] the second error.

   b. If the VMS Machine Check Handler was handling an error and a second error was detected, the EHM will call CSM with a code of 5 in CSM.STATUS. In this case you will find the first Machine Check Stack Frame on the interrupt stack, and the second Machine Check Stack Frame in the EScratch.

HALT Instruction in Kernel Mode. The processor executed a HALT instruction while in Kernel Mode. Examine the Processor Scratch Pad [...] 2E [...]. Then examine the SDB EBOX OPCODE Register to determine if VMS actually executed a halt. If so use the [...] listings to look up the reason that VMS Halted.

SCB vector with <1:0> = 3. The Vector Code Field in the System Control Block <1:0> was equal to 3 which is a reserved code. Either the SCB was overwritten or a wrong vector address was generated.

SCB vector with <1:0> = 2.
The Vector Code Field in the System Control Block <1:0> was equal to 2 which means: service this event in Writeable Control Store (WCS). However the WCS either does not exist or was not loaded. The result of the operation is a HALT. Again, either the SCB was overwritten or a wrong vector address was generated.

Pending error HALT. The CPU was in the process of handling an error condition when ^P was typed on the console. The contents of the Interrupt Stack, the EHSR and the Machine Check section of the EScratch may provide some idea of the type of error the CPU was processing when ^P was typed.

CHMx with IS = 1. The CPU executed a Change Mode instruction when PSL <26> (Interrupt Stack) was set.

CHMx vector <1:0> not 0. The CPU executed a Change Mode instruction and the SCB Vector Code <1:0> was not equal to zero.

UNCORRECTABLE CS PARITY ERROR - The Console was unable to correct a Control Store or Dispatch RAM for one of the following reasons:

   INTERR            CSPE Interrupt, Code Invalid (0)
   MBTERR "ram id"   Multi-bit-error, Uncorrectable
   MUNREC            MCS MBox Not Recoverable
   NOECCD "ram id"   No ECC Data in Table
   PCFAIL "ram id"   Correction Attempted and Failed
   SYNGTR "ram id"   Syndrome > RAMSize, Uncorrectable
   SYNZRO "ram id"   Syndrome = 0, Uncorrectable
   UPCERR "ram id"   Can't Read Box Address

Refer to Table 10 for a description of uncorrectable Control Store and Dispatch parity errors. Examine the contents of [...] to determine the RAM, Address, and Syndrome. Then refer to the Callout Tables in the section on Manual Stack Frame Analysis to identify the failing RAM.

POWER SYSTEM FAILURE (DC LOW) - During a normal power failure AC Low will precede DC Low by approximately 10 milliseconds. This allows VMS enough time to sweep the cache and save the state of the GPRs and CPU Registers, etc. If, however [...]

UNKNOWN MACHINE HANG
- The KAF timer expired and the KAF Routine was unable to read the EBox UPC and CSM Status word to determine the exact cause of the failure. Most likely the system is stalled (ESTALL or ISTALL) or hung in a micro loop waiting for some event to occur.

Approach: Contact the RDC and request that they analyze the SNAP file. Meanwhile, if possible use VSRBLD to translate the SNAP file. Otherwise translate the SNAP file manually. The ESC may contain a partial Machine Check Stack Frame.

Additional MCPSNP.LST (KAF Routine) Messages

Both SNAP Files Still Valid - The KAF Routine was called to capture (Snapshot) the state of the system but both SNAP Files were still valid. That is, they still contain status captured during the previous two KAF Conditions. As a result the Console will not capture the state of the machine. Use the Console command "SET SNAP INVALID" to invalidate both SNAP Files.

SNAPx.DAT Created - Currently there are two Snap file names used by the console: SNAP1.DAT and SNAP2.DAT. This message tells you the name of the Snap File that the KAF Routine created as a result of the KAF Condition.

CHAPTER 5
THE VMS SYSTEM EVENT FILE

OVERVIEW

The VMS Operating System maintains a System Event File called ERRLOG.SYS. The file is located in directory SYS$SYSROOT:[SYSERR] and is used to record certain events that occur during system operation. The types of events that are recorded in the event file are listed in Table 1.
Table 1 System Event File Entry Type Definitions

   Entry Code   Description

   01           Device Error
   02           Machine Check
   04           Bus Error
   05           SBI Alert
   06           Soft ECC Error
   07           Asynchronous Write Error
   08           Hard ECC Error
   09           11/780 Unibus Adapter error
   10           11/750 Fault Through SBI Vector
   11           11/730 Unibus Error
   12           11/780 Massbus Adapter
   13           KA86 SBIA Error
   14           KA86 CRD Log
   15           KA86 Environmental Monitor
   16           KA86 Processor Error Halt
   17           KA86 Console Reboot
   32           Cold Start (ie: System Boot)
   35           New File Created
   36           Warm Start (ie: System Power Recovery)
   37           Crash Re-start
   38           Time Stamp Entry
   39           System Service Message
   40           System Bugcheck
   41           Operator Message
   42           Network Message
   64           Volume Mount
   65           Volume Dismount
   96           Device Timeout
   97           Undefined Interrupt
   98           Asynchronous Device Attention
   99           Software Parameters
   100          Logged Message
   101          Logged MSCP Message
   112          User Bugcheck
   273          Unknown Entry

Each time one of these events occurs, the normal operation of VMS is interrupted and a special routine is called to handle the event. The routine requests a System Event Buffer and then gathers predefined information about the event (e.g., system status, hardware and software registers, etc.) and puts it in the buffer. Once the buffer is built the routine queues a request to append the buffer to the System Event File. When the queue is processed the buffer is appended to SYS$SYSROOT:[SYSERR]ERRLOG.SYS. This process takes place anytime one of the events listed in Table 1 occurs.

Two programs (ANALYZE/ERRORLOG and RETRIEVE, a Spear Library function) are available to translate the contents of the System Event File into ASCII reports. Both of these programs use the Error Log Formatter (ERF) to translate the entries in the Event File.
Therefore, regardless of which program you use, the format of the translated entries will be the same. The main differences between the two programs are the command syntax, the selection criteria, and the format of the Summary reports they produce. In addition to translating system event file entries, Spear is capable of analyzing the contents of the event file and calculating system availability.

ANALYZE/ERRORLOG

ANALYZE/ERRORLOG uses a non-interactive command syntax. That is, the Command, Qualifiers, and Arguments are entered in a single string. The Qualifiers allow you to select specific entries from a binary System Event File and either produce a separate binary event file that contains only those entries, or translate the entries and produce an ASCII Report. For a complete description of this utility, including more information about the ANALYZE/ERRORLOG command and its qualifiers, see the VAX/VMS Utilities Reference Volume.

COMMAND SYNTAX:

   ANALYZE/ERROR_LOG [/qualifier[=argument][,...]] [file-spec[,...]]<cr>

   o  The base command can be abbreviated to ANA/ERR.
   o  All Qualifiers are preceded by a slash.
   o  Multiple Arguments to a Qualifier are separated by a comma.
   o  In some cases special characters such as the equal sign, parentheses, and colon are required. If the qualifier requires special characters they will appear in the syntax examples shown in Table 2.

Table 2 summarizes the qualifiers and defaults associated with ANALYZE/ERROR_LOG. The full qualifier is spelled out in the left column. In the syntax example to the right of the qualifier, the qualifier is abbreviated to its most common form. The effect of the qualifier is described below the syntax example.

Table 2 ANALYZE/ERROR_LOG Command Qualifiers and Defaults

ANALYZE/ERROR_LOG   ANA/ERR<cr>

   Translate the entire system event file SYS$SYSROOT:[SYSERR]ERRLOG.SYS and output a full ASCII report of each entry on the terminal. This is the default case.
   The defaults are as follows:

   INPUT FILE - The default input file spec is SYS$SYSROOT:[SYSERR]ERRLOG.SYS.
   OUTPUT     - The default output is an ASCII report which is sent to SYS$OUTPUT. The system default for SYS$OUTPUT is your terminal.
   QUALIFIERS - The default qualifiers are: /FULL /ENTRY=(START:1,END:EOF)

ANALYZE/ERROR_LOG   ANA/ERR ERRLOG.OLD;5<cr>

   Translate the entire system event file specified (ERRLOG.OLD;5) and output a full ASCII report of each entry on SYS$OUTPUT. With the exception of the input file specification the defaults for this case are the same as above. Any binary (untranslated) system event file may be specified as input.

/BEFORE             ANA/ERR/BEF=16-AUG-85-10:35 ERRLOG.OLD;5<cr>
                    ANA/ERR/BEF=-3-:12:30 ERRLOG.OLD;5<cr>

   Select only those entries dated earlier than the "date-time" specified. The qualifier accepts absolute time (beginning August 16, 1985 at 10:30), delta time (beginning 2 days, 11 hours, and 30 minutes ago), or a combination of both. For further details on specifying times refer to Section 2.5 in the VAX/VMS DCL Dictionary.

/BINARY             ANA/ERR/INCLUDE=(DISKS)/BIN=FS:DISK.ERRORS<cr>

   Do not translate the selected entries. Instead write them in the directory and file specified. If no directory is specified use the user's default directory. If no file type is specified, use .DAT as the file type. You must supply a file name. If you omit the directory it will default to the directory you are using. If you omit the file type it will default to: DAT.

   The following qualifiers should not be used in conjunction with the /BINARY qualifier:

      /BRIEF  /FULL  /OUTPUT  /SUMMARY  /REGISTER_DUMP

/BRIEF              ANA/ERR/BRI ERRLOG.OLD;5<cr>

   Do not generate a full report for each selected entry. Instead generate an abbreviated report containing only key information about each entry.

/ENTRY              ANA/ERR/ENT=(START:12,END:29)<cr>

   Select only the Entry Numbers specified. If either the START or END argument is omitted, default to START:1,END:EOF.

/EXCLUDE            ANA/ERR/EXC=(MTA0,DRA5) ERRLOG.OLD;5<cr>

   Do not select any entries generated for the Device Class, Device Name, or Entry Type specified. The acceptable Device and Entry keywords are listed under the /INCLUDE qualifier.

/FULL               ANA/ERR/INCLUDE=(DISKS)/FULL ERRLOG.OLD;5<CR>

   Generate a full ASCII report for the entries specified. This is the default report format and normally does not need to be specified as part of the command string.

   See Examples: 1 through 15

/NOFULL             ANA/ERR/STATISTICS/NOFULL ERRLOG.OLD;5<CR>

   Do not generate a full ASCII report for the entries specified. This Qualifier is normally used when you only want a special ASCII report such as a Summary or Statistical report. If you don't specify NOFULL, a full translation of the selected entries will precede the Summary or Statistical Report.

/INCLUDE            ANA/ERR/INC=(MACHINE_CHECKS,BUGCHECKS)<cr>

   Select only those entries generated for the Device Class, Device Name, or Entry Type specified. The acceptable Device and Entry keywords are listed below.

   Device Class Keywords

      BUSES                 - All Bus related Entries
      DISKS                 - All Disk Related Entries
      REALTIME              - All Realtime Related Entries
      SYNC_COMMUNICATIONS   - All Synchronous Line Entries
      TAPES                 - All Tape related Entries

   Physical Device Name Constructs

      DB                    - An entire group of devices
      DB, DR, XF            - A list of device groups
      DBA1                  - A specific device/unit number
      DBA1,HSC1$DUA1,DYA0   - A list of devices

   Entry Types

      ATTENTIONS            - Device attention entries
      BUGCHECKS             - Bugcheck entries
      CONTROL_ENTRIES       - Control Entries
      CPU_ENTRIES           - CPU Related Entries
      DEVICE_ERRORS         - Device Error Entries
      MACHINE_CHECKS        - Machine Check Entries
      MEMORY                - Memory Error Entries
      TIMEOUTS              - Device Timeout Entries
      UNKNOWN_ENTRIES       - All Entries that had either an unknown entry type or an unknown device type/class.
      UNSOLICITED_MSCP      - Unsolicited MSCP Entries
      VOLUME_CHANGES        - Volume Mount and Dismount Entries

/LOG                ANA/ERR/LOG ERRLOG.OLD;5<cr>

   Send a message to SYS$OUTPUT stating the number of entries that were selected and rejected for each input file. Refer to the /REJECT qualifier for an explanation of rejected entries.

   See Example: 16

/OUTPUT             ANA/ERR/OUT=ERROR_LOG.LST ERRLOG.OLD;5<cr>

   Do not print the ASCII Report. Instead save the report in the file specified. If no file is specified write the report into xxxx.LST (where xxxx is the name of the input file).

/REGISTER_DUMP      ANA/ERR/INCLUDE=(CPU)/REG ERRLOG.OLD;5<CR>

   Do not use the specified (Brief/Full) format for translating Memory, Device Error, and Device Timeout entries. Instead select only the register information from those entries and translate that information into hexadecimal longword (cryptic) format. Use the specified format for translating all other types of selected entries. This qualifier requires that the INCLUDE qualifier be part of the command string. Also, regardless of whether or not they were specified by the INCLUDE Qualifier, all Memory, Device Error, and Device Timeout entries will be selected and translated in cryptic format.

   See Example: 17

/REJECT             ANA/ERR/INCLUDE=(MTA0)/REJ=ERRORS.BIN ERRLOG.OLD;5<CR>

   Put all rejected entries in the file specified. Do not translate the entries; write them in binary format. If no file is specified write the entries into xxxx.REJ (where xxxx is the name of the input file).

   Rejected entries consist of all entries that were not specifically selected in the command string. That is, those entries that were outside the time window specified by either the /SINCE or /BEFORE arguments; those entries that were not within the range specified by the /ENTRY(START: ,END: ) arguments; those entries that did not match the /INCLUDE arguments; and those entries that were specifically rejected by the /EXCLUDE arguments.
/SID_REGISTER       ANA/ERR/SID=%X0405F09E ERRLOG.OLD;5<CR>

   Select only those entries that were reported by the CPU associated with the System ID specified.

/SINCE              ANA/ERR/SIN=16-AUG-85-10:35 ERRLOG.OLD;5<cr>
                    ANA/ERR/SIN=-3-:12:30 ERRLOG.OLD;5<cr>

   Select only those entries that occurred on or after the date and time specified. You can specify an absolute time (beginning August 16, 1985 at 10:30), a delta time (beginning 2 days, 11 hours, and 30 minutes ago), or a combination of absolute and delta times. For further details on specifying times refer to Section 2.5 in the VAX/VMS DCL Dictionary.

/STATISTICS         ANA/ERR/NOFULL/STAT ERRLOG.OLD;5<cr>

   Generate a statistical report that states the CPU Time used and the number of page faults, buffered I/O, and direct I/O that occurred during the execution of the ANALYZE/ERROR_LOG command, and append it to the end of the ASCII report.

   See Example: 18

/SUMMARY            ANA/ERR/NOFULL/SUM=(DEV,MEM) ERRLOG.OLD;5<cr>

   Generate a summary report for each of the report types specified by the keyword and append the report(s) to the end of the ASCII report. If no keywords are supplied, generate a full set of summary reports.

   The following is a list of the Summary Keywords and the type of report they will generate.

      Keyword      Meaning

      DEVICE     - Include the Device Rollup section in the report.
      ENTRY      - Include the Summary of Entries Logged section in the report.
      HISTOGRAM  - Include the Processed Entries Hour of Day Histogram in the report.
      MEMORY     - Include the Summary of Memory Errors section in the report.
      VOLUME     - Include the Volume Label section in the report.

   See Examples: 19 through 22

SPEAR

In contrast to ANALYZE/ERRORLOG, Spear uses an interactive command syntax. The user is prompted for arguments and qualifiers. In addition to interactive prompting, Spear supports an extensive help facility as well as a built-in tutorial function called INSTRUCT. Because of this online documentation it is not necessary to document the Spear dialogue in this manual.
Instead you are directed to run Spear, review the Instruct package and then use the help facility as necessary.

EXAMPLES

The following examples represent sample reports produced by the Error Record Formatter (ERF). These reports have been included so that you will have some idea of the type of information that can be extracted from System Event files using either ANALYZE/ERROR_LOG or SPEAR.

The following is a list of the examples and the corresponding Entry Types:

   Example  1: (Entry Type 002) Machine Check
   Example  2: (Entry Type 006) Soft ECC Error
   Example  3: (Entry Type 013) KA86 SBIA Error
   Example  4: (Entry Type 015) KA86 Environmental Monitor
   Example  5: (Entry Type 016) KA86 Processor Halt
   Example  6: (Entry Type 017) KA86 Console Reboot
   Example  7: (Entry Type 032) Cold Start
   Example  8: (Entry Type 037) Crash Re-start
   Example  9: (Entry Type 040) System Bugcheck
   Example 10: (Entry Type 096) Device Timeout
   Example 11: (Entry Type 098) Asynchronous Device Attention
   Example 12: (Entry Type 273) Unknown Entry
   Example 13: ANALYZE/ERROR_LOG/LOG Report Format
   Example 14: ANALYZE/ERROR_LOG/REGISTER_DUMP Report Format
   Example 15: ANALYZE/ERROR_LOG/STATISTICS Report Format
   Example 16: ANALYZE/ERROR_LOG/SUMMARIZE=(DEVICE) Report Format
   Example 17: ANALYZE/ERROR_LOG/SUMMARIZE=(VOLUME) Report Format
   Example 18: ANALYZE/ERROR_LOG/SUMMARIZE=(HISTOGRAM) Report Format
   Example 19: ANALYZE/ERROR_LOG/SUMMARIZE=(ENTRY) Report Format

Example 1: Machine Check (Entry Type 002)

VAX/VMS        SYSTEM ERROR REPORT        COMPILED  6-SEP-1985 16:46
                                                             PAGE 1.

**************************** ENTRY 1. ****************************

ERROR SEQUENCE 43.                        LOGGED ON SID 0405F270

MACHINE CHECK, 2-JUL-1985 17:34:44.00
KA86  REV# 5.  SERIAL# 624.  MFG PLANT 15.

EHMSTS
41001803 VMS ERROR CODE = IBOX MICRO TRAP VECTOR = 18 (X) IBOX SP CORR EHM EVMQSAV ENTERED 0008073D | | VIRTUAL ADDRESS FOR EBOX PORT __ REQUESTS EBCS - 00002000 IBOX EDPSR 00000000 CSLINT 00606E1F ERR C BUS ADDRESS = 1lF (X) C BUS DATA = 6E (X) INTERRUPT PRIORITY REQUEST = I/0 ADAPTER = 3. IBESR 0. 00806000 UOP SEL UTPR = IBOX <2:0> ENABLE ETRAP IAMUX PARITY EBXWD1 REGISTER SELECT = FORK(IB PORT, IBOX ERR) ERROR 00000051 TOP OF "SP STACK" __ CONTENT IS ONE OF THE LAST _ LONGWORDS WRITTEN TO MBOX EBXWD2 00A00040 TOP OF ~ VASAV : "SP STACK" 00011B04 - VIRTUAL ADDRESS __ PORT 0008074E - PRE-FETCH 0008073E PC OF PORT REQUEST TO OF INSTRUCTION _ EXECUTION AND ISASAV FETCH FOR OPERAND AND RESULT VIRTUAL ADDRESS ESASAV FOR OP REQUEST ADDRESS — CALCULATION VIBASAV MINUS ONE __ CONTENT IS ONE OF THE LAST LONGWORDS WRITTEN TO MBOX DELIVERY NEXT FILL IBUF IBUFFER DURING RESULT EBOX STORAGE 00080742 PC OF INSTRUCTION WHICH VA _CALCULATION UNIT IS DOING ADDRESS _ CALCULATION OR OPERAND PRE-FETCH _ OR IS PASSING OPERAND DATA 5-11 "THE VMS SYSTEM EVENT FILE 00080742 CPC PC OF INSTRUCTION _DECODE UNIT ~ MSTATI IN 84004000 HIT BLOCK ABUS ADAPTER = 0. WORD COUNT = 0. CYCLE TYPE = READ REGISTER DEST CP 00000F00 MSTAT2 = EBOX DIAGNOSTIC STATUS FROM SBIA RD COM/MSK RD DAT L/S <3:0> <1:0> = F = 0 (X) (X) PAMM DATA = ARRAY #0.,SLOT #1. MDECC 00060400 MERG 00000100 CSHCTL 00001003 (* DATA NOT VALID *) MEMORY CACHE CACHE MANAGEMENT 0 1 ENABLE ENABLE ENABLE 0000007C MEAR PHYSICAL ADDRESS IN AT TIME OF ERROR = MEDR 0000001F FBXERR FFFFFFFF CSES FFFFFFFF ERROR PC 00080742 ERROR 03C00028 PA LATCH 0000007C DATA WORD USED DURING ERROR (* DATA NOT VALID *) PSL (* DATA NOT VALID *) N-BIT | INTEGER OVERFLOW TRAP ENABLE INTERRUPT PRIORITY LEVEL = PREVIOUS MODE = USER CURRENT MODE IOA ES 00000000 (* 5-12 USER DATA NOT VALID *) 00. THE VMS Example 2: Soft ECC Error VAX/VMS (Entry Type SEQUENCE SYSTEM ERROR REPORT CRD ERROR RATE - TOTAL CORRECTED DATA CORRECTED 1. ERROR COMPILED 1. 
10-SEP-1985 CRD LOGGING ERRORS MDECC REV# LOGGED FILE 1-0CT-1985 08:57 **************************** LOGGED KA86 HIGH | 2009. CORRECTED MEMORY ERROR EVENT 006) khkkhhhkhkhkhkhkhkhkhkhkhkhhkkhkhkhkhkhhkhkihkh ENTRY ERROR SYSTEM ON SID O4FFFFFF 17:15:19.03 255. | SERIAL# 4095. MFG PLANT 15. DISABLED FOR THIS ENTRY 9. 00260000 SYNDROME | DATA MEAR = CORRECTED SINGLE BIT CHECK BIT CO. ERROR 3FFFFFFC | PHYSICAL : MSTATI | ADDRESS PA LATCH AT TIME OF ERROR = 3FFFFFFC C3000000 ABUS ADAPTER = WORD COUNT = CYCLE TYPE = | MSTAT2 IN DEST 09000001 CP = 0. 3. NOP IBF (LOAD IBTP FROM MCC) | MBOX LOCK PAMM DATA = ARRAY #0.,SLOT #1. SMUS. CORRECTED ERROR MDECC | 2. 00261400 SYNDROME = CORRECTED DATA BIT #l. DATA MEAR MSTATI1 SINGLE BIT ERROR 012AFCO00 PHYSICAL ADDRESS IN AT TIME = ANY REFILL TAG MISS OF ERROR PA 64006006 CO BLOCK ABUS WORD CYCLE DEST 5-13 HIT ADAPTER = 0. COUNT = 0. TYPE CP = = OP CP REFILL FETCH LATCH 012AFCO00 THE VMS SYSTEM EVENT MSTAT2 FILE 00040F00 DIAGNOSTIC CORRECTED ERROR STATUS ~— RD COM/MSK RD DAT L/S PAMM DATA = FROM <3:0> SBIA = F <1:0> = ARRAY (X) 0 (X) #4.,SLOT #5. 3. MDECC 00261400 SYNDROME DATA = CORRECTED SINGLE BIT DATA BIT #1. ERROR 01297400 MEAR PHYSICAL AT TIME MSTAT1 ADDRESS OF IN ERROR = PA LATCH 01297400 64006002 ANY REFILL CO0 TAG MISS BLOCK ABUS MSTAT2 HIT ADAPTER = 0. WORD COUNT 0. CYCLE TYPE DEST CP = OP CP REFILL FETCH 00044F00 DIAGNOSTIC STATUS RD COM/MSK RD DAT L/S ~ ABUS BAD PAMM DATA = FROM SBIA <3:0> = F (X) <1:0> = 0 (X) DATA ARRAY CODE #4.,SLOT #5. ro* CORRECTED ERROR MDECC 4. 00261400 SYNDROME DATA 'MEAR MSTAT1 = CORRECTED SINGLE BIT DATA BIT ERROR 01297400 PHYSICAL ADDRESS IN AT = TIME OF ERROR PA LATCH 01297400 64006002 ANY REFILL CO0 TAG MISS BLOCK HIT ABUS ADAPTER = WORD COUNT CYCLE DEST MSTAT?2 TYPE CP = = 0. = CP OP 0. REFILL FETCH 00040FO00 DIAGNOSTIC RD 5-14 STATUS COM/MSK FROM <3:0> ~ RD DAT L/S <1:0> PAMM DATA = ARRAY SBIA = F (X) = (X) 0 #4.,SLOT #5. ¢#l. 
THE VMS SYSTEM EVENT FILE Example 3: KA86 SBIA Error (Entry Typev013) VAX/VMS ~ SYSTEM ERROR REPORT COMPILED 6-SEP-1985 | 12:55 PAGE kkkhkkkhkhkkhhkhkkhkhkhkkkkkkxk* ERROR SEQUENCE SBIA ERROR ENTRY 1, *Akkrk kAR kAR 66. ERROR PC ERROR PSL 19:17:12.92 SERIAL# 5. 624. MFG PLANT 15. 80008B1F 00000000 * INTERRUPT PREVIOUS PRIORITY MODE CURRENT MODE IOA ADDRESS DMAI CMD/ADDRS 80029200 403C747E DMAI '0000000E ID RRARRAR AR R Ak k LOGGED ON SID 0405F270 2-JUL-1985 REV# KA86 LEVEL = KERNEL = KERNEL = 00, (* DATA NOT VALID *) (* DATA NOT VALID *) DMAA CMD/ADDRS 18001800 DMAA 00000010 ID DMAB CMD/ADDRS 103C747F DMAB 0000000E ID DMAC CMD/ADDRS BO3EOQ9FC DMAC 0000000E ID IOA DC 00000000 IOA ES IOA CS 1C000000 EE000000 | (* DATA NOT VALID *). (* DATA NOT VALID *) (* DATA NOT VALID *) (* DATA NOT VALID *) (* DATA NOT VALID *) (* DATA NOT VALID *) CPU IOA CF 01000010 TR SELECT = 2. ENABLE SBI CYCLES IN ENABLE SBI CYCLES OUT MASTER INTERRUPT | | SOFTWARE ENABLE | REQUIRED SBI. REV = 0. SBI 16M OF MEMORY ADDRESSABLE SBIA FS 040F0000 FAULT SBI | SBIA SC SILO LOCK FAULT FAULT INTERRUPT FAULT LATCH TRANSMITTER ENABLE DURING FAULT 00000000 COUNT FIELD = 0. COMPARE TAG = 0. COMPARE CMD/MSK = SBIA MT SBIA ER SBIA TA 1. 0. 00000000 00000000 0802000E (* DATA NOT VALID *) 5-15 (ABUS) THE SBI VMS SYSTEM SILO EVENT LOCKED, FILE DETAILED SUMMARY 00000000 1C000000 VALID READ DATA ID = 0. 00000002 TR 1. ACTIVE 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ADAPTER TR$ 3. "DW" CSR 00000028 ADAPTER ADAPTER TR$¢ CNFGR IS UBA 0. 14. 20180038 ADAPTER IS "CI" READ DATA TIMEOUT COMMAND TRANSMIT TIMEOUT UNEXPECTED READ DATA FAULT 5-16 THE VMS SYSTEM EVENT FILE Example 4: KA86 Environmental Monitor VAX /VMS (Entry Type SYSTEM ERROR REPORT 015) COMPILED 9-SEP-1985 | Akkkhhkhhkhhkkhhhkhkkhksxkhkdd® ERROR SEQUENCE EMM EXCEPTION EXCEPTION 9-JUL-1985 14:37:41.01 KA86 5. *kkhkhhhhhhhhhkhhhhhhkdkhhhhhk REV# SERIAL# TEMPERATURE, 158. 
MFG PLANT THE TEMPERATURE ENTRY R IS NOW 15. IN LOGGED ON 9-JUL-1985 14:37:44.43 REV# 5. SERIAL# IN T1 TEMPERATURE, 158. | MFG PLANT THE TEMPERATURE 5-17 YELLOW ZONE e T 9414. KA86 STATUS CHANGE 1, 1. LOGGED ON SID 0405F09E khkhkRhRkAXRRAARRXRARARRAAXE EMM DNTRY 9413. STATUS CHANGE IN T1 ERROR SEQUENCE 08:30 PAGE SID 15. IS NOW NORMAL 0405F09E R THE VMS Example SYSTEM 5: EVENT KA86 FILE - Processor Halt VAX/VMS SYSTEM (Entry Type 016) ERROR REPORT COMPILED 9-SEP-1985 . kkkkkkkkkkkkkkkkkkkkkkkkkkk*k ERROR SEQUENCE KAF SNAPSHOT ENTRY 12-JUL-1985 REV# 09:48:27.50 5. SERIAL# kkkkkkkhkkkkkkkkkkkkkkkkkkkx* SNAPSHOT REV# SERIAL# kkkkkkkhkkkkhkkkkkhkhkkkkkkkx* KAF SNAPSHOT 9, REV# | hkkkkhkkhkhkkhhkhhkkrkhhhhhhhhohk MFG PLANT 12-JUL-1985 3, SID . 0405F270 | 15. - 09:48:33.68 KRARRRkhRRRIARRKARKRAK AR KA KKK | 12-JUL-1985 KA86 — | 624. ENTRY 1116. — 15. 09:48:38.06 5. 0405F270 - 12-JUL-1985 09:48:22.53 ENTRY SYSSSYSROOT: [SYSERR] ERRSNAP.LOG; 26 - ERROR SEQUENCE MFG PLANT LOGGED ON 12-JUL-1985 SID | 624. 1058. KA86 1. *kkkkkhkhkhhkhhkhkhhhhhkrkhkhhhhhk LOGGED ON SYSSSYSROOT: [SYSERR] ERRSNAP,LOG; 25 KAF 1, 1053. KA86 ERROR SEQUENCE 08:52 PAGE LOGGED ON SID 0405F270 o 15:47:01.59 5. SERIAL# SYS$SYSROOT: [SYSERR] ERRSNAP,.LOG; 27 624. MFG PLANT 12-JUL-1985 15. 15:46:55.83 L, THE Example 6: KA86 Console Reboot VAX /VMS | VMS SYSTEM EVENT FILE (Entry Type 017) SYSTEM ERROR REPORT COMPILED | kkkkkhkkhhhkhhhkhhkhhkkhkkhtx** ENTRY 1, RERAA R AR AR AR R AR R ERROR SEQUENCE 82. 7: Cold Start VAX/VMS '~ 4-FEB-1985 KA86 REV# (Entry Type 16:30:46.62 255. SERIAL# 4095. MFG PLANT 15. 032) SYSTEM ERROR REPORT COMPILED 9-SEP-1985 | ENTRY l‘ 0. SYSTEM START-UP TIME OF 09:00 PAGE khkkhkhkhkhkkhkhhhhkhkhhhkhkhkhhhhkkikhkihk ERROR SEQUENCE R AR AR A AN LOGGED ON SID O04FFFFFF CONSOLE REBOOT SUCCESS Example 6-SEP-1985 12:58 PAGE 1. DAY CLOCK 1. ***********************t**** LOGGED ON SID 0405F24F 10-JUL-1985 KA86 REV# 15:56:51.62 5. 72306E69 SERIAL# 591. MFG PLANT 15. 
THE VMS SYSTEM EVENT Example 8: FILE Crash Re-start (Entry Type 037) VAX/VMS - COMPILED 9-SEP-1985 09:08 SYSTEM ERROR REPORT | **************"************** ENTRY ERROR SEQUENCE 6906. | | PAGE 1‘ **************************** LOGGED ON SID 0405F09E | 24-JUN-1985 21:15:03.51 FATAL BUGCHECK KA86 | REV# 5. 1. SERIAL# 158. | * MFG PLANT 15. OPERATOR, Operator requested system shutdown kkkkhhkhkkhhhhrhkkkkkkkkkkkx* ENTRY ERROR SEQUENCE 8075. 2, khRkkhhRRKARRAARAhhhhkkhkkkk LOGGED ON SID 0405F09E | 1-JUL-1985 12:34:20.33 SERIAL# 158. KA86 REV# 5. - FATAL BUGCHECK MFG PLANT 15. MACHINECHK, Machine check while in kernel mode .A473373:...00.0 PROCESS NAME 00070122 PROCESS ID 80245D62 ERROR PC 045F0008 ERROR PSL N-BIT w INTERRUPT PRIORITY LEVEL = 31. PREVIOUS MODE = EXECUTIVE = KERNEL CURRENT MODE INTERRUPT STACK STACK POINTERS KSP JFFE7EQ0 ESP 7FFE9D80 SSP 7FFEDO4E USP 7FF6D91C ISP 806BOF50 GENERAL REGISTERS RO 00000000 RI1 00001F73 R2 000000AA R3 7FFBA207 R4 7FF83207 RS 7FFBA21C R10 7FFBA205 R6 7FFBA668 R1l1 7FFBA2CD R7 AP 7FFB9D40 00000003 R8 FP 7FFBA223 R9 SP 7FFE9DE4 00000006 806BOF94 SYSTEM REGISTERS POBR POLR | 80A5C200 P1BR 000002DB | 8027F200 P1LR 001FFB5B PO PTE BASE TOTAL PO Pl (VIRT ADDR) PAGES PTE BASE (VIRT ADDR) TOTAL NON-EXISTENT Pl 5-20 PAGES e THE VMS SBR O0FC3400 SLR 0000F300 PCBB 00423878 SCBB OOFBF600 SYSTEM PTE BASE TOTAL PAGES ASTLVL 00000004 SISR 00000000 ICCS 800000C1 SYSTEM °'SYSTEM' (PHY ADDR) SCB (PHY ADDR) NO AST'S INTERRUPT VIRT MEM PENDING REQUEST ACTIVE RUN ICR FFFFDA62 TODR 6D7B5185 INTERRUPT INTERRUPT ERROR ENABLE INTERVAL COUNT REGISTER 5-21 FILE (PHY ADDR) PCB BASE BASE EVENT = 0. THE VMS SYSTEM Example 9: EVENT FILE System Bugcheck VAX/ VMS (Entry SEQUENCE ENTRY COMPILED 1. 9-SEP-1985 09:11 PAGE LOGGED ON SID 0405F09E 4-JAN-1978 09:54:51.52 KA86 Unexpected REV# interrupt 5. SERIAL# NAME eNULL.:ccoosoooo PROCESS ID 00010000 80004680 ERROR PSL 04170000 MFG PLANT 15. or exception PROCESS ERROR PC 158. INTERRUPT PRIORITY LEVEL = 23. 
PREVIOUS MODE = KERNEL = KERNEL CURRENT MODE INTERRUPT STACK STACK POINTERS KSP 00000100 GENERAL | ESP 00000100 SSP 00000100 USP 00000100 ISP 806BOFAC R3 R8 FP R4 R9 SP REGISTERS RO R5 00006E2A 801BEEll RI1 R6 R10 00000000 R11 1. hkhkhhhhkk hhrhhh Kk kkhhhkkkR 0. NON-FATAL BUGCHECK UNXINTEXC, 040) SYSTEM ERROR REPORT kkkkkkkkkkkkkkkkkkkhkkkkkxkk*x ERROR Type 00006E29 801BEDD8 800036BO R2 R7 AP 00000000 805A86C0 FFFFFFFF 5-22 B8O1BEE1D 805A8A80 A0000000 00000149 00000000 806BOFFO THE VMS Example 10: Device Timeout VAX /VMS (Entry Type SYSTEM DEVICE" COMPILED 1., ENTRY ERROR SEQUENCE 334. "UNKNOWN EVENT FILE 096) ERROR REPORT khhkhkdkkhkhkhkhkhhhkdhddddhdhhhkiihikkikk SYSTEM 9-SEP-1985 09:16 PAGE 1. AR AARARR RN R RRRRARA ARR AR R LOGGED ON SID O4FFFFFF ENTRY 25-JAN-1985 KAB86 REV# 20:06:38.09 255. SERIAL# 4095. MFG PLANT 15. ERROR LOG RECORD ERF$L_SID ERLSW_ENTRY EXE$SGQ SYSTIME ERL$SGL_SEQUENCE O4FFFFFF SYSTEM UCB$B_ERTMAX 00 UCB$SB_DEVTYPE 00 0000 IRPSW_BCNT 0000 0000 UCB$W_ERRCNT 0001 UCBSL_OPCNT 00001B22 ORBSL_OWNER 00000000 dCBSLmDEVCHAR 0C402000 IRP$W_FUNC RETRIES = = 334. = 0. 0. IOSB STATUS DEVICE CLASS DEVICE TYPE REQUESTING = = 32. 0. PROCESS ID TRANSFER BYTE OFFSET TRANSFER 800029CO0 UCB$W_UNIT ERROR LOGGED RETRIES DEVICE 000100F7 IRPSW_BOFF UCBSB_SLAVE REMAINING FINAL WHEN ERROR SEQUENCE MAXIMUM 20 UCBSL_MEDIA 64 BIT TIME 0000022C 00000000 0150 UCB$B_DEVCLASS IRPSL_PID ENTRY TYPE UNIQUE 00 UCB$W_STS ERROR 93C156A0 008D7AS56 014E UCB$SB_ERTCNT IRP$Q IOSB ID REGISTER 0060 BYTE = COUNT = 0. 0. DEVICE DEPENDANT PHYSICAL ADDRESS PHYSICAL UNIT UNIT ERROR COUNT = UNIT OPERATION OWNER UIC 0. = 6946. 1, COUNT [000,000] CHARACTERISTICS DEVICE SLAVE QIO 5-23 = = DEVICE 00 0020 NUMBER CONTROLLER FUNCTION CODE = 0. THE VMS SYSTEM EVENT FILE DDBST NAME 3031300A 24325035 00415358 00000000 /00105P2$XSA0000./ LONGWORD 1. 00000009 LONGWORD 2. 00004091 LONGWORD 3. 00000000 LONGWORD 4. 00000001 LONGWORD 5. 00000100 LONGWORD 6. 00000000 LONGWORD 7. 00000005 LONGWORD 8. 
-~ 00000C18 LONGWORD 9. 0000401F LONGWORD 0000002F 10. 5-24 THE VMS Example 11: Asynchronous VAX /VMS Device Attention SYSTEM SYSTEM (Entry Type ERROR REPORT EVENT FILE 098) COMPILED 9-SEP-1985 09:22 PAGE ****************E******** **** 1, ENTRY ERROR SEQUENCE 37. DEVICE ATTENTION CI SUB-SYSTEM, 24-JUN-1985 CNFGR - SERIAL¥ 624. PORT ERROR BIT(S) RESTARTED, 50. OF 50. MFG PLANT 15. SET RETRIES REMAINING 00100038 ADAPTER PMCSR SID 0405F270 09:10:59.99 REV# 5. _F$PAAO: PORT WILL BE **kkkhkkkhhkhhhkrhhhhhhhh ik hh LOGGED ON KA86 IS "CI" COMMAND TRANSMIT TIMEOUT 0000004C MAINTENANCE MAINTENANCE PSR 00000001 PFAR 80F89DBC PESR PPR 32 UCBSB_ERTMAX 32 UCBSL_CHAR INTERRUPT ENABLE INTERRUPT FLAG PROGRAMMABLE STARTING ADDRESS RESPONSE QUEUE AVAILABLE 00000000 03F80007 UCB$B_ERTCNT 0C450000 50. RETRIES REMAINING 50. RETRIES ALLOWABLE SHARABLE AVAILABLE ERROR LOGGING UCBSW_STS CAPABLE OF INPUT CAPABLE OF OUTPUT 0810 ONLINE UCB$W_ERRCNT SOFTWARE VALID 0001 l. 5-25 1. ERRORS THIS UNIT VMS Example SYSTEM 12: EVENT Unknown FILE Entry VAX/VMS i (Entry Type SYSTEM khkdhhkhkhhkhhkhhkhkhkhkhkhkhkhhkhkhkhkkkhkx ERROR SEQUENCE "UNKNOWN ERROR LOG 273) COMPILED ERROR REPORT 1. ENTRY 4-FEB-1985 KA86 REV# PAGE LOGGED ON SID MFG PLANT 15. O4FFFFFF 16:32:11.75 255, SERIAL# 4095. RECORD ERF$L_SID 04FFFFFF ERLSW_ENTRY SYSTEM ID 46F32860 ERL$GL_SEQUENCE 008D8214 0053 LONGWORD 0000005A ERROR ENTRY 64 VARV 5-26 TYPE BIT TIME WHEN UNIQUE 1. REGISTER 0111 EXE$GQ SYSTIME 1. *RAAKARRKRRRR AR AR AR ARk k& 83. ENTRY" 6-SEP-1985 13:44 ERROR LOGGED ERROR SEQUENCE 83. ) THE THE VMS SYSTEM EVENT FILE Example 13: The qualifier. Refer rejected entries. RERF-I-INPUT, Example following to the printout 1is the product of the /LOG /REJECT qualifier for an explaination of SYS$SSYSTEM:ERRLOG.SYS, 14: /REGISTER DUMP The following qualifier. 5 selected, printout The cryptic is 12 the format rejected product of (shown below) the can be used to identify control and status bits common to multiple entries. 
VAX/VMS SYSTEM COMPILED ERROR REPORT 11-SEP-1985 11:33 PAGE 1. CSR CR SR DCR 00000028 00000028 00000028 00000028 00000028 0000007C 0000007C 0000007C 0000007C 0000007C 00000001 00000001 00000001 00000001 00000001 08000028 00000000 0000F86D 08000028 00000000 0000F86D 08000028 00000000 0000F86D 08000028 00000000 0000F86D 08000028 00000000 0000F86D Example 15: The following printout is FUBAR FMER the product of the /STATISTICS qualifier. VAX/VMS SYSTEM ERROR REPORT COMPILED 11-SEP-1985 PAGE PROGRAM RUNTIME STATISTICS TIMES IN SECONDS PAGE CPU ELAPSED FAULTS 1.3 4.4 150 DIRECT BUFFERED 17 7 I/0 I/0 09: 13 1. THE VMS SYSTEM Example 16: qualifier. Report. EVENT FILE The following printout Specifically this is | VAX /VMS is a the product of the /SUMMARIZE sample of the Device Summary SYSTEM ERROR REPORT | DEVICE ROLLUP DEVICE _HSC003$DUA1: _HSCO03$DUA2: COMPILED 13~SEP~1982 09:§2 | LOGGED BY SID -~ QIO TIMEOUT ERRORS THIS QIOS THIS SESSION SESSION HSCO03$DUAS: - _HSCO002$DUA2: _HSCO002$DUA3: _HSCO002$DUA4: - HSC002$DUAS: _HSC002$DUAS8: | _HSCO002$DUA9: i - [SOFT] | [HARD] [SOFT] 3. 0. 0. 0. 0. 7. 0. 0. 0. 0. 4, 0. 0. 0. 0. — 0. 22, 0. 0. 31. 23152. . 0. 0. 4, 10321. -, 0. | _ - e 0. . 0. . ” _HSC003$DUA4: | | HSCO002S$DUALl: -~ . TM ERROR BITS [HARD] | PAGE 0405F270 SET _HSCO03$DUAO: - -~ | ~ | | > | 0. 8. 0. 2. 0. 0. 0. 0. 0. 1 TM 0. 0. 0. 0° —~ | W’“ 0. 1. 0. 0. 0. —~ 2563, ~ 0. 0. TM 0. 0. 5 00 0« Mk 0. 0. 0. | 0. 2. 0. 0. 7. 0. 0. 0. 11. 0. 0. 00 3‘ 00 00 0. 2. | 0. 0. 5. 0. | 0. 1. | - _HSC002$DUA10: _HSC002$DUA11: v HSC002$DUA12: - _HSCO002$DUA13: | 0. 0. | 0. 0. | — _F$PAAOQ: _FSLCAO: _ n - . 8. 6. 0. 0. 4. 0. 0. 0. 5-28 | 1. 0. 1. 0. “ . ‘ THE Example 17: qualifier. Report. The following printout Specifically this is VAX /VMS SYSTEM is a VMS SYSTEM EVENT FILE the product of the /SUMMARIZE sample of the Volume Summary ERROR REPORT COMPILED 13-SEP-1985 0.:42 PAGE VOLUME LABEL(S) LOGGED BY SID 0405F270 QIO(S) ERROR(S) MOUNT(S) LABEL _CsAl: LABEL _CsaAl: 19. 28. 0. 2. 
17547. 0. 1. 96. 0. 24. SCRATCH _DUA6: LABEL 0. Exchange _CSAl: LABEL 273. VAX console 2. THE VMS SYSTEM Example 18: qualifier. EVENT The FILE following Specifically VAX /VMS printout this SYSTEM is is the a sample product of the of the /SUMMARIZE Entry Summary Report. COMPILED 13-SEP-1985 09:42 ERROR REPORT PAGE SUMMARY OF ALL MACHINE UBA SBIA ENTRIES LOGGED BY SID CHECK 5. INTERRUPT 3. ERROR 1. CPU ERROR HALT SYSTEM START-UP 24. 34. ERRLOG.SYS CREATED 1. FATAL BUGCHECK 21. TIME-STAMP 106. VOLUME MOUNT 608. VOLUME DISMOUNT ATTENTION 215. DEVICE 8. 20. 92. 1. ERLSLOGSTATUS ERLSLOGMESSAGE ERLSLOGMSCP - 0405F270 UNKNOWN ENTRY TYPE DATE OF EARLIEST DATE OF LATEST 4, ENTRY 16-JUN-1985 9-JUL-1985 ENTRY 5-30 23:12:17.38 22:28:29.81 3. THE The following printout Specifically this is Example 19: qualifier. VMS SYSTEM EVENT FILE is the product of the /SUMMARIZE a sample of the Histogram Summary Report. VAX /VMS SYSTEM ERROR REPORT COMPILED 13-SEP-1985 09: 42 4. PAGE PROCESSED ENTRIES 00:00 01:00 02:00 03:00 04:00 05:00 06:00 07:00 08:00 09:00 10:00 11:00 12:00 13:00 14:00 15:00 16:00 17:00 18:00 19:00 20:00 21:00 22:00 23:00 HOUR-OF-DAY HISTOGRAM LOGGED BY SID 0405F270 9. 34. 23. %%k kkkkkkk 51. 46. 4. 10. khkhkkhkhkhkhkhkhkhkhkhkhkhkhkhkhkkhkhkhkhhkhkhhkhkhkhkhkhkhkhkhkkhkhkhkhkkkkkhkkkkk %okd ok dkkkdkhkddkdhhkdkdkdhkdkdkhkdkdkkdhkdkdhdkhkkkkkkkk khkkhkkhkhkhkhkhkkhkhkhhkkhkhkkhkkk khkkhkkkhkhkhkhkkhkhhkhkhhkhkhkhkhkhkhkhkhkhhkhhkhkhhhhkhhkhkhkhkhkhkhkik * %k k% khkhkhkkkkhkkk 32. khkkhkhkhkkhkhkhkhhkhkhkhhhkhkhhhkhkhkkihkkkkikk 34. khkhkkhkkhkhkhkkhkhkhkhhkhhhkhkhkhkhkhkkhkhkkhkhkhkkikx 123. khhkkhkhkhkhkhkkhkkhhhkhhkhhkhkhhhkhkhkhkhkhkhkhkhkhkhkhhhkhkhkkdhkhkdkkhkikkkhkikk ' 36. hkkkhkkhkhkhkhkhkhkhkhkdhhhhkhkhhkhkhkhkhkhkhhkhhdhhkikikk 28. khkkhkkhkhkhkhkhkhkhkhkhhhhkhkhhkhkdhkihkik 55. ************************************#************* 54. khkkhkhkhkhhkhkhkhkhhhkhhkhkhkhkhkhkhkhkhkhkhkhkhkhhhhkhkhkhhhkhkhhkhbhkhkhkhkkk 65. 
63. 71. 60. 41. 110. 98. 23. 67. 6.

APPENDIX A
ERROR HANDLING MICROCODE (EHM) FLOW CHART

This Appendix contains a set of flow charts that describe how the Error Handling Microcode (EHM) captures the state of the CPU, clears the error condition, rolls back the RLOG and PCs, and passes control to the VMS Machine Check Handler (MCHK). The notes following the flow charts explain what is happening at each block in the flow.

The flow charts and the notes represent the first revision of the EHM, first shipped with VAX 8600/8650 Console RL02 pack Rev. 2.1.

Figure A-1 Error Handling Microcode Flow Chart (Part 1 of 20): entries at trap vectors 8, 2, 11, 4, and 6
Figure A-1 Error Handling Microcode Flow Chart (Part 2 of 20): entries at trap vectors 18, 10, 1E, and 1F
Figure A-1 Error Handling Microcode Flow Chart (Part 3 of 20): MEAR save flow
Figure A-1 Error Handling Microcode Flow Chart (Part 4 of 20): interrupt handling, EHM.ENTRY.VECTOR8.30, EHM.DOUBLE.ENT.CHECK, CSM.ENTRY.DE, EHM.PREPROC
Figure A-1 Error Handling Microcode Flow Chart (Part 5 of 20): EHM.RESET.EBOX, EHM.CHECK.STACK
Figure A-1 Error Handling Microcode Flow Chart (Part 6 of 20)
Figure A-1 Error Handling Microcode Flow Chart (Part 7 of 20)
Figure A-1 Error Handling Microcode Flow Chart (Part 8 of 20)
Figure A-1 Error Handling Microcode Flow Chart (Part 9 of 20): EHM.FBOX, EHM.MBOX.FIXUP
Figure A-1 Error Handling Microcode Flow Chart (Part 10 of 20)
Figure A-1 Error Handling Microcode Flow Chart (Part 11 of 20)
Figure A-1 Error Handling Microcode Flow Chart (Part 12 of 20): EHM.A&B.PE
Figure A-1 Error Handling Microcode Flow Chart (Part 13 of 20): EHM.BIT38.FIXUP, EHM.SBE.FIXUP
Figure A-1 Error Handling Microcode Flow Chart (Part 14 of 20)
Figure A-1 Error Handling Microcode Flow Chart (Part 15 of 20): EHM.CS.CORRECTION, EHM.READ.CSES
Figure A-1 Error Handling Microcode Flow Chart (Part 16 of 20): EHM.TST.PE
Figure A-1 Error Handling Microcode Flow Chart (Part 17 of 20): EHM.HIDDEN.OPBUS.PE, EHM.POST.PROC
Figure A-1 Error Handling Microcode Flow Chart (Part 18 of 20): EHM.BUILD.MBOX.SF, EHM.MBOX.RESET
Figure A-1 Error Handling Microcode Flow Chart (Part 19 of 20): IE.EHM.ENTRY
Figure A-1 Error Handling Microcode Flow Chart (Part 20 of 20): CSM.ENTRY.DE, CSM.ERROR

NOTE 1
The entry at trap vector 8 is assigned to both EBox errors and MBox fatal errors, at main priority 0, sub-priority 0, and TB parity errors associated with EBox requests at main priority 0, sub-priority 1. These errors have the highest priority because they could affect the instruction the EBox is executing, which in turn may affect the ability of the system to retry the operation.

This is also the entry point for ESC parity errors that are detected during ESC parity checking.

NOTE 2
The EHM stops IBox references and checks to see if EHM ENTERED (EHSR <06>) is set. If it is, STATE <07> is checked. If set, the EHM simply does a RETURN [0], which will return the EHM to the microinstruction after the one that incurred the parity error (the EHM is checking for ESC parity errors).

To check the GPR/ESC for parity errors, it is necessary to examine each ESC location on both the A-side and B-side. This is carried out by EHM, after EHM ENTERED is set. This code also sets STATE <07>, to indicate that ESC parity error checking is in progress. In the event that there are parity errors, the fact that reading the ESC will cause a trap to vector 8 requires the EHM to check EHM ENTERED and STATE <07> to see if we had been checking for ESC parity errors. If so, the parity checking code will use the status latched in the PDP MCA to indicate which side of the ESC location we were reading had the parity error.

One pitfall is that checking for EHM ENTERED requires reading ESC DA (EHSR). Therefore, if a parity error is encountered while reading EHSR, the EHM will be in a tight loop of 008 02F 008 02F, etc. (for EHM revision 1).

If we are not testing for ESC parity errors, STATE <07> is not set and it is an EBox double entry error (the second time through a trap to 8). EHM will transfer EBCS to the VMQ (visible on the EVA) and go into the sunset loop at uPC 21.

If EHM ENTERED is not set, the shift counter will be loaded with the vector, 8, and EHM will join the common flow.

NOTE 3
The entry at trap vector 2 (main priority 3, sub-priority 2) does not go directly to EHM, but to the main flow of EBox microcode. This code will stop IBox references, assemble FBXERR from FBox registers 1, 2, and 3, then set EHSR <04> to indicate that there is an FBox service request. The trap vector, 2, is loaded into ESC 12, then control is passed to EHM at the entry point for trap 2 and trap 6 (EHM.ENTRY.VECTOR8.30).

During an FBox operation, a problem may arise which the FBox cannot process on its own. The FBox asserts FBOX PROBLEM, FBOX ABORT, and FBOX WRITE PROBLEM when it drives the WBus. When the EBox forks to get the destination, it asserts CP SYNC, which allows the EBox trap logic to process FBOX PROBLEM. If the destination address is for a GPR, the trap is allowed (trap vector 2), the WBus write will be aborted to prevent writing bad data into a GPR, and the IBox is prevented from writing the ESA (allowing for the retrieval of the starting address of the instruction with the problem).

If the destination address is a cache address, the EBox trap logic is prevented from generating a trap to 2 because an operand write is in progress. Because the MBox would be writing the result to the cache, it recognizes that FBOX WRITE PROBLEM is asserted. In response to the error, the MBox will assert a port status code of 9, which will cause an operand port trap to 11.

The trap to 11, like a trap to 2, is initially handled by the main body of EBox microcode. It will do the same as it does for a trap to 2: stop IBox references, assemble FBXERR from FBox registers 1, 2, and 3, set EHSR <04> (FBox service request), load ESC 12 with the vector, 11, then pass control to EHM at the entry vector for trap 2 and trap 6.

NOTE 4
The EBox trap logic will generate a trap to 6 (main priority 3, sub-priority 6) at IRD time if there is an interrupt pending. The EBox exception and interrupt microcode will initially process the interrupt. It will first stop IBox references, set EBCS <13> (WBus register) to clear IBox error, then unwind the PCs and RLOG. If CSLINT <20> is set, it is an external interrupt. CSLINT <19:16>, the interrupting IPL, is checked for 0 (zero), and if 0, it must have been a spurious interrupt and the next macro instruction is executed. If the IPL is not 0, then the I/O adapter interrupt is attended to.

NOTE 5
If CSLINT <20> is not set, it is an internal interrupt. The microcode will branch on CSLINT <29:23> to determine which internal interrupt is the pending interrupt. An MBox interrupt is indicated by CSLINT <27> being set. For the MBox interrupt, the exception and interrupt microcode will set EHSR <07> to indicate that there is an MBox service request, clear EBCS <02:01> (mem/GPR write and I/O read flags), load ESC 12 with the vector, 6, then pass control to EHM microcode at EHM.ENTRY.VECTOR8.30.

NOTE 6
This is the entry point for vector 4, MBox Error Address Register Full (MEAR full), at main priority 3, sub-priority 4. This trap is recognized at IRD time.

MEAR, the MBox error address register, holds the offending address for the last detected error. Whenever the MBox detects an error condition, it latches the error in the various error registers and tries to tell the EBox about the error via trap 8 for fatal MBox errors, or trap 6 (interrupt) for non-fatal errors. If the error is not a fatal error, and the CPU is operating at an IPL above 1D, the IPL for MBox interrupts, then the EBox won't be notified of the error condition. In this case, the MBox would not be able to record the address and details of a second error, because the first error has not been logged. To remedy this situation, the trap to 4 will save MEAR if it hasn't already been saved.

The EBox EHM microcode will stop IBox references, clear the port status, and set EHSR <24> to indicate that there has been a trap to 4, then check EHSR <26> to see if MEAR has already been saved.

NOTE 7
If EHSR <26> is already set, MEAR.SAV is already full and we have nowhere to save MEAR. When we finally get to handle the error, the MBox multiple error bit will be set. So, MEAR will be cleared, the PCs and RLOG will be unwound, and the next macro instruction will be executed. In addition, if ECC correction is inhibited (MBox register 10 <02>), EHSR <26> is cleared and the MBox is reset (see note 60).

NOTE 8
If MEAR has not been saved, MEAR (MBox register 7C) will be copied to MEAR.SAV, ESC DB, and EHSR <26> (MEAR saved) will be set. Then MEAR will be cleared and the MBox stack frame will be built (see note 59). Call a routine to check for the hidden OPBus parity error (see note 53).

On return, if the conditions to detect an OPBus parity error (CPR parity error) exist, then join the regular EHM flow to process the error (note 15). Otherwise the next macro instruction will be executed (see note 7).

NOTE 9
Trap vector 10 is assigned to an MBox error detected during an OP-port write (main priority 1, sub-priority 0), or IBox errors detected during an EBox read IMD operation (main priority 5, sub-priority 0).

This trap vector is also assigned to TB parity errors that are associated with an OP port write request (main priority 1, sub-priority 1) and TB parity errors that are associated with an OP port request during an EBox read IMD operation (main priority 5, sub-priority 1).

EHM will stop IBox references and load the SC with the vector, 10, then join the common flow for IBox errors.

NOTE 10
Trap vector 18 is assigned to IBox errors detected during an EBox fork (main priority 4, sub-priority 0), EBox read ID operation (main priority 6, sub-priority 0), or an EBox string read operation (main priority 7, sub-priority 0).

This trap vector is also assigned to TB parity errors associated with IBuffer port requests during EBox fork (main priority 4, sub-priority 1) and TB parity errors associated with OP port requests during EBox string reads (main priority 7, sub-priority 1).

EHM will stop IBox references and load the SC with the vector, 18, then join the common flow for IBox errors.

NOTE 11
Trap vector 1E is assigned to IBox errors detected during an IBox flush and load (CPC SYNC) operation (main priority 2, sub-priority 0).

EHM will stop IBox references and load the SC with the vector, 1E, then join the common flow for IBox errors.
vector, 1E, then join the NOTE 12 to 1Box errors detected during Trap vector 1F is assigned 2, an IBoOX RLOG unwind operation (main priority sub-priority 0). and load the SC with the EHM will stop IBox references on flow for IBox errors. vector, 1F, then join the comm k for an IBox error will clear EBOX status toanduse chec addressing an attempted operand specifier that (res fault, mode ng essi erved addr mode that is not allowed was a RAF. NOTE 13 Any IBESR <27> will be set if there If there is a RAF, IBESR <26:21> is RAF). checked for any If s, any other IBox errors. the non-zero bits, in other word RAF, the with g alon rs, there are any other IBoxX erro error will be attended to (hardware error). This will insure that a hardware problem precedence than a RAF. | will | have higher rs, the RAF vector, 1C, I1f there are no other IBox erro EBCS <13> will be cleared will be loaded into ESCrol12,will be transferred to the (IBox error bit), and cont RAF fault. EBox main microcode to handle the - A-25 ERROR HANDLING NOTE MICROCODE If 14 there (EHM) was interrupt), Up at 15 For contents of PC in ESC 2F contains the errors eéxcept double NOTE 16 problem), FBox ESC an FBOx registers 30 be the for will EBox 1, 2, and to ESC 1loaded the set 3. Tt The 31. ESC into the vector (in sC) 16 to prepare for a possible second error occurred after EHM EHM for the (EHSR error, end FBXERR safekeeping., be errors, (MBox error, transferred vector 6 code will <06>), ESC 16 we will (vector) prepare is loaded into C0) <15:08> and a CSM entry code of 5 error was detected) is 1loaded into <07:00>. ESA is read, then wri tten into ESC CSM.STATUS and For ESC CSM.STATUS (ESC (non-EBox double 2E, (FBox | ENTERED had been CS . wai M t loop. main 4 will the counter. all to will be loaded into double entry. I1f a For to (FBox write problem), transferred which shift NOTE trap 11 R8.30. 
the be return 12, ga or EHM.ENTRY.VECTO contains will FLOW CHART control is EBox microcode transferred (see note to CSM.ENTRY.DE 63). in the If EHM ENTERED is not set, we join the fir st cammon point for all trap vectors, The EHM follows a com mon path for the remainder of the code, EHM will SC, All clear 1into constants scratch eérror, pad and EBox EHSR status, <23:16>, used by EHM registers ESC locations the ESC load then the set will trap EHSR be vector, <06>, EHM rewritten from ENTERED. into to prevent an 10 through inadvertent 2F (st the parity ack frame) will written with FFFFFF FF. ESC 17 will be loa ded with 58 (Hex), the number of bytes in the stack frame, the contents of the be popped EBCS VMO off will (process NOTE 17 ESC 1A be abort <09> is at stack transferred code) is checked path parity EDPSR is assembled error. 18 time to of written to ESC see if the into 1A, cleared. and error ESC ECS will 19, 1A there SO, ESC from 1B the sjx 4-bit is is an first EBox cleared, registers into ESC 1B. If there ESC 1B is cleared. is be the <19:16> | 1If PDP MCAs and written data path parity err or, NOTE the and data then in the no EBox The CSLINT register is loaded into ESC 1C and the IBESR assembled in ESC 1D; ESC 1D <31:16> from IBE and 1D <15:00> from EDM EsSC S. ESC 1D <02:00> will then be according set to the valid bits for Esa, ISA, or CPC. has to be done at This this time because it is the EBox signal error (if there was an EBox error) that is hol ding the valid bits, and the EBo is x A 100 (Hex) Writing a 1is 1logic then "1" eérrors written A-26 +to bit will to 08 be WBus will cleared shortly. register clear EBCS EBCS, <15>, ERROR HANDLING MICROCODE <12:09>, NOTE 19 and <04>, clearing (EHM) FLOV EBOX errors. The data stack, bank F of ESC, will be checked for pariv, errors (see note 52). Part of the set up is to set STATE <07> to allow for the trap to 8 which will occur for an ESC parity error. 
During ESC parity error checking, an ESC parity error looks like a second error, but the EHM, upon seeing STATE <07> set, will cause the next microinstruction after the microinstruction causing the parity error to be executed (see note 2) to allow the completion of ESC parity error checking.

An error return will occur only if there is a parity error in both sides of the GPR/scratch pad registers, and FFFFFFFF is written to the failing ESC location. If there is a parity error in both sides of the scratch pad stack, the only thing lost is data used by the microcode during the execution of an instruction, and the instruction will be retried.

The non-error return will occur if there was no parity error, or a parity error on one side only, in which case the good side is copied to the bad side.

All 16 locations in the block will be checked, so if there is more than one parity error, EDPSR will contain the information for the last parity error.

NOTE 20
EHM pops off the last two words that the EBox wrote to cache from the scratch pad stack and stores them in ESC 1E and 1F. The PSL is stored in ESC 2F, the IVA in ESC 20, VIBA in ESC 21, ESA in ESC 22, ISA in ESC 23, and the CPC in ESC 24.

NOTE 21
If we are here because of an FBox problem that caused a trap to 2 (EHSR <04> = 1), then ESC 30 (FBXERR) will be loaded into ESC 2C in the stack frame. If it was an MBox interrupt, then we will load the return PC, from ESC 31, into ESC 2E. If either 2C or 2E are not written at this time, they will remain all F's.

The remainder of the MBox stack frame will be built at this time (see note 59), followed by resetting the MBox (see note 60).

NOTE 22
EHM sets up to test ESC bank 0 (GPRs) for parity errors (see note 52). Part of the set up is to set STATE <07> to allow for the trap to 8 which will occur for an ESC parity error. If there are no errors, or an error in one side only, the return is RETURN.NO.ERROR. An error return is made only when there is a parity error on both sides.
For this case, the bad GPR location is written with FFFFFFFF, the process abort bit is set (EHSR <25>), and a process abort code of 1 (unrecoverable GPR parity error) is written to EBCS <19:16>.

For an error return, or non-error return, the GPR is rewritten to ensure that the FBox and IBox contain the same data as the EBox.

NOTE 23
Once the GPRs have been tested, EHM sets up and checks the remainder of the scratch pad registers, the constants and architectural registers, in banks 3 through E (see note 52). Part of the set up is to set STATE <07> to allow for the trap to 8 which will occur for an ESC parity error. For these banks of scratch pad locations, an unused bit, bit 08, in EHSR will be set if there is a parity error in both sides of the ESC. EHSR <08> will be checked at a later time, and if it is set, and if EHM was entered at vector 8 (EBox error or MBox fatal error), will force EHM to a sunset loop at uPC 20 (see note 39).

NOTE 24
If an EBox control store parity error caused the microtrap, the parity error has already been corrected by the console. EHM will read the control store error correction information from the CBus interface (CSES) and write it into the stack frame (see note 51).

When an EBox control store parity error (CSPE) occurs, the EBox stalls because the parity error prevents clocking the control store data register. The parity error will generate an interrupt to the console, which will correct the parity error. After correcting the control store parity error, the console will use the SDB control channel to generate a "CSPE RESET" in the EBox, which will cause a microtrap to vector 8.

NOTE 25
If there was a WBus parity error, there is a possibility that bad data has been written somewhere, so set the process abort bit (EHSR <25>), and load EBCS <19:16> with a process abort code of 2 (WBus parity error).
EBCS <09> will also be set to indicate an EBox data path parity error.

NOTE 26
The trace pending (TP) bit is reset to prevent a trace trap when the REI instruction is executed.

IBESR <28> indicates whether the IBox selected the WBus (logic 1) or GPR (logic 0) data. If IBESR <23> (IBox AMUX PE) is set, IBESR <28> indicates where the data, and the parity error, came from: WBus or GPR. But the VMS error formatter interprets it as a WBus parity error if IBESR <28> is set, even if IBESR <23> is clear. Therefore, EHM will clear IBESR <28> if IBESR <23> is not set.

NOTE 27
If there is an IBox control store parity error or IBox dispatch RAM parity error, they must be corrected before the IBox error is cleared in EBCS, because it is the IBox error bit that holds the CSPE error address.

If there is an IBox DRAM or control store parity error, EHM sets EHSR <28> or <27>, and the parity error will be corrected (see note 50). EBCS <13>, IBox error, is cleared after the error has been corrected.

NOTE 28
EHM checks IBESR <23> to see if there is an IBox AMUX parity error. If there is, it then checks to see if the AMUX is enabled for the WBus or the GPRs. If IBESR <28> is not set, the GPRs were selected and the error is an IBox GPR parity error, so EHSR <00> is set. If IBESR <28> is set, the data from the WBus caused the parity error. In this case, the problem is most likely in the IBox WBus receivers or the AMUX itself. (If the WBus had been the cause of the parity error, the EBox should have detected the error.)

NOTE 29
If the IBox error was an RLOG parity error, as indicated by IBESR <24>, then we cannot do an unwind. EHSR <25>, process abort, is set, and a 6 is written to EBCS <19:16> to indicate that the process abort was caused by an RLOG parity error. If there was an RLOG parity error, EHM will now go to the FBox fixup section.
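
The IBESR <23> and IBESR <28> handling in notes 26 and 28 can be illustrated with a short sketch. This is only a Python model written for this description, not EHM microcode: the function names and the returned strings are hypothetical, and only the bit positions (IBESR <23>, IBESR <28>, EHSR <00>) come from the notes.

```python
# Bit positions taken from notes 26 and 28; everything else is illustrative.
IBESR_AMUX_PE = 1 << 23      # IBESR <23>: IBox AMUX parity error
IBESR_WBUS_SEL = 1 << 28     # IBESR <28>: 1 = WBus data selected, 0 = GPR data
EHSR_IBOX_GPR_PE = 1 << 0    # EHSR <00>: IBox GPR parity error


def normalize_ibesr(ibesr):
    """Note 26: clear IBESR <28> whenever IBESR <23> is not set, so that
    an error formatter does not report a WBus parity error by mistake."""
    if not ibesr & IBESR_AMUX_PE:
        ibesr &= ~IBESR_WBUS_SEL
    return ibesr


def classify_amux_error(ibesr, ehsr):
    """Note 28: decide where an IBox AMUX parity error came from.

    Returns a (source, ehsr) pair. The source strings are hypothetical
    labels: "none", "gpr" (EHSR <00> gets set), or "wbus" (suspect the
    IBox WBus receivers or the AMUX itself).
    """
    if not ibesr & IBESR_AMUX_PE:
        return "none", ehsr
    if ibesr & IBESR_WBUS_SEL:
        return "wbus", ehsr
    return "gpr", ehsr | EHSR_IBOX_GPR_PE
```

For example, an IBESR with only bit 23 set is classified as a GPR-side error and returns an EHSR with bit 00 set, while an IBESR with only bit 28 set is simply normalized to zero.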
NOTE 30
With no RLOG PE, if EHSR <07> (MBox service request) is set, we have already done an unwind, so control is transferred to the FBox fixup section.

If there is not an MBox service request, unwind the PC's and RLOG, then check the validity of the PC's. If all of the PC's, ESA, ISA, and CPC, are invalid, set EHSR <25> (process abort) and load EBCS <19:16> with a process abort code of 3 (all IBox PC's are invalid).

NOTE 31
In the FBox fixup code the first check is for an FBox interrupt, EHSR <04>. If there was not an FBox interrupt, EHM will load ESC 2C, FBXERR, with FFFFFFFF, then reset the FBox and initialize the FBox temporary scratch pad registers.

If there was an FBox interrupt, a further check is made to see if it was caused by an FBox GPR parity error, as indicated by FBXERR <17> being set. If this is the case, EHSR <03> will be set.

NOTE 32
FBXERR <21:19> are checked to see if there was an FBA control store parity error, FBM control store parity error, or FBox dispatch RAM (FDRAM) parity error. If any of these bits are set, the set bits are transferred to the corresponding bits in EHSR <31:29>. The parity error will be corrected by the console (see note 50). After the console completes the correction, EHM will reset the FBox and initialize the FBox temporary scratch pad registers.

NOTE 33
EBCS <02>, MEMORY OR GPR WRITE, will be set when the EBox writes to a GPR, which it will do on a floating point instruction when the FBox passes the data to the EBox.

If there is an FBox error, FBOX PROBLEM will prevent the EBox from writing the GPR, but EBCS <02> will still be set. Therefore, the machine state has not changed, but EBCS <02> contains an erroneous indication. For string instructions, the bit could be set because the string instruction has previously written a GPR. However, a string instruction should never get a trap to vector 2, FBOX PROBLEM. Therefore, the first part done bit (FPD, PSL <27>) is checked, and if this is not a string instruction (FPD clear), EBCS <02>, MEMORY OR GPR WRITE, is cleared.

NOTE 34
At this point, EHM enters the MBox fixup section. The first check is for the case where the MBox sent bad parity data but the EBox didn't detect a parity error (see note 53). If the EBox didn't detect the OPBus parity error, process abort, EHSR <25>, will be set, a process abort code of 4 (EBox failed to detect OPBus byte parity error) will be written to EBCS <19:16>, and EHSR <09>, an unused bit, will be set. EHSR <09> will be used as a branch condition at a later time (see note 44).

NOTE 35
If we have either of MSTAT1 <23:22> set, we had a CPR (Cycle Parameter RAM) parity error, and must set the process abort bit, EHSR <25>. A process abort code of 5 (CPR parity error that did not cause an MBox fatal error) will be written to EBCS <19:16>.

NOTE 36
If any of bits MSTAT1 <11:08> are set, we had a TB parity error, and it is necessary to clear the TB location. EHM will clear the entire translation buffer. Also, EHSR <09>, an unused bit, will be set. It is used as a branch condition at a later time (see note 44).

NOTE 37
Here we check to see if we had a cache tag parity error or a cache written bit parity error, MSTAT2 <06:05>. If DMAs are enabled, they will be disabled for a short time.

If the cache in error is not turned on, neither cache will be cleared or swept, but if the cache in error is turned on, the other cache will be swept.

For a cache tag parity error, the offending cache block will be cleared. For a cache written bit parity error, the offending cache will be swept, then cleared.

EHM then clears EBox status, clears the MBox error registers, and turns on the caches as they were to begin with. (The original cache enable bits are saved, then restored, because the cache enable bits are used to force a cache sweep or to clear cache.)
(The original cache enablee bits are used to force enabl restored, because the cache a cache sweep or to clear cache.) Dbit DMAs will be enabled at this time if the DMA enable: was originally set. ty error with VMS will bugcheck if we had a cache tagwe pari don't know whose " the written bit set. 1In that case, data got lost. Go to EHM.EXIT to figure we're in and how to exit EHM. NOTE 38 out - what mode has been We will end up here, at EHM.EXIT, unless there<23:2 1> will EBCS SBE). 0 an SBE (unrecorded check bit (The ion. revis frame be loaded with 0001, the stack O stack frame, and this KA8600 was announced witha Rev. is the first revision to EHM.) EHSR (ESC DA) is rotated right 8 to get the bits into correct position, then loaded into EHMSTS (ESC 18). NOTE 39 in both If EHSR <08> is set, there was a parity error location a in sides of the GPR/scratch pad registers, banks ters, regis reserved for constants and architectural 3 through E. I1f, in addition, the microtrap vector is 8 (EHSR <23:16>a = 8, EBox error or MBox fatal error) EHM will go into sunset loop at microPC 20. NOTE 40 error but the processor code 1if cleared and I1f there was an uncorrectable ESC parity vector was not 8, control will be passed to the EHM post processor code, which will figure out what mode we are currently in and how to exit EHM (see note 58). Control will also be passed to the EHM post there was not an uncorrectable ESC parity error. NOTE 41 If CSM was active, ESC DA control passed to (EHSR) will be CSM.ERROR in CSM microcode (see note A-31 "ERROR HANDLING NOTE 42 MICROCODE (EHM) FLOW CHART If VMS ENTERED is set, we have a doubl e ‘The second error occurred while the handler was active. error condition. VMS machine check CSM.STATUS (ESC C0) <07:00> will be loaded with 5, the CSM entry code for a non-EBox double error . The PC will be loaded from ESASAV and control will be passed to CSM.ENTRY.DE in main EBox microcode (see note 63). 
NOTE 43
This is the regular exit from EHM to the VMS machine check handler (CSM is not active and VMS ENTERED is not set). EHM sets VMS ENTERED (EHSR <05>); FPD (PSL <27>) is not set. Control is passed to IE.EHM.ENTRY (see note 61).

NOTE 44
Due to an MBox bug, a parity error in check bit C0 may not get latched as an SBE. The MBox detects the error and delivers an interrupt, but MDECC <21> (SBE) may not get set. Check for any other MBox errors; single bit errors are found by first eliminating the other MBox errors.

If EHSR <09> is set we had a hidden OPBus parity error or TB parity error. In this case EHM will go to EHM.EXIT to figure out what mode we are in and how to exit EHM (see note 38).

If MSTAT1 indicates any of the following errors, go to EHM.EXIT.

1. CPR B PE (MSTAT1 <23>)
2. CPR A PE (MSTAT1 <22>)
3. ABus data PE (MSTAT1 <21>)
4. ABus command or mask PE (MSTAT1 <20>)
5. ABus address PE (MSTAT1 <19>)
6. CPU write PE (MSTAT1 <07:04>)
7. Cache data PE (read cache) (MSTAT1 <03>)
8. Cache byte write PE (MSTAT1 <00>)

If MSTAT2 indicates any of the following errors, go to EHM.EXIT.

1. ABus bad data code (MSTAT2 <14>)
2. NXM (MSTAT2 <03>)
3. CP I/O buffer error (MSTAT2 <02>)

If MDECC indicates any of the following errors, go to EHM.EXIT.

1. Bad data flag, BDF (MDECC <22>)
2. Double bit error, DBE (MDECC <20>)
3. Address parity error, APE (MDECC <19>)

At this point we check EHSR <07> to see if there has been an MBox interrupt. If not, there is no MBox error. Go to EHM.EXIT.

If there are none of the above errors, and there is an MBox interrupt, there must have been an SBE, so check MDECC <21> (SBE). If it is set, go to EHM.SBE.FIXUP (see note 46).

NOTE 45
If MDECC <21> is not set (doesn't indicate an SBE), then we must check to see if it could possibly be the unrecorded check bit parity error. This error can only occur on a CP or ABus refill, so MSTAT1 <29:26> is checked for these cycles. If not a refill cycle, go to EHM.EXIT.

If it was a refill cycle, then the error must have been an unrecorded SBE, so EHM will set MDECC <21>, SBE, and clear the syndrome bits, MDECC <14:09> (SBE on C0 = 0 syndrome).

NOTE 46
EHM.SBE.FIXUP -- When a single bit error occurs, the MBox fixes the data on the fly, delivers good data to the cache, and requests an interrupt. Because this is a refill cycle, the written bit won't be set, but it should be set for an SBE (the forced writeback will update and correct memory). Because of a timing bug, the written bit doesn't always get set, so we will ensure that it gets set by reading the data and writing it back.

MEAR (block address) and MSTAT1 <25:24> (longword in error) are used to generate the cache address, and the EBox will read cache, then write the data (from the EMD) back to cache at the same physical address. This will ensure that the written bit gets set.

At this time the MBox will be reset to clear the MBox error registers.

NOTE 47
EHSR (ESC DA) is rotated right 8 to get the bits into correct position and written into EHMSTS (ESC 18).

NOTE 48
The EHM post processor code will figure out what mode we are currently in and how to exit EHM.

NOTE 49
We have had a single bit error. If the IPL is above 1D, we shouldn't get an interrupt at this time; it should not be seen until the IPL drops below 1D. There is a bug that causes the MBox SBE interrupt to be seen with the IPL at 1F. If the IPL is at 1F, we'll just ignore the whole thing and continue with the next macroinstruction. If the IPL is below 1D, we will trap to VMS via the SCB vector, 54, for SBE interrupts.

NOTE 50
EHM.CS.CORRECTION is a routine called by EHM when it detects an FBox or IBox control store parity error. This routine will call the EHM.READ.CSES routine, which will read the control store correction data for the CSES register. When EHM.READ.CSES is finished, it will pass control back to EHM.CS.CORRECTION, which will immediately pass control back to the calling routine.

When an IBox or FBox control store parity error occurs, the box will hold the error information and report the error to the EBox via a microtrap. EHM will transfer the applicable control store (or DRAM) parity error bits to EHSR <31:27>. The setting of any of EHSR <31:27> will enable a priority encoder, which sends a 3-bit code to the console that indicates which control store has the parity error, and interrupts the console to initiate the control store correction sequence. Prior to passing control to the console, EHM will clear DONE (RBUFC <07>) so that the console may use the receive buffer to indicate that the correction is completed. This code will then call EHM.READ.CSES (see note 51).

NOTE 51
EHM.READ.CSES -- This routine is called by EHM to wait for the console to indicate that the control store error correction is completed, and to receive the CSPE correction information from the console.

When the console has completed the correction sequence, it will load the correction information into the receive buffer (RBUF), then set DONE (RBUFC <07>). EHM will then read the information from the CBus interface and load it into ESC 2D (CSES). EHM will then reset the control store parity error bits, EHSR <31:27>, clearing the console interrupt request. EHM will wait 16 EBox cycles to allow time for the console interrupt to go away. EHM will then clear DONE (RBUFC <07>) and return control to the calling routine.

NOTE 52
EHM.TST.PE -- EHM calls this routine to check a particular location in the GPR/scratch pad RAMs for parity errors. The routine will source both the A-side and B-side through AMUX and BMUX. If there is a parity error, the PDP MCA will latch whichever side got the error. In addition, any error will cause a microtrap to 8, but part of the setup for ESC parity testing was to set STATE <07> to allow for the trap to 8. Any trap to 8, with STATE <07> set, will allow control to be passed back to the routine generating the trap, to the microinstruction following the trapped microinstruction.

If there is a GPR/ESC parity error, a 100 is written to EBCS to clear EBox errors (writing a "1" to EBCS <08> clears bits <15>, <12:09>, and <04>).

If there is an error on one side only, it is corrected by copying from the good side to the bad side, and returning by way of the "no error" return. If both sides are bad, the error return is taken.

For error reporting, EHSR <02> is set for GPR B parity errors and EHSR <01> is set for GPR A parity errors. EDPSR <23:16> is loaded with the GPR/ESC parity error address.

NOTE 53
EHM.HIDDEN.OPBUS.PE -- Because the IBox compresses MDBus byte parity into OPBus longword parity before sending the data and parity to the EBox, and the EBox only checks and uses parity on the bytes that it uses, there is a possibility that the EBox will not detect bad parity if it is using an even number of bytes.

To illustrate the situation, let's look at two examples. In each of the examples, the EBox requests a longword that just happens to be all zeros, and the longword is not in cache, thus a refill is necessary.

Example A-1: OPBus no parity error

The MBox sends four bytes of data, all zeros, over the MDBus to the IBox along with four parity bits, logic ones, one for each byte of data. The IBox compresses the parity bits (XOR and invert, or XNOR) to provide OPBus longword odd parity, a logic one. (The XOR of four logic 1's provides a logic zero, which is inverted to provide OPBus longword parity.) The IBox sends this odd parity bit along with the four bytes of data. The EBox checks the parity, determines that it is odd (OK), and continues executing the instruction.

Example A-2: OPBus Parity Error

In this case, we assume that there is a double bit error on a read from the array. As a result of the double bit error, the MDBus data drivers are disabled at the MBox, thus the data to the IBox is all zeros. The MBox sends all zeros with all zeros byte parity. This combination should generate a parity error.

The IBox compresses the byte parity bits to provide OPBus longword odd parity. In this case, the IBox will also generate a logic one for OPBus odd longword parity (the XOR of four logic 0s provides a logic 0, which is inverted to provide the logic 1 for OPBus odd longword parity). The EBox checks parity, determines that it is odd (OK), and continues to execute the instruction. The CPU has just swallowed bad data.

This is a hole in the hardware detection network. By the time it was discovered, it was too late to implement a hardware solution. EHM was modified to check for this condition and take the appropriate action.

The following conditions are needed for this problem to have occurred.

1. EHM entered via trap 4 or 6 (see note 54).
2. One of the following three conditions exist:
   a. CP READ and ABus bad data code (see note 55).
   b. CP READ and cache data parity error and (bad data flag (BDF) or double bit error (DBE) or address parity error (APE)) (see note 56).
   c. CP refill and no byte write and error was on requested longword and (BDF or DBE or APE) (see note 57).

NOTE 54
EHSR <24> is set if there was a trap to 4, and EHSR <07> indicates an MBox interrupt.

NOTE 55
MSTAT1 <29:26> = E (cycle type = CP READ) and MSTAT2 <14> is set, which signifies ABus bad data code. This can only happen on a CP read of I/O space when the SBIA receives Read Data Substitute (RDS) on the SBI. The SBIA will send data length/status = 11 (bad data code) with the ABus data.
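
The parity compression in note 53 and Examples A-1 and A-2 can be checked with a few lines of code. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the EBox check is simplified to a whole-longword odd parity test rather than the per-byte usage the note mentions.

```python
def odd_parity_bit(byte):
    """Parity bit that gives a byte plus its parity bit an odd number of ones."""
    return 1 - (bin(byte & 0xFF).count("1") % 2)


def compress_parity(byte_parity_bits):
    """Model of the IBox step: XOR the byte parity bits, then invert (XNOR)."""
    x = 0
    for bit in byte_parity_bits:
        x ^= bit
    return x ^ 1


def longword_parity_ok(data_bytes, longword_parity):
    """Simplified EBox check: data plus parity bit must hold an odd number of ones."""
    ones = sum(bin(b & 0xFF).count("1") for b in data_bytes) + longword_parity
    return ones % 2 == 1


data = [0, 0, 0, 0]

# Example A-1: good byte parity (all ones for all-zero bytes) passes the check.
good = [odd_parity_bit(b) for b in data]
passes_good = longword_parity_ok(data, compress_parity(good))

# Example A-2: all four byte parity bits are wrong, yet the check still passes.
passes_all_bad = longword_parity_ok(data, compress_parity([0, 0, 0, 0]))

# A single wrong byte parity bit is caught, because XNOR flips on odd counts.
passes_one_bad = longword_parity_ok(data, compress_parity([0, 1, 1, 1]))
```

In this model an even number of bad byte parity bits cancels out in the XOR, which is why the all-zeros case in Example A-2 slips through while a single bad bit does not.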
NOTE 56 | MSTAT1 <29:26> = E reset (no ABus 57 MSTAT]1 <29:26> (BDF), <19> = data 9 (DBE), (cycle MSTAT2 and MSTAT1 type = CP or <18> (APE) The SBIA code) with | (cycle type = CP READ), bad data code), (DBE), or <18> (APE) is set, parity error) is set, NOTE (bad MSTAT2 <21> <03> READ), is set, (not a byte write), and MSTAT1 <25:24> (error is on the requested longword). = <14> (BDF), (cache data MSTAT1 <K21> MSTAT2 <15> = 0 MEAR <K03:02> NOTE 58 EHM.POSTPROC -- The EHM routine EHM.POSTPROC figures what mode we are currently pointer for the stack frame. CSM.STATUS 1is register, just loaded 1into in case CSM is in and sets up the CSM.T0, active. a is <19> CSM out stack temporary If we are on the interrupt stack (PSL <26> = 1) then ESC D8 (EHM.SP, an EHM temporary register) is loaded with the contents of SP. If we are not on the interrupt stack then ISP is transferred to ESC D8. ESC D8 will end up with the stack pointer that will be used by the machine check handler as a pointer for pushing the stack frame. If ESC CO (CSM.STATUS) <07:00> is occurred 'RETURN.A. while CSM was (See note 41). not active =zero, and the the error exit will be If ESC CO <07:00> is zero and EHSR <05> is set (VMS ENTERED), the error occurred while VMS was active, but after EHM had passed control to VMS, exit via RETURN.B. (See note 42). a double error, If neither of the above cases apply, it 1is the error (not a double error). EHM will pass control VMS machine check handler, (single - bit error), to or to VMS VMS instruction (SBE with IPL at 1F). NOTE 59 EHM.BUILD.MBOX.SF -- This via to execute SCB the first to the vector routine will read the stack Build MSTAT1 (ESC 25) by reading MBox registers 2. Bfllld MSTAT2 <15:00> r@glst@rfi 5C and 58. reading 23, 24, and 20. 26) by MBox frame, 1. - A-37 54 next macro (See note 43 or 49.) registers to build the MBox portion of ESC 25 throughESC 2B. (ESC so 2C, MBox ERROR HANDLING MICROCODE 3. Builild (EHM) MDECC registers FLOW CHART <23:00> 70, 60, 4. 
Build MERG <15:00> 18, and 14 5. Build CSHCTL register 6. If (ESC and (ESC 8. Extract read by NOTE 60 by (ESC reading 29) by MEDR (ESC the MBox 2B) page by PA <29:20>. DMAs are reading 54 Load = 1), MBox the at PAMM | enabled, they will 78. from MEAR the offset CONFIG be MEAR read MBox register PA <29:20>, (PAMM) 1load Otherwise reading MBox number, register DB). EHM.MBOX.RESET -~ This routine is used to error registers and clear all of the MBox register bits., If MBox reading MBox registers | (EHSR <26> from MEAR.SAV (ESC 7C and load MEAR. <20:16>. - 28) <07:00> MEAR has been saved Build by 04. (ESC 2A) register 7. 27) 50. and provided 1into MSTAT2 reset the MBox fault insertion disabled to prevent clearing any I/0 related errors. EHM will wait 16 after setting MBox register 18 <04> (disable DMA) proceeding further. cycles before MBox register 10 <01>, INV CHK EN, is cleared, as is register 14 <04:00>, the generate even parity bits. MBox Then CMD BAD PAR, MBox register 18 <03> is cleared. This bit is used to generate even ABus parity during an ABus command/address cycle. Finally, all MBox error registers are cleared (EBox microcode MCF). If DMAs must were enabled be upon re-enabled sequence. DMAs are entering prior to enabled by this routine, returning clearing to MBox <04>. NOTE 61 the they calling register 18 | IE.EHM.ENTRY -- This routine is not apart of the EHM, it is actually the first part of the exception handler microcode. It will transfer the current mode bits (PSL <25:24>) to the previous mode bits It will then load the VMQ from address The have is If of the handling resulting data to be used equal 01 check value and 11 is special meaning. handler, of is routine) they <23:22>). SCBB + and the new PC. 1If (PSL read Bits equal 00 4 (the virtual cache physical. <01> and the kernel <00> stack unless presently on the interrupt stack. use the interrupt stack. For the machine the interrupt stack is always used. 
A 10 indicates that user microcode is to be used, and 11 is invalid, halt. For the VAX 8600/8650, a value of 11 means going to the CSM console wait loop.

For PC <01:00> = 00 or 01, if ESC 2F (old PSL) <26> = 1, we are on the interrupt stack, so we will set PSL <26> (write WBus register 01). For PC <01:00> = 00 and not on the interrupt stack, ESC E0 (kernel stack pointer) will be transferred to the SP. For PC <01:00> = 01, and not on the interrupt stack, ESC E4 (ISP) will be transferred to the SP, and PSL <26> will be set.

For any of these cases (PC <01:00> = 00 or 01) the old PSL will be pushed on the stack, the old PC will be pushed on the stack, then the stack frame (ESC 17 through 2D) will be pushed on the stack. EBox microcode will then enable IBox references and execution will start at the new PC, the virtual address of the machine check handler.

NOTE 62
CSM.ERROR -- If CSM is active, the routine CSM.ERROR, a part of CSM microcode, is entered. The CSM entry code will be checked. If CSM was executing the FIND 64KB or FIND RPB routines when the error occurred, control will be passed back to the restart address for the routine being executed.

If neither of these routines is being executed, the code will check to see if XBUFC <07> (READY) is set. If not, microcode will loop waiting for the console to set it (signifying that the EBox microcode may use the transmit buffer). When XBUFC <07> is set, the EBox will write a 4 in XBUFC <05:00>, clear XBUFC <07> to indicate to the console that the EBox has written the XBUFC, then go to the CSM wait loop. The value of 4 in XBUFC <05:00> tells the console that there has been a hardware error while CSM was active.

NOTE 63
CSM.ENTRY.DE -- For a non-EBox double error condition, the routine CSM.ENTRY.DE, a part of console support microcode, will be executed. This routine will stop IBox references and load the following scratch pad registers.

1.
Load ESC C1 with the contents of the VMQ
2. Load ESC C2 <11:04> with the contents of the SC
3. Load ESC C2 <03:00> with the contents of the SPADR
4. Load ESC C3 with the PSL
5. Load ESC C4 with the contents of the EMD

Control will then be passed to the CSM wait loop.

APPENDIX B
KA86/KA865 CONTROL STORE AND DISPATCH RAM CORRECTION FLOWS

INTRODUCTION

This Appendix contains a brief description and a flow chart that describes how the Console handles interrupt requests from the VAX8600/8650 to correct Control Store (CS) and Dispatch RAM (DRAM) Parity Errors.

In summary, upon detection of a CS/DRAM Parity Error the Error Handling Microcode (EHM) will interrupt the Console. If the Console is in Program I/O (PIO) mode, it will attempt to correct the error and return control to the EHM. The EHM will build a Machine Check Stack Frame and call the VMS Machine Check Handler, which will append the Machine Check to the System Event File (ERRLOG.SYS).

If the correction attempt failed (or if the Console was called to correct an MBox Control Store Parity Error) the Console will print an error message on the terminal (See Chapter 4 Table 10) and declare a Keep Alive Fail Condition. This will result in the generation of a Snap File (SNAPn.DAT). See Appendix D.

NOTE
Although the Console will attempt to correct MBox Control Store RAM Parity Errors it will not attempt to restart the system. The reason is that the MBox has already acted on the contents of the Control Store word in error. Thus the state of the MBox is unknown and a restart is not feasible.

If the system is in Console I/O (CIO) mode, no correction will be attempted. Instead the Console will print a message on the terminal and return to the Console Prompt.

Looking at the hardware, nine of the ten CS/DRAMs in the VAX8600/8650 have parity checking networks.
(There is no parity protection on the MBox Access Violation RAM). Of the nine the following seven can be corrected on the fly via the Console.

o The EBox Control Store RAMs
o The IBox Control Store RAMs
o The IBox Dispatch RAMs
o The FBox Dispatch RAMs
o The FBox Adder Module Control Store RAMs
o The FBox Multiplier Module Control Store RAMs
o The MBox Control Store RAMs

When a CS/DRAM Parity Detection Network detects an error it will generate an error signal to the EBE module (print EBE5) where it is latched in EBCS. The signal then goes to a priority encoder (print EBE3) which will pass a 3-bit encoded signal (EBE CPU ERR <2:0> H). This signal goes to the Console (print CL09) where it is OR'ed to become CL09 CPU CS PE L and latched as CL02 BCPU CS PE L. CL02 BCPU CS PE L will cause a T-11 interrupt at priority 6, interrupt vector 110. In addition, the 3-bit encoded signal (EBE CPU ERR <2:0> H) is latched by the console in MCSR3 <2:0>.

Upon detecting the interrupt, the console will pass control to a software routine (MCPECR) which will load in the necessary overlay and perform the error recovery attempt. This code is outlined on the flow chart.

NOTE
If the console is in PIO mode and a CS/DRAM Parity Error is successfully corrected there will be NO error message indication on the console terminal. The only indication of the error will be in the System Event File. Therefore you must run ANALYZE/ERROR_LOG or SPEAR to gather the error data.
[Flow chart not reproduced: MCPECR, Console CS/DRAM correction routine. Legible block labels: PROCESS INTERRUPTS; RAM CORRECTION REQUEST; DISABLE EXTERNAL INTERRUPTS; SUSPEND KAF TIMER; CLEAR RBUF DONE; SET FL.CSP; INTERRUPT CODE = 0; PROGRAM I/O MODE; PRINT MESSAGE; STOP CPU CLK; RELOAD FDRAM; READ MCC SHIFT PATH; STEP CLK; GET BAD uADDRESS; ERROR READING ADDRESS; GET GOOD ECC FROM TABLE; GOOD ECC IN TABLE; GENERATE ECC FOR BAD uWORD; SYNDROME GREATER THAN uWORD SIZE; COMPLEMENT BAD BIT; CLEAR RETRY COUNT; WRITE CORRECTED WORD; LOAD CSDR; READ BACK uWORD; XOR GOOD WORD WITH WORD READ; INCREMENT RETRY COUNT; SET UNCORR BIT FOR CSES; SET RAM STATE INVALID BIT IN SCORE BOARD; RESTART CPU CLOCK. Message pointers: MBTERR, SYNZRO, SYNGTR, CSPERR, PCFAIL, UPCERR, MUNREC, NOECCD.]
VAX 8600 Console CS/DRAM (ECC) Correction Flow (Part 1 of 2)

[Flow chart not reproduced. Legible block labels: GENERATE CSPE RST LST CYC VIA SDB; FL.CSP = 0; CLEAR KAF TIMER; ENABLE EXTERNAL INTERRUPTS; RESUME KAF TIMER; DISMISS INTERRUPT.]
VAX 8600 Console CS/DRAM (ECC) Correction Flow (Part 2 of 2)

NOTE 1
The T-11 Console processes interrupts at Instruction Register Decode (IRD) time. If the VAX8600/8650 has posted an interrupt request to correct a Control Store or Dispatch RAM parity error it will be processed at this time.

NOTE 2
This is the beginning of the Console Control Store/Dispatch RAM (CS/DRAM) correction routine (MCPECR).
Parts of the routine are resident and other parts are overlays that are read from the Console Load device as needed.

NOTE
The names of the subroutines that are called by this routine are labeled on the flow chart. For example "I.CSPE:" is the name of the subroutine that disables external interrupts and suspends the KAF Timer.

The Console disables external interrupts (thus giving the ECC correction request the highest external priority). The Console also suspends the Keep Alive Fail (KAF) timer. This prevents a KAF time out from occurring while the Console is in the process of correcting a CS/DRAM parity error.

NOTE 3
The Console assures that RBUF "DONE" is cleared. Later RBUF "DONE" will be set to notify the VAX8600/8650 (Error Handling Microcode) that the Console has completed the correction request. The Console also sets a software status flag (FL.CSP). This flag will remain set until the Console has successfully corrected the parity error.

NOTE 4
The Console checks to determine if the Interrupt Code is zero, which is invalid. The only valid interrupt codes are 1 through 7. If the interrupt code is zero the console moves the value "INTERR" into the Message Pointer. Later this pointer will be used as an index to print the following message:

ERC-E-INTERR CSPE interrupt, code invalid (0)
(see Chapter 4 Table 10)

NOTE 5
The Console checks to determine if it is in Console I/O Mode. If it is, it will not attempt correction. Instead it will move the value "CSPERR" into the Message Pointer. Later this pointer will be used as an index to print the following message:

DCN-E-CSPERR "ram id" Control Store Parity Error
(see Chapter 4 Table 10)

(Where the "ram id" will correspond to the RAM that had the parity error.)

NOTE 6
The Console uses the Message Pointer as an index and thus identifies and prints the appropriate Error Message. Refer to the section on Console Messages for a detailed description of the message.
NOTE 7
The Console increments the count in the Microcode Scoreboard. This information is then displayed via the Console "SHOW UCODE" command.

NOTE 8
The Console stops the clock at T3 in order to correct FBox or MBox Control Store Parity Errors. The clock is stopped for FBox CS parity errors because the FBox uses the clocks differently than the rest of the system. The clock is stopped for MBox CS parity errors because the MCC Shift Path must be used to read the CS Address and Data. The other CS/DRAM parity errors can be corrected with the clock running.

NOTE 9
The FBox does not latch the micro-address and data for FBox DRAM parity errors. Therefore, the Console must reload the entire RAM. Note that the Console does not check to assure that the FDRAM loaded properly. Multiple FBox errors will eventually result in the VMS Machine Check Handler turning off the FBox.

NOTE 10
The Console reads the MCC Shift Path via the SDB. The Shift Path returns the Micro-address and bad data.

NOTE 11
The Console steps the clock to the correct state for correcting FBA or FBM Control Store parity errors.

NOTE 12
The Console uses the SDB to read the Micro-address associated with the RAM error. If the Console encounters an error while reading the Micro-address it sets the UPCERR status flag. Eventually this will result in a non correctable error.

NOTE 13
The Console uses the SDB to read the Microword associated with the RAM error.

NOTE 14
The Console has a set of ECC tables for each location in each loadable RAM File. The Console uses the Micro-address it read via the SDB to look up the ECC character. If the Console is unable to find the ECC character it sets the NOECCD status flag. Eventually this will result in a non correctable error.
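The data gathering in Notes 12 through 14 can be sketched as follows. This is only an illustration under stated assumptions, not the Console's T-11 code: the callables `read_micro_address` and `read_microword`, the dictionary-shaped `ecc_table`, and the use of `IOError` to model an SDB read failure are all hypothetical. Only the flag names UPCERR and NOECCD come from the notes.

```python
from typing import Callable, Dict, Optional, Set, Tuple


def gather_correction_inputs(
    read_micro_address: Callable[[], int],  # hypothetical SDB read of the micro-address
    read_microword: Callable[[], int],      # hypothetical SDB read of the bad microword
    ecc_table: Dict[int, int],              # hypothetical per-RAM table: address -> ECC
) -> Tuple[Optional[int], Optional[int], Optional[int], Set[str]]:
    """Return (address, bad_word, good_ecc, status_flags) as in Notes 12-14."""
    flags: Set[str] = set()
    try:
        address = read_micro_address()      # Note 12
    except IOError:
        flags.add("UPCERR")                 # could not read the micro-address
        return None, None, None, flags
    bad_word = read_microword()             # Note 13
    good_ecc = ecc_table.get(address)       # Note 14
    if good_ecc is None:
        flags.add("NOECCD")                 # no ECC character for this location
    return address, bad_word, good_ecc, flags
```

Either flag marks the error as non correctable, so a caller would skip the syndrome step whenever the returned flag set is not empty.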
NOTE 15
After looking up the correct ECC character in the appropriate table the console generates an ECC character for the Microword it read via the SDB. It then does an exclusive "or" of the two ECC characters. The result is the syndrome that will be used to correct the Microword.

NOTE 16
If the syndrome equals zero the console sets the SYNZRO status flag. This is a recoverable error condition.

NOTE 17
If the syndrome indicates that a multiple bit parity error was detected the Console will set the MBTERR status flag. Eventually this will result in a non correctable error.

NOTE 18
If the syndrome is greater than the RAM size the Console will set the SYNGTR status flag. For example, if the syndrome indicates that IDRAM bit 22 is in error this flag will be set because the IDRAM is a 20 bit Microword. Eventually setting this flag will result in a non correctable error.

NOTE 19
The Console complements the bad bit in the Microword, clears the retry count, and writes the corrected word to the RAM (via the SDB).

NOTE 20
If the Console was called to correct an EBox Control Store Parity Error it loads the corrected Microword to the EBox Control Store Data Register (CSDR) via the SDB. This will clear the EBox Control Store Parity Error condition.

NOTE 21
The Console reads back the word it just wrote and compares it. If the write was successful the Console clears the CSPERR Error Status flag.

NOTE 22
If the Console was unable to write the Microword correctly the first time it will try 4 more times. If all attempts fail the Console will set the PCFAIL Status Flag. Eventually setting this flag will result in a non correctable error.

NOTE 23
With the exception of MBox Control Store Parity Errors, all non recoverable error conditions converge at this point. The Console sets the UNCORRECTABLE Error Status Flag and then sets the appropriate "RAM STATE INVALID" bit in the Microcode Scoreboard.
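The syndrome test and the write-and-verify retry described in Notes 15 through 22 can be sketched as below. This is a minimal illustration, not the Console's implementation. It assumes a Hamming-style code in which a non-zero syndrome is the 1-based number of the bit in error; the manual does not state the encoding, and it does not say how a multiple bit error (MBTERR) shows up in the syndrome, so that case is left out. The function names and the `write_word`/`read_back` callables are hypothetical.

```python
from typing import Callable, Optional, Tuple


def classify_syndrome(table_ecc: int, generated_ecc: int,
                      word_bits: int) -> Tuple[str, Optional[int]]:
    """XOR the two ECC characters (Note 15) and classify the result."""
    syndrome = table_ecc ^ generated_ecc
    if syndrome == 0:
        return "SYNZRO", None            # Note 16
    if syndrome > word_bits:
        return "SYNGTR", None            # Note 18: points past the microword
    return "OK", syndrome                # assumed: 1-based bit number in error


def complement_bad_bit(bad_word: int, bit_number: int) -> int:
    """Flip the bit named by the syndrome (Note 19), 1-based by assumption."""
    return bad_word ^ (1 << (bit_number - 1))


def write_with_retry(write_word: Callable[[int], None],
                     read_back: Callable[[], int],
                     corrected: int,
                     max_attempts: int = 5) -> Tuple[bool, int]:
    """Write, read back and compare; one first try plus 4 more (Notes 21-22)."""
    for attempt in range(1, max_attempts + 1):
        write_word(corrected)
        if read_back() == corrected:
            return True, attempt
    return False, max_attempts           # caller would set PCFAIL
```

With a 20-bit IDRAM word, `classify_syndrome(0, 22, 20)` yields `SYNGTR`, which matches the example in Note 18.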
NOTE 24
MBox Control Store Parity Errors are considered non correctable because they may leave the MBox in an unpredictable state. The Console sets the MUNREC status flag and increments the MBox CS parity error count in the Scoreboard.

NOTE 25
The Console builds a status word and sends it directly to CSES (via the SDB). [...]

NOTE 26
If the error is a recoverable error, the Console clears the Control Store Parity Error (by setting "CSBR CLR ERR SERV REQ") and restarts the CPU clock. [...]

NOTE 27
The Console generates "EBDA CSPE RST LAS CYC" which starts the EBox Microcode at Vector 8. The Console then enters a timeout loop waiting for the EBox Error Handling Microcode (EHM) to clear RBUF "DONE". The Console does not proceed until either the EHM clears RBUF "DONE" or the timeout count expires. The EHM is in a loop waiting for the Console; setting RBUF "DONE" will release the loop. [...]

NOTE 28
At this point the Console checks the status flags to see if any are set. If the Control Store Parity Error Flag (FL.CSP) is not set the Console goes directly to the exit routine. [...]

NOTE 29
This is a common error exit. The Console:

1. Sets the CSPERR flag.

2. Prints the following error message:

DCN-E-CSPERR "ram id" Control Store Parity Error
(see Chapter 4 Table 2)

3. Prints one of the following messages that explains why the parity error was non correctable (see Chapter 4 Table 10).

ECR-E-MBTERR "ram id" Multiple bit Error, UNCORRECTABLE
ECR-E-MUNREC MCS MBox Not Recoverable
ECR-E-NOECCD "ram id" No ECC Data in Table
ECR-E-PCFAIL "ram id" Correction Attempt Failed
ECR-E-SYNGTR "ram id" Syndrome > RAM Size, UNCORRECTABLE
ECR-E-SYNZRO "ram id" Syndrome = 0, UNCORRECTABLE
ECR-E-UPCERR "ram id" Can't Read Box Address

4. Prints the "GOOD" Data, "BAD" Data, and the status word sent to CSES.

NOTE 30
If the Console was unable to read the micro-address associated with the parity error it will print the exclusive "or" of the Good and Bad Data.

NOTE 31
This is a common exit point. If the parity error was corrected the console will clear (reset) the Keep Alive Timer and enable the external interrupt mechanism. If the Console was unable to correct the error then it will leave the external interrupts disabled. Eventually this will result in a Keep Alive Fail.

NOTE 32
Finally, the Console enables the Keep Alive Fail Timer and dismisses the interrupt. If the error was correctable no error message will have been printed and the EHM will continue to build the Machine Check Stack Frame and report the error to VMS. If the Console was unable to correct the error then a Keep Alive Fail will occur and the Console will Snap Shot the state of the system.

APPENDIX C
VMS MACHINE CHECK HANDLER (VMSMCK) FLOWS

This Appendix contains a set of flow charts for the VAX 8600/8650. The flow charts describe how the VMS Machine Check Handler (MCK) evaluates each error condition and determines what action to take. The System Control Block (SCB) Entry Vectors involved are listed below. The System Control Block Table preceding the flow charts is included for reference purposes. It lists the Entry Vectors for the entire VAX 8600/8650 System Control Block.
Entry Vector   Name                       Type          Code
004            Machine Check              Fault/Abort   1
054            Array Single Bit Error     Interrupt     1
058            SBI 0 Alert                Interrupt     1
05C            SBI 0 Fault                Interrupt     1
060            SBIA 0 Error               Interrupt     1

System Control Block for the VAX8600/8650

Entry Vector   Name                                                   Type          Code
000            Unused                                                               3
004            Machine Check                                          Fault/Abort   1
008            Kernel Stack Not Valid                                 Abort         1
00C            CPU Power Fail                                         Interrupt     1
010            Reserved DEC Opcodes and Privileged Instructions       Fault         0
014            Customer Reserved Opcodes                              Fault         0
018            Reserved Operands                                      Fault/Abort   0
01C            Reserved Addressing Modes                              Fault         0
020            Access Control Violation                               Fault         0
024            Translation Not Valid                                  Fault         0
028            Trace Pending                                          Fault         0
02C            Breakpoint                                             Fault         0
030            Compatibility Mode                                     Fault/Abort   0
034            Arithmetic                                             Trap/Fault    0
038-03C        Unused                                                               3
040            CHMK Opcode                                            Trap          0
044            CHME Opcode                                            Trap          0
048            CHMS Opcode                                            Trap          0
04C            CHMU Opcode                                            Trap          0
050            SBI 0 Silo Compare                                     Interrupt     1
054            Array Single Bit Error                                 Interrupt     1
058            SBI 0 Alert                                            Interrupt     1
05C            SBI 0 Fault                                            Interrupt     1
060            SBIA 0 Error                                           Interrupt     1
064            SBI 0 Fail                                             Interrupt     1
068-080        Unused                                                               3
084            Software Request 01                                    Interrupt     1
088            Software Request 02 or AST Delivery                    Interrupt     0
08C            Software Request 03                                    Interrupt     0
090            Software Request 04                                    Interrupt     1
094            Software Request 05                                    Interrupt     1
098            Software Request 06                                    Interrupt     1
09C            Software Request 07                                    Interrupt     1
0A0            Software Request 08                                    Interrupt     1
0A4            Software Request 09                                    Interrupt     1
0A8            Software Request 0A                                    Interrupt     1
0AC            Software Request 0B                                    Interrupt     1
0B0            Software Request 0C                                    Interrupt     1
0B4            Software Request 0D                                    Interrupt     1
0B8            Software Request 0E                                    Interrupt     1
0BC            Software Request 0F                                    Interrupt     1
0C0            Interval Timer                                         Interrupt     1
0C4-0EC        Unused                                                               3
0F0            Console Block Storage                                  Interrupt     1
0F4            Unused                                                               3
0F8            Console Terminal Receive                               Interrupt     1
0FC            Console Terminal Transmit                              Interrupt     1
100-13C        SBI 0 Req 4/Unibus BR 4                                Interrupt     1
140-17C        SBI 0 Req 5/Unibus BR 5                                Interrupt     1
180-1BC        SBI 0 Req 6/Unibus BR 6                                Interrupt     1
1C0-1FC        SBI 0 Req 7/Unibus BR 7                                Interrupt     1

System Control Block for the VAX 8600/8650 (cont.)

Entry Vector   Name                       Type          Code
200-24C        Unused                                   3
250            SBI 1 Silo Compare         Interrupt     1
254            SBI 1 Fail                 Interrupt     1
258            SBI 1 Alert                Interrupt     1
25C            SBI 1 Fault                Interrupt     1
260            SBIA 1 Error               Interrupt     1
264-2FC        Unused                                   3
300-33C        SBI 1 Req 4/Unibus BR 4    Interrupt     1
340-37C        SBI 1 Req 5/Unibus BR 5    Interrupt     1
380-3BC        SBI 1 Req 6/Unibus BR 6    Interrupt     1
3C0-3FC        SBI 1 Req 7/Unibus BR 7    Interrupt     1
400-5FC        Unused                                   3
600-7FC        Unused                                   3

Code   Description
0      Service on Kernel Stack (or Interrupt Stack if on Interrupt Stack)
1      Service on Interrupt Stack
2      Service via WCS (Halt if there is no WCS)
3      Reserved

* Note: The Interrupt Vectors for SBI-1 are offset from the SBI-0 Interrupt Vectors by a page (200 words). The offset is added by the EBox Interrupt Handling Microcode after it examines CSLINT <22:21> and determines which Adapter needs service.

FLOW CHART CODING DESCRIPTION

The following flowcharts contain terminal points, decision points, modified decision blocks, and statement blocks. The terminal points at the top of a flowchart contain the macrocode name designating the entry point for that routine.
The terminal points within a routine are subroutine entry points where some action takes place and then the routine resumes processing through the flowchart from that point. Terminal points at the end of a routine (or decision point) are exit points. Rectangular blocks within the routines are statement points.

[Legend of flow chart symbols, shapes not reproduced:]
Entry point, macrocode routine name.
Statement point
Subroutine entry point. Indicates return here unless stated otherwise.
Decision point
Exit point

CAUTION
These flowcharts apply to VMS version 4.7 and are subject to change.

VMS Machine Check Handler Flow Chart (Part 1 of 50)

EXE$MCHK

All machine checks are vectored to this entry point. As soon as the machine check handler is invoked, it sets up the stack as follows:

(SP): AP points to the beginning of the machine check log on the stack. Two longwords are immediately pushed on top of the machine check log, and are referenced as negative offsets from AP. These two longwords are input arguments to a routine that is called to check for a user-declared machine check recovery block. [...] the exception PC/PSL to be on top of the machine check log on the stack.
[Flow chart not reproduced: EXE$MCHK. Legible boxes: (EBCS<4:1>); PROCESS ABORT BIT IN EHMSTS (EHMSTA<17> = 1); SET PROCESS ABORT BIT IN ABORT_BITS; MBOX FE (EBCS<15>=1); LOG MCHK_CODE INTO STACK FRAME; FBOX SERVICE REQUEST; FIRST MATCH.]
VMS Machine Check Handler Flow Chart (Part 2 of 50)

[Flow chart not reproduced. Legible boxes: ANY BIT SET IN EBCS (EBCS<12:9>); FIRST MATCH; LOG MCHK_CODE INTO STACK FRAME; IBOX ERROR (EBCS<13>=1); CALL APPROPRIATE SERVICE ROUTINE; BAD_MCHK.]
VMS Machine Check Handler Flow Chart (Part 3 of 50)

[Flow chart not reproduced. Legible boxes: CALL APPROPRIATE SERVICE ROUTINE; GET PC-PSL POINTER AND MASK IN R1, R2; EXE$MCHK_TEST; R0<0> = 1; CHECK_MBOX_1D; 2 MBOX FE IN 20 MS; BUGCHECK_POP.]
VMS Machine Check Handler Flow Chart (Part 4 of 50)

[Flow chart not reproduced: CHECK_MBOX_1D. Legible boxes: MBOX FE (EBCS<15>=1); TB PROBLEM (MSTAT1<11:8>); MBOX INTERRUPT (EBCS<14>=1); MBOX1D SET IN B_MCHK_CODE; MCHK_EXIT; ANYTHING LOGGED IN MCHK_CODE; BAD_MCHK; SET MBOX1D IN B_MCHK_CODE; LOG_MCHK; MBOX_1D_SERV; ANY ABORT BITS SET IN ABORT_BITS; REFLECT_MCHK; POP R6-R0, AP OFF STACK; GET RID OF RECOVERY BLOCK AND STACK FRAME; CLEAR "VMS ENTERED" FLAG (EHSR<5> = 0).]
VMS Machine Check Handler Flow Chart (Part 5 of 50)

REFLECT_MCHK: This code is entered if the machine check was fatal. It determines if it was just fatal to the process which caused it [...]. If current process is in USER or SUPER mode, set up an exception on user's stack and REI to it. If current process is in EXEC or KERNEL mode and above ASTDEL or on ISP issue a fatal bugcheck.
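The two cases spelled out for REFLECT_MCHK can be sketched as a small decision function. This is an illustrative sketch only, not VMS source. It assumes the standard VAX PSL layout (current mode in <25:24>, interrupt stack flag in <26>, IPL in <20:16>) and the usual VMS value of 2 for ASTDEL; the function name and the returned strings are hypothetical. The remaining case, EXEC or KERNEL mode at or below ASTDEL on a process stack, is not legible in the text above, so the sketch does not guess at it.

```python
KERNEL, EXEC, SUPER, USER = 0, 1, 2, 3   # VAX access mode encodings
IPL_ASTDEL = 2                           # assumed VMS AST delivery IPL


def reflect_mchk_action(exception_psl: int) -> str:
    """Classify a fatal machine check from the PSL saved with the exception."""
    mode = (exception_psl >> 24) & 0x3               # PSL <25:24>
    on_interrupt_stack = bool((exception_psl >> 26) & 1)  # PSL <26>
    ipl = (exception_psl >> 16) & 0x1F               # PSL <20:16>
    if mode in (USER, SUPER):
        # Fatal to the process only: build an exception and REI to it.
        return "reflect exception"
    if on_interrupt_stack or ipl > IPL_ASTDEL:
        # EXEC or KERNEL mode, above ASTDEL or on the interrupt stack.
        return "fatal bugcheck"
    return "not specified"                           # case not legible in the text
```

For example, a PSL with current mode USER yields "reflect exception", while a KERNEL mode PSL with the interrupt stack bit set yields "fatal bugcheck".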
STACK CONTENTS
00(SP): saved R0,R1,R2,R3,R4,R5,R6,AP
20(SP): 2 longword inputs for recovery block check
28(SP): (also AP) machine check log [...]

[Flow chart not reproduced: REFLECT_MCHK. Legible boxes: USER OR SUPER ACCESS MODE; GET KERNEL STACK POINTER; ON INTERRUPT STACK / ABOVE ASTDEL; SET KERNEL STACK POINTER; PUSH PC,PSL; POP R6-R0,AP; POP INPUTS FOR RECOVERY BLOCK; POP MCHK LOG OFF STACK FRAME; ZERO EXCEPTION PSL, EXCEPT FOR CURRENT ACCESS MODE; CLEAR "VMS ENTERED" FLAG (EHSR<5> = 0); REI. Legible annotation: Set up an exception stack for current [...] altered to look as if an exception has occurred.]
VMS Machine Check Handler Flow Chart (Part 6 of 50)

[Flow chart not reproduced: BUGCHECK_POP, BUGCHECK_NOPOP. Legible boxes: CLEAR "VMS ENTERED" FLAG (EHSR<5>); EXE$MCHK_BUGCHK; BUG_CHECK.]

Legible annotations: Clean off the stack. A fatal bugcheck is certain unless a user has declared a machine check recovery block. Go check for a recovery block. If routine returns here, then no recovery block was found and a FATAL BUGCHECK is [...].
VMS Machine Check Handler Flow Chart (Part 7 of 50)

IBOX_SERV

[Flow chart not reproduced. Legible boxes: (IBESR<26,25,23>=1); MBOX INTERRUPT PENDING (EBCS<14>=1); RLOG PE (IBESR<24>=1); INCREMENT IBOX ERROR; 3 ERRORS IN 20 MS; SAVE TIME. The descriptive text for this routine is illegible in the source.]
VMS Machine Check Handler Flow Chart (Part 8 of 50)

MBOX_FE_SERV

An MBOX fatal error (FE set) has been detected.
The nine MBOX FATAL ERRORS are:

- CP CACHE TAG PE with W = 1
- CPU NXM
- CP IO BUF error
- CPR PE
- CP WRITE PE to MBOX register
- ABUS DATA PE on DMA write data
- ABUS CONTROL PE on DMA write data
- ABUS DATA PE on CP IO read data
- ABUS CONTROL PE on CP IO read data

Error handling depends on the exact error which occurred.

There are reasons for MULTIPLE ERROR being set that can be handled. If a CP WRITE cycle to an ABUS adapter (or external memory) encounters a CP WRITE PARITY ERROR then MULTIPLE ERROR will be set. This is because when CP_IO_BUF sets MBOX FATAL ERROR, the MBOX status contains the CP WRITE PE error. This causes MULTIPLE ERROR to set. CP_IO_BUF will not be set.

R3 = Cycle type. (MSTAT1 <29:26>)

[Flow chart not reproduced: MBOX_FE_SERV. Legible boxes: INCREMENT ERROR COUNT; GET MEMORY BOARD WITH ERROR; IS THIS VALID SLOT; SAVE MEMORY TYPE AND SLOT NUMBER; MULTIPLE ERROR BIT SET (MSTAT2<7> = 1); CACHE_TAG_W_SET; CP WRITE CYCLE TYPE; VALID IO ADDRESS (MEAR<29> = 1); WRITE DATA PE (MSTAT1 <7:4> = F); CLEAR WRITE ABORT IN ABORT_BITS; BUGCHECK_POP; CP_WRITE_PE.]
VMS Machine Check Handler Flow Chart (Part 9 of 50)

CACHE_TAG_W_SET

[The descriptive text for this routine is illegible in the source.]

R3 = CYCLE TYPE (MSTAT1 <29:26>)

[Flow chart not reproduced. Legible boxes: CACHE W BIT SET (MSTAT2<4>=1); CACHE TAG PE (MSTAT2<5>=1); (MSTAT2<3>=1); CP_IO_BUF.]
VMS Machine Check Handler Flow Chart (Part 10 of 50)

[Flow chart (Part 11 of 50) not reproduced; the page is illegible in the source.]

CPR_PE

[...] (MCF) [...] request type. The MCF can be viewed as an "OPCODE" which the MBOX executes to address the MBOX Cycle Parameter RAMs (CPR). CPR parity error is an MBOX FATAL ERROR because when one occurs on a CPU request an unpredictable operation takes place in the MBOX.

NOTE: DMA does not use the CPRs although it is possible for whatever CPR bits are being read out during a DMA microroutine to have bad parity resulting in an MBOX FATAL ERROR trap.
fi IYONoIHWIIHANNQDONI AHOW J¥A3V4YiSnHgO$USY33H9A071v fi [ {JAsVyY3S18V1I983SY3)1VLiS NTYNDISHIOWTMH83HAOVi8SHWYW3N00S0 [ UVAY 3BV S3A / TJGONH IavH vNIigs OW 31VLS 135 vigs30007 31ViS 83HOW 139ISVEviasS$S3HA Y —dOd¥HOW907 * LZLO-LBO0HM 1NOLLON3yOLAYLSNIY13H ~ OVSWA..N~I-QHSIHH3IIINT Hv310 1NI38AH"LLI38H-1TTSddi LE 4dOd0 N'O9HV-L0SHId¥NVVHS 09 135S8dnIS AMLIY ]IACTEN4SS3TELRI00NS (VMSMCK) <Q36IZN>JW0nTsIyN3I)= S3A / LNO3WILd)ONY 139 HOHY3 0Y3Z HOUHI 907 7 SKWA auTydoeW }J3Y ONV 854 Te EVARLIENOD SHILIN [ ON3ddV ABYIWAINvSigSOL J Z ITVILING w m .nwm z*fimo 33ZZIIVVIILLIINNGGAINNNNOODD ~ 40 S3IH13Y 4 /=<81>WNSHY3{1 ~09vi0Les T3AVSSHIL IO A81d35vHHOWivni8wSnsHO1H3O83NIi 1 H344N8 @D yDA9TPUBHmord3aey)3aed)1T30(0S "f/w fi HANDLER >TWInTNYsAuSy33)IH=A<O8Yt {1 | e fl 8118i YH135I8OSWwNI@m300w 13S CHECK (1=<1E>WNSuYI) 1MN1ID1¥V4v301AHOW3W84V0S81Y8S7Li8 3x3L1S3LTMHIW dWnr 0L INIHOVJIFNdAALIIHD AHINI Lnd YHOW O4NI MACHINE (3L1=H<MtzO>NWVnIsuNEDa) 33<D90Ov£VHH>)14dOONVIH{GLLNH=IO<d¥1>Sd) S3A N1IHVOGLYHO5A1Y8S138STM |ANVHIHL10 S3A {A1H=I<A0O>D0I8H)%3078 =S<N9E1I>NuaSg$vSo3yH)A(1Y 4Vv3I1D Ol Jv3H OWLNOdHYS'J3IMdSIY1OISVNdLNSHIF3LWIYNVIZOHAd v 507Vi8S A i/\IWzIo1oBa0wHodsnayhm #xmbzivJYTEDI.mlmHA =D OLNIm&nm _{ w)«‘mfl—Tedt ~ d{Dt=0<1ZS>NzB1vHiOSHWY)I AagvHIAWXNQD3YOA1SVIA 11831T85M H1L885OLONVW3S<N3i§0SH0 A4 SysunosSNy10}UJ04IBVmOuUCmHIeNI“SIU!m$5Q39E€J01W()!/]'30hds [lfs3ia\=<14HzN0d8D>yY0w/SW¥1YnI83s4—NyIu3a) ‘ ¥I3H)HO4 VMS FLOWS oy H o ’ .«MMK a0l o ey s VMS MACHINE CHECK HANDLER (VMSMCK) CPR_PE \MwnmmwmeuCPUmntwwfwmmfiuMmmmleFw(MCflHmthmnumdlw request type. The MCF can be viewed as an “OPCODE" which the MBOX executes to address the MBOX, Cycle Parameter RAMs (CPR). CPR parity error is an MBOX FATAL ERROR because when one occurs on a CPU request an unpredictable operation takes place in the MBOX. NOTE: DMA does not use the CPRs although it is possible for whatever CPR bits are being read out during a DMA microroutine to have bad parity resulting in an MBOX FATAL ERROR trap. 
This type of CPR parity error is recoverable. R3 = CYCLE TYPE (MSTAT1 <29:26>) Co= 1 (MSTAT2<23> "\ No CPR PE OR<22>=1) ABUSCYCLE TYPE IN R3 \NO / CLEAR ABORT BIT = ) MRI08 70744 VMS Machine Check Handler Flow Chart C-15 (Part 12 of 50) FLOWS VMS MACHINE CHECK HANDLER (VMSMCK) FLOWS MBOX__REGISTER_WRITE routine looks for an error writing sn MBOX register. Recovery s not possible because MBOX operations are no This longer predictable at this point. ( REGISTER MBOX_. ) —~ MBOX REGISTER '\ NO WRITE CYCLE TYPE IN R3 / ( ABUS__PE ) ; ( aUGCHECK__.mP) MRI1087-0746 VMS Machine Check Handler Flow Chart C-16 (Part 13 of 50) VMS MACHINE CHECK HANDLER (VMSMCK) FLOWS ABUS__PE Abus parity errors detected by MBOX FATAL ERR come from two sources. DMA WRITE DATA and CP READ DATA from an 1/0 address. Parity errors during the Command/Adress cycle vector the MBOX micro code to a loop that will cause KAF. Anerrorona C/A that is stalled because the target array is busy can cause an intended CONTROL STORE PARITY ERROR. If the stall continues beyond one cycle, the micro stack becomes corrupt. The KAF still occurs but the indications may not be ABUS. Recovery is only possible from ABUS PE that occur for ABUS DATA cycles or CP read requests; i.e., a CP read of an 1/0 operation, R3 = MSAT1 <29:26> 1 (MSTAT1<21> \No ABUS PE OR <20>=1) NO / ABUS CYCLE \ TYPE IN R3 CPREADCYCLE TYPE IN R3 LOOK__FOR__ WRITEBACK /HAVE SBIA STATE YES , ) IN B_MCHK_CODE \NO . SET SBIA STATE IN B__MCHK__CODE A CLEAR /0 READ ABORT IN ABORT__BITS § SAVE I0A NUMBER IN RO ( IS THIS SBIA )' { Y ( BUG__CHECK ) ‘ GET SBIA BASE ADDRESS GO SAVE SBIA STATE AND RSB BACK | " SIGNAL SBIA ERROR SUMMARY IN B_MCHK__CODE ‘ ( BUGCHECK__POP ) MRIOATOTAS VMS Machine Check Handler Flow Chart C-17 (Part 14 of 50) VMS MACHINE CHECK HANDLER (VMSMCK) FLOWS LOOK_FOR_WRITEBACK At this point MBOX FE is set but a valid reason (error bit set) has not been found. 
For CACHE TAG PARITY ERRORS with the W bit set during WRITEBACK, the CACHE TAG PARITY ERROR bit (MSATA1 <6>) may not be set. Check for a WRITEBACK cycle and the W bit set. R3 = CYCLE TYPE LOOK__FOR__ WRITEBACK B b NO UNKNOWN WRITEBACK \ CYCLE TYPE > INR3 This is an unknown error type, log the error and issue a FATAL BUGCHECK. NO /~ \ WBIT SET (MSTAT2<4>=1) (oo ) iy, ( BUGCHECK_..POP) | GOTO BAD_MCHK MRI1OB7-0747 VMS Machine Check Handler Flow Chart (Part 15 of 50) C-18 5 M VMS MACHINE CHECK HANDLER (VMSMCK) FLOWS | u . ) mmmmam,mwmmmmumsmmmmm I GET MBOX DEST PORT CODE (MSTAT1<31:30>) o, S oreom ) ( esoxont }ves - | LOOK__FOR_SBIA | | REGISTER it wasn't the OP port or the EBOX ] Since port, it must be the IBUF port. We need TOR! o~ to save the last IBUF virtusl sddress » in R1 by the MBOX { MOVE SPT 1 ISITSYSTEM ves/ 'y [m.mmm IN R1 J > (R1<31>=1) "\ | — TO RO Y%/ IS IT PO SPACE | GET 10A NUMBER (PA<28:25>) (R1<30>=0) \ ( IS THIS ‘ 1 ) ‘ \ ( BUG_CHECK ) FOR ERROR LOG GET 10A NUMBER (MSTAT1<17:16>) [cer 1 | L =) MRIOSTO768 VMS Machine Check Handler Flow Chart (Part 16 of 50) C-19 VMS MACHINE CHECK HANDLER (VMSMCK) FLOWS LOOK_FOR__SBIA For CP__IO__BUF errors when the requesting port is EBOX the SBIA summary registers are polled ERROR LOCK. This is required looking for CPU BUF becau the EBOX does PHYSICAL and VIRTUAL reads of 10 space. It determine if the reference was is not possible to either PHYSICAL or VIRTUAL. ‘ LOOK_FOR__SBIA ISTHIS AN SBIA , ~ \NO / PR y ' GET VA OF 10 ADAPTER 4 ESR<23>=1) i J (MSTAT1<17:16>) ~ . i GET SBIA BASE ADDRESS i = ) MR1087-0758 VMS Machine Check Handler Flow Chart (Part 17 of 50) . 
SAVE_SBIA_REGISTERS

Flow chart boxes, in reading order: PUSH R0-R6 ON STACK FRAME; CLEAR "VMS ENTERED" FLAG IN EHSR; INITIALIZE REGISTER ARRAY; LOOP THROUGH ALL 18; GET SBIA BASE ADDRESS; VALID SYSTEM VA?; STORE SBI ADAPTER CONFIG REG IN ARRAY; SET "VMS ENTERED" FLAG IN EHSR; IS IT SBIA?; DOES SBIA BASE ADDRESS = ABUS VA?; INDEX IN ARRAYS FOR THIS SBIA; UPDATE CONFIGURATION REGISTER ADDRESS; GET SBI SILO ADDRESS; UPDATE SILO ADDRESS; SILO LOCKED (SBI F/S REG<16> = 1)?; COMPARATOR LOCKED (SBI SILO COMP REG<31> = 1)?; GET LONGWORD OFFSET; GET SBIA CONFIGURATION REGISTER ADDRESS; INDEX OF LAST SBIA ITEM; GET VA OF CONTROLLER; GOTO UNLOCK_SBIA AND RSB; POP R0-R6 OFF STACK FRAME.

VMS Machine Check Handler Flow Chart (Part 18 of 50)

UNLOCK_SBIA

Unlock the SBIA registers after an error. Clear all the error locks, the CPU timeout bit, and the fault latch.

Flow chart boxes, in reading order: CLEAR "CPU BUFFER ERROR LOCK" (SBI ESR<23>); CLEAR CPU TIMEOUT (SBI ERROR REG<12>); CLEAR FAULT LATCH (SBI F/S REG<19>).

VMS Machine Check Handler Flow Chart (Part 19 of 50)

CP_IO_BUF_TBIT

This is the trace handler used when retrying CP_IO_BUF errors. There are 10 retries with 5 mini-retries within each retry.

Flow chart boxes, in reading order: RESTORE ORIGINAL SCBB; DID WE TRACE FROM HERE?; PUSH R0-R5 ON STACK; CLEAR TBIT AND TP; RESTORE ORIGINAL IPL; ERROR LOG BUFFER AVAIL?; SAVE MASK OF RETRY ABORT BITS; SAVE MINI-RETRY COUNT; SAVE RETRY COUNT; RELEASE ERROR LOG BUFFER; POP R0-R5 OFF STACK; Go report the exception.
VMS Machine Check Handler Flow Chart (Part 21 of 50)

CP_IO_BUF_UNEXPECTED

This is the unexpected exception handler used when retrying CP_IO_BUF errors.

Flow chart boxes, in reading order: RESTORE ORIGINAL SCBB; PUSH R0-R5; ERROR LOG BUFFER AVAIL?; GET MASK OF RETRY ABORT BITS; RELEASE ERROR LOG BUFFER; BUG_CHECK (BUGCHECK type is MACHINECHK, severity is fatal).

VMS Machine Check Handler Flow Chart (Part 22 of 50)

CP_IO_BUF_KERSTKNV

This is the Kernel Stack Not Valid exception handler used when retrying CP_IO_BUF errors.

INTERRUPT stack on entry is:
  00(SP) = EXCEPTION PC
  04(SP) = EXCEPTION PSL

Flow chart boxes, in reading order: RESTORE ORIGINAL SCBB; PUSH R0-R5 ON STACK; ERROR LOG BUFFER AVAIL?; GET MASK OF RETRY ABORT BITS; GET RETRY COUNT; RELEASE ERROR LOG BUFFER; POP R0-R5 OFF STACK; BUG_CHECK (BUGCHECK type is KRNLSTAKNV, severity is fatal).

VMS Machine Check Handler Flow Chart (Part 23 of 50)

CP_IO_BUF_... (Reserved Addressing Mode Fault)

This is the Reserved Addressing Mode Fault exception handler used when retrying CP_IO_BUF errors.

KERNEL stack on entry is:
  00(SP) = EXCEPTION PC
  04(SP) = EXCEPTION PSL

Flow chart boxes, in reading order: RESTORE ORIGINAL SCBB; PUSH R0-R5 ON STACK; ERROR LOG BUFFER AVAIL?; GET MASK OF RETRY ABORT BITS; RELEASE ERROR LOG BUFFER; POP R0-R5 OFF STACK; ZERO EXCEPTION PSL (except for CURRENT ACCESS MODE); Go report the exception.

VMS Machine Check Handler Flow Chart (Part 24 of 50)

CP_IO_BUF_ROPRAND

This is the Reserved Operand Fault exception handler used when retrying CP_IO_BUF errors.

KERNEL stack on entry is:
  00(SP) = EXCEPTION PC
  04(SP) = EXCEPTION PSL

Flow chart boxes, in reading order: RESTORE ORIGINAL SCBB; PUSH R0-R5 ON STACK; ERROR LOG BUFFER AVAIL?; GET MASK OF RETRY ABORT BITS; GET RETRY COUNT; RELEASE ERROR LOG BUFFER; POP R0-R5 OFF STACK; ZERO EXCEPTION PSL (except for current access mode); CREATE PSL: CURRENT MODE = KERNEL, CORRECT PREVIOUS MODE, IPL 0; RESTORE ORIGINAL IPL; GET EXCEPTION PC/NEW PSL; REPLACE EXCEPTION PC WITH ADDRESS OF EXE$ROPRAND; Go report the exception.

VMS Machine Check Handler Flow Chart (Part 25 of 50)

CP_IO_BUF_PAGEFAULT

This is the Translation Not Valid Fault exception handler used when retrying CP_IO_BUF errors.

KERNEL stack on entry is:
  00(SP) = FAULT PARAMETER LONGWORD
  04(SP) = SOME VIRTUAL ADDRESS IN FAULTING PAGE
  08(SP) = EXCEPTION PC
  12(SP) = EXCEPTION PSL

Flow chart boxes, in reading order: RESTORE ORIGINAL SCBB; PUSH R0-R5 ON STACK; ERROR LOG BUFFER AVAIL?; GET MASK OF RETRY ABORT BITS; GET RETRY COUNT; RELEASE ERROR LOG BUFFER; POP R0-R5 OFF STACK; ZERO EXCEPTION PSL (except for CURRENT ACCESS MODE); CREATE PSL: CURRENT MODE = KERNEL, CORRECT PREVIOUS MODE, IPL 0; RESTORE ORIGINAL IPL; GET EXCEPTION PC/NEW PSL; REPLACE EXCEPTION PC WITH ADDRESS OF MMG$PAGEFAULT; Go report the exception.

VMS Machine Check Handler Flow Chart (Part 26 of 50)

CP_IO_BUF_ACVIOLAT

This is the Access Violation Fault exception handler used when retrying CP_IO_BUF errors.

KERNEL stack on entry is:
  00(SP) = ACCESS VIOLATION REASON MASK
  04(SP) = ACCESS VIOLATION VIRTUAL ADDRESS
  08(SP) = EXCEPTION PC
  12(SP) = EXCEPTION PSL

Flow chart boxes, in reading order: RESTORE ORIGINAL SCBB; PUSH R0-R5 ON STACK; ERROR LOG BUFFER AVAIL?; GET MASK OF RETRY ABORT BITS; GET RETRY COUNT; RELEASE ERROR LOG BUFFER; POP R0-R5 OFF STACK; ZERO EXCEPTION PSL (except for CURRENT ACCESS MODE); CREATE PSL: CURRENT MODE = KERNEL, CORRECT PREVIOUS MODE, IPL 0; RESTORE ORIGINAL IPL; GET EXCEPTION PC/NEW PSL; REPLACE EXCEPTION PC WITH ADDRESS OF EXE$ACVIOLAT; Go report the exception.

VMS Machine Check Handler Flow Chart (Part 27 of 50)

SETUP_RETRYSCB

This routine sets up an appropriate SCB for retries of CP_IO_BUFFER errors.

Flow chart boxes, in reading order: GET SCB PA; VALID ADDRESS?; GET SPT BASE; (... ALIGNED), VPN, PTE; LOAD VECTORS: MACHINE CHECK, KERNEL STACK NOT VALID, RESERVED OPERAND.

VMS Machine Check Handler Flow Chart (Part 28 of 50)

FBOX_SERV

FBOX errors that are too frequent will cause the MACHINE CHECK HANDLER to turn off the FBOX and report this to the console. This is not a reason to bugcheck, as the EBOX can assume floating point functions.

Flow chart boxes, in reading order: INCREMENT FBOX COUNT; 3 ERRORS IN 20 MS?; TURN OFF FBOX; SEND MESSAGE TO CONSOLE; SAVE TIME OF ERROR.

VMS Machine Check Handler Flow Chart (Part 29 of 50)

EBOX_SERV

An EBOX error has been detected. These are all treated alike. This is cause for a bugcheck if they are too frequent.

Flow chart boxes, in reading order: OPBUS PE (EDPSR<6> = 1)?; MBOX INTERRUPT (EBCS<14> = 1)?; INCREMENT EBOX ERROR COUNT; 3 ERRORS IN 20 MS?; SAVE TIME OF ERROR; BUGCHECK_POP.

VMS Machine Check Handler Flow Chart (Part 30 of 50)
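Several of the service routines (FBOX_SERV and EBOX_SERV among them) share the same pattern: increment an error count, test for "3 errors in 20 ms", and save the time of the error. The charts do not show how the handler keeps that bookkeeping, so the Python sketch below is only one plausible reading: it assumes a sliding time window and uses hypothetical names (`ErrorRateMonitor`, `record`). It is not the VMS implementation.

```python
from collections import deque


class ErrorRateMonitor:
    """Flags when `limit` errors fall inside a `window_ms` interval.

    Assumption: a sliding window over saved error times. The manual only
    states the threshold (3 errors in 20 ms), not the mechanism.
    """

    def __init__(self, limit: int = 3, window_ms: float = 20.0):
        self.limit = limit
        self.window_ms = window_ms
        self.times = deque()  # times of recent errors, oldest first

    def record(self, now_ms: float) -> bool:
        """Save the time of this error; return True if the rate is too high."""
        self.times.append(now_ms)
        # Discard errors that are older than the window.
        while now_ms - self.times[0] > self.window_ms:
            self.times.popleft()
        return len(self.times) >= self.limit
```

A caller in the spirit of EBOX_SERV would bugcheck when `record()` returns True, while one in the spirit of FBOX_SERV would turn off the failing unit and notify the console instead.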
MBOX 1D

An MBOX error was reported when IPL dropped below 1D. These are not directly related to instruction execution. From here we will dispatch to either asynchronous CPU errors or DMA errors.

The eight MBOX errors reported via the interrupt at 1D mechanism are:

- CPR PARITY ERROR (This is a fail-safe, normally reported via MBOX FE)
- ECC ERROR (CP or DMA)
- CACHE W BIT PARITY ERROR (CP or DMA)
- CACHE TAG PARITY ERROR (CP or DMA)
- CACHE DATA PARITY ERROR (CP or DMA)
- CP WRITE PARITY ERROR (CP reference)
- ABUS BAD DATA CODE (CP reference)
- MBOX DETECTED ABUS PARITY ERROR (DMA)

TB errors do not set either MBOX FE or MBOX INTERRUPT. However, they are serviced here. They do not latch the cycle type bits in MSTAT1.

Flow chart boxes, in reading order: MBOX_1D_SERV; GET PAMM DATA WITH ARRAY IN ERROR; VALID SLOT?; SAVE MEMORY TYPE AND SLOT; MULTIPLE ERROR (MSTAT2<7> = 1)?; CPR PE (MSTAT1<23> OR <22> = 1)?; TB ERRORS; INCREMENT TB ERROR COUNT; MBOX 1D INTERRUPT (EBCS<14> = 1)?; GET MBOX CYCLE TYPE (MSTAT1<29:26>); 3 ERRORS IN 20 MS?; BUG_CHECK_POP; SAVE TIME OF ERROR; IS IT ABUS CYCLE?; MBOX_1D_CPU.

VMS Machine Check Handler Flow Chart (Part 32 of 50)

MBOX_1D_CPU

MBOX 1D error triggered by a CPU reference.
Bugcheck only if the error rate is too high. When BAD DATA is written, the error is ignored until someone reads it. When BAD DATA is consumed, the assumption is that it will be used and the error is treated like the MBOX FE case.

R3 = CYCLE TYPE (MSTAT1<29:26>)

Flow chart boxes, in reading order: ADDRESS PE (MDECC<22> = 1)?; REFILL ERROR (MSTAT1<29:26> = 1001)?; WRITEBACK (R3<1:0> = 1)?; BAD DATA (MDECC<22> = 1)?; DOUBLE BIT (MDECC<20> = 1)?; INCREMENT ERROR COUNT; 3 ERRORS IN 20 MS?; BUGCHECK_POP; SAVE "MEMORY ERROR" IN B_MCHK_CODE; CP_CACHE_ERR; SAVE TIME OF ERROR; ANY ENTRIES IN CRD LOG BUFFER?; CRD_LOG; BAD_MEM.

VMS Machine Check Handler Flow Chart (Part 33 of 50)

CP_CACHE_ERROR

Cache errors that reach here are:

- CACHE TAG PARITY ERROR
- CACHE W PARITY ERROR
- CP BYTE WRT CACHE DATA PE
- CACHE DATA PE

If errors are too frequent, turn off the appropriate cache.

Flow chart boxes, in reading order: WRITTEN BIT OR TAG ERROR (MSTAT2<6> OR <5> = 1)?; CACHE DATA ERROR (MSTAT1<3> OR <1> = 1)?; CP_WRITE_PE; INCREMENT CACHE ERROR COUNT; CACHE ERROR (MSTAT1<2> = 1)?; TURN THAT CACHE OFF; SAVE TIME OF ERROR.

VMS Machine Check Handler Flow Chart (Part 34 of 50)

CP_WRITE_PE

CP WRITE PARITY ERROR reported via MBOX 1D interrupt occurs for those writes destined for the memory arrays. The data has been written along with the BAD DATA ERROR flag. Because the PC and this error are not synchronous, the failing instruction cannot be retried. When the BAD DATA is read, that USER (or the SYSTEM) can be aborted at that time.

Flow chart boxes, in reading order: WRITE DATA PE (MSTAT1<9:4> = F)?; CP_ABUS_BAD_DATA; INCREMENT CP WRITE PE COUNT; 3 ERRORS IN 20 MS?; SAVE TIME.

VMS Machine Check Handler Flow Chart (Part 35 of 50)

CP_ABUS_BAD_DATA

[...] (RDS). The DATA was consumed by the CPU and probably resulted in an Ebox "B OPBUS" error, EDPSR<6>.

Flow chart boxes, in reading order: GET MICROCODE REV LEVEL FROM EBCS; GET PHYSICAL PAGE; UNIBUS ADDRESS?; CLEAR I/O READ; ABUS BAD DATA (MSTAT2<14> = 1)?; MISC_MBOX_1D; INCREMENT MBOX ERROR COUNT; RSB; BMUX OPBUS PE (EDPSR<6> = 1)?; OP PORT BEING SERVICED?; GET OP PORT VIRTUAL ADDRESS; GET SPT BASE ADDRESS; SYSTEM SPACE (VA<31> = 1)?; CLEAR SYSTEM SPACE BIT; P0 SPACE (<30> = 0)?; GET P0BR; GET P1BR.

VMS Machine Check Handler Flow Chart (Part 36 of 50)

MISC_MBOX_1D

This routine handles any MBOX 1D error not caught by previous MBOX machine checks.

Flow chart boxes, in reading order: INCREMENT MBOX 1D ERROR COUNT; 3 ERRORS IN 20 MS?; BUGCHECK_POP; SAVE TIME OF ERROR; RSB.

VMS Machine Check Handler Flow Chart (Part 37 of 50)

CACHE_OFF

This routine is called from CP CACHE ERROR and DMA ERROR. The purpose is to turn off the failing half of the cache if errors are too frequent.

Flow chart boxes, in reading order: GET CURRENT CACHE STATE FROM CSWP; CACHE 0 FAULT (MSTAT1<2> = 0)?; TURN OFF CACHE 0; CACHE 1 FAULT (MSTAT1<2> = 1)?; TURN OFF CACHE 1.

VMS Machine Check Handler Flow Chart (Part 38 of 50)

MBOX_1D_DMA

MBOX 1D error triggered by an ABUS reference. In general, these are non-fatal. Bugcheck only if the error rate is too high. Whenever bad data is written, the error is ignored until someone reads it.
Errors on reading are handled by device drivers.

R3 = CYCLE TYPE (MSTAT1<29:26>)

Flow chart boxes, in reading order: IS SBIA STATE SAVED IN B_MCHK_CODE?; GET IOA NUMBER (MSTAT1<17:16>); IS THIS AN SBIA?; BUG_CHECK (type is MACHINECHK, severity is fatal); GET SBIA BASE ADDRESS; SAVE_SBIA_REGISTERS (go save SBIA registers and RSB back); SBIA ERROR SUMMARY SAVED IN B_MCHK_CODE; DMA_ECC_ERROR.

VMS Machine Check Handler Flow Chart (Part 39 of 50)

DMA_ECC_ERROR

ADDRESS PARITY ERROR: If ABUS REFILL, just exit; the driver will do the rest.
[...]: Just log them; the device driver will do the rest.
For DATA DOUBLE BIT, put the page on the bad page list.

R3 = CYCLE TYPE (MSTAT1<29:26>)

Flow chart boxes, in reading order: ADDRESS PE (MDECC<19> = 1)?; REFILL ERROR?; BAD DATA ERROR FLAG (MDECC<22> = 1)?; (MSTAT1<1> = 1)?; INCREMENT ERROR COUNT; CACHE ERROR (MSTAT1<3> = 1)?; 3 ERRORS IN 20 MS?; DATA DOUBLE BIT ERROR (MDECC<20> = 0)?; MCHK CODE IS MEMORY DBE; GET PHYSICAL PFN OF ERROR; PFN DATA BASE FOR PAGE?; MARK PAGE BAD; DMA_CACHE_ERR; CRD ERRORS?; LOG CURRENT CRD ERRORS; BAD_MEM.

VMS Machine Check Handler Flow Chart (Part 40 of 50)

DMA_CACHE_ERR

Cache errors that reach here are:

- CACHE TAG PARITY ERROR
- CACHE W PARITY ERROR
- CP BYTE WRT CACHE DATA PE
- CACHE DATA PE

Check for too many and turn off cache if necessary.

Flow chart boxes, in reading order: DATA PE OR BYTE WRT PE SET IN MSTAT2?; CACHE TAG PE OR CACHE WBIT PE (MSTAT2<6> OR <5> = 1)?; CACHE WBIT SET (MSTAT2<4> = 1)?; CACHE TAG PE WITH WBIT SET (MSTAT2<6> = 1)?; BUGCHECK_POP; CACHE DATA ERROR?; INCREMENT CACHE ERROR COUNT; 3 ERRORS IN 20 MS?; CACHE_OFF; SAVE TIME OF ERROR.

VMS Machine Check Handler Flow Chart (Part 41 of 50)

DMA_ABUS_PE

ABUS parity errors detected by MBOX 1D interrupts come from a DMA write cycle with a CONTROL parity error and the first write overlapping a REFILL.

R3 = CYCLE TYPE (MSTAT1<29:26>)

Flow chart boxes, in reading order: ABUS DATA PE OR ABUS C/M PE (MSTAT1<21> OR <20> = 1)?; BUGCHECK_POP; MISC_MBOX_1D.

VMS Machine Check Handler Flow Chart (Part 42 of 50)

SBI VECTORS

SBI ALERT, ERROR, and FAULT interrupts are handled here. All interrupts cause a full SBIA log. SBI FAIL is treated like power fail and is handled elsewhere.

The SBIA has a micro-sequencer that is used for CPU operations to the SBI. Its control word is parity protected. If a parity error is detected while the CPU is making a reference to the SBI, it will be reported with MBOX fatal error and CP_IO_BUF. When the sequencer is not in use for CPU references, it loops on a single control word. If a parity error is detected in this case, we come here via an interrupt.

Stack on entry:
  pointer to SBIA base address
  PC, PSL pair

Entry points:
  EXE$INT58:: SBI ALERT
  EXE$INT5C:: SBI FAULT
  EXE$INT60:: SBIA ERROR

Flow chart boxes, in reading order: PUSH R0-R6, AP ON STACK FRAME; GET SBIA BASE ADDRESS; SAVE_SBIA_REGISTERS (go save SBIA state and RSB back); LOGSBI (go log SBIA state and RSB back); INCREMENT SBIA ERROR COUNT; 3 ERRORS IN 20 MS?; BUG_CHECK (BUG_CHECK is SBIAERROR, severity is fatal); SAVE TIME OF ERROR; POP R0-R6, AP OFF STACK FRAME.

VMS Machine Check Handler Flow Chart (Part 43 of 50)

MCHECK

This routine formats information for error logging and checks if a machine check recovery block has inhibited logging.
IMPLICIT INPUTS:
  (AP): points to machine check log on stack

OUTPUTS:
  Error is formatted and logged in system error log.

Flow chart boxes, in reading order: LOG_MCHK; GET PC, PSL POINTER AND MASK IN R1, R2; INHIBITED LOGGING (R0<0> = 1)?; SET LOG BIT; GET ERROR LOG BUFFER; BUFFER AVAILABLE?; SAVE BUFFER ADDRESS ON STACK; INCREMENT MACHINE CHECK ERROR COUNT; MEMORY ERROR?; INCREMENT EXE$GL_MEMERRS; IS SBIA ERROR SUMMARY IN B_MCHK_CODE?

VMS Machine Check Handler Flow Chart (Part 44 of 50)

Flow chart boxes, in reading order: GET LOCAL COPY OF ABORT_BITS; INITIALIZE RETRY COUNTER; RELEASE BUFFER TO ERROR LOGGER; IS SBIA LOG IN B_MCHK_CODE?; GET PC/PSL POINTER; LOGSBI; RSB.

VMS Machine Check Handler Flow Chart (Part 45 of 50)

LOGSBI

This routine is used to log the following SBIA (IOA) and SBI registers and related information:

  CNFGR, COMP, DCONTROL, DMAACA, DMAAID, DMABCA, DMABID, DMACCA, DMACID, DMAICA, DMAIID, ERROR, FAULT, MAINT, SILO, STATUS, SUMMARY, TMOADDR

MCHK$GL_LOG

The error is formatted and logged in the system error log. If there is no error log buffer, return with error status in R0.

INPUTS:
  R3: error type code
  R4: length of frame
  R5: address of frame

OUTPUTS:
  Error is formatted and logged in system error log. If no error log buffer, return with error status in R0. R0-R5 destroyed.

LOGSBI INPUTS:
  R1 -> Address of PC, PSL

Flow chart boxes, in reading order: GET SIZE OF ERROR LOG BUFFER REQUIRED; GET ERROR LOG BUFFER; BUFFER AVAILABLE (R0<0> = 1)?; GET MCHK REASON CODE; GET SBIA AND SILO; GET PC, PSL; SEND LOG ENTRY TO ERROR LOGGER; RELEASE BUFFER TO ERROR LOGGER; RSB.

VMS Machine Check Handler Flow Chart (Part 46 of 50)

Single Bit (CRD) Error Handling

This routine logs single bit errors.
Sixteen (16) errors are accumulated before an error log entry is made. Three (3) errors in 1 second cause reporting to be turned off for 5 minutes. Running with CRD interrupts turned off can cause a miss on double-bit errors, so only logging is turned off.

Flow chart boxes, in reading order: SINGLE BIT (CRD) ERRORS; CRD LOGGING DISABLED?; INCREMENT TOTAL CRD ERROR COUNT; REI; INCREMENT CRD ERROR COUNT IN BUFFER; GET NEXT SLOT IN CRD LOG BUFFER; IS IT INITIALIZED?; GET FIRST SLOT IN CRD BUFFER AND INITIALIZE IT; SAVE MDECC, MEAR, MSTAT1, AND MSTAT2; GET MEMORY BOARD WITH ERROR; IS THIS VALID SLOT?; SAVE MEMORY TYPE AND SLOT NUMBER; 3 ERRORS IN 1 SECOND?; SAVE TIME OF ERROR.

VMS Machine Check Handler Flow Chart (Part 47 of 50)

Flow chart boxes, in reading order: IS BUFFER FULL?

CRD_OFF: CRD error reporting is off for 5 minutes. Boxes: GET CURRENT ERROR REPORTING FROM MERG; DISABLE CRD REPORTING IN MERG (MERG<10> = 1); UPDATE CRD_FLAGS TO INDICATE CRD REPORTING DISABLED.

CRD_LOG: Log the current buffer and clear it for reuse. Boxes: SET UP ERROR LOG BUFFER SIZE; GET ERROR LOG BUFFER; BUFFER AVAILABLE (R0<0> = 1)?; SEND LOG ENTRY TO ERROR LOGGER; LET "SHOW ERROR" SEE ALL ERRORS LOGGED; INITIALIZE CRD COUNT IN BUFFER; SET UP BUFFER TO POINT TO FIRST SLOT AND INITIALIZE IT; RSB.

VMS Machine Check Handler Flow Chart (Part 48 of 50)

REENABLE TIMER

This routine is called by the system clock routine. If CRD reporting is disabled and the timer has expired, reporting is reenabled if allowed by the SYSGEN parameter. The routine does not turn off interrupts, only logging.
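The CRD policy described above (accumulate 16 errors per log entry, suspend logging for 5 minutes after 3 errors in 1 second, and let a clock-driven routine reenable it) can be sketched in Python as follows. This is a minimal model under stated assumptions, not the VMS code: the class and method names are hypothetical, the total count is assumed to increment on every error, and a burst is assumed to flush the partly filled buffer before logging is suspended.

```python
class CrdThrottle:
    """Illustrative model of CRD (single-bit error) buffering and throttling."""

    BUFFER_SLOTS = 16     # errors accumulated before a log entry is made
    BURST_LIMIT = 3       # errors ...
    BURST_WINDOW_S = 1.0  # ... within this many seconds suspend logging
    DISABLE_S = 300.0     # logging stays off for 5 minutes

    def __init__(self):
        self.buffer = []            # pending error records
        self.recent = []            # times of errors inside the burst window
        self.total = 0              # every CRD error seen (assumption)
        self.entries_logged = 0     # log entries written so far
        self.disabled_until = None  # time at which logging may resume

    def _flush(self):
        """Write one log entry for the buffered errors and clear the buffer."""
        if self.buffer:
            self.entries_logged += 1
            self.buffer = []

    def on_crd(self, now_s, info=None):
        """Handle one CRD interrupt; return what was done with the error."""
        self.total += 1
        if self.disabled_until is not None:
            return "counted"  # logging is off; the error is only counted
        self.buffer.append(info)
        self.recent = [t for t in self.recent
                       if now_s - t <= self.BURST_WINDOW_S]
        self.recent.append(now_s)
        if len(self.recent) >= self.BURST_LIMIT:
            self._flush()  # assumption: log what we have before suspending
            self.recent = []
            self.disabled_until = now_s + self.DISABLE_S
            return "disabled"
        if len(self.buffer) >= self.BUFFER_SLOTS:
            self._flush()
            return "logged"
        return "buffered"

    def on_clock_tick(self, now_s, reenable_allowed=True):
        """Reenable logging once the timer has expired, if permitted."""
        if (self.disabled_until is not None and reenable_allowed
                and now_s >= self.disabled_until):
            self.disabled_until = None
            return True
        return False
```

The `reenable_allowed` argument stands in for the SYSGEN parameter mentioned in the REENABLE TIMER description; its real name and semantics are not given in this section.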
Flow chart boxes, in reading order: CRD REPORTING DISABLED?; CRD TIMER EXPIRED?; RESET CRD TIMER; CRD ENABLE FLAG ON?; CLEAR "INHIBIT ARRAY SBE REPORTING" FLAG (MERG<10>); UPDATE CRD_FLAGS TO INDICATE REPORTING ENABLED; KAF TIMER EXPIRED?; RSB; CONSKEEPALIVE.

VMS Machine Check Handler Flow Chart (Part 49 of 50)

CONSKEEPALIVE

This routine is called every 90 seconds to read the TODR register. The purpose is to determine if the console is functioning normally by updating the TODR. If the value is unchanged, the console is dead or hung and a REBOOT is attempted.

Flow chart boxes, in reading order: READ TODR; TODR INITIALIZED?; TODR UPDATED?; SET UP SPACE FOR ERROR LOG BUFFER; GET ERROR LOG BUFFER; BUFFER AVAILABLE?; SEND LOG ENTRY TO ERROR LOGGER; RESET KAF TIMER; SAVE TODR CONTENTS.

VMS Machine Check Handler Flow Chart (Part 50 of 50)

APPENDIX D
VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE DESCRIPTION

INTRODUCTION

The Keep Alive Fail (KAF) mechanism provides a method for the Console to monitor the overall activity of the VAX8600/8650. If the Console determines that the EBox has not entered IRD Time during the last 300 milliseconds, it declares a Keep Alive Fail Condition and Snapshots (captures) the state of the system. (See Figure D-1, VAX8600/8650 Keep Alive Flow.)

The Snapshot process involves controlling the CPU Clock (ON/OFF) and capturing the state of:

1. The SDB visibility channels (Tables 3 and 4).
2. The console control and status registers (Table 5).
3. The EMM control and status registers (Table 6).
4. The EBox scratch pad RAMs (Table 7).
5. The CPU internal (micro, macro, and miscellaneous) registers (Table 8).
6. The contents of the PAMM (Table 9).
7. The last 64 longwords that were pushed on the Interrupt Stack, the 25 longword machine check stack frame (if it is on the interrupt stack after the top 64 longwords), and the bottom 64 longwords on the interrupt stack (Table 10).
8. The major ABus adapter (SBIA) and nexus control and status registers (Table 11).
9. The clock status, clock alignment, and a 25 microstep uPC trace (Table 12).

The overall organization and format of a snap file is illustrated in Figure D-2. Note that each record, excluding the master header record, has a standard SNAPSHOT (data) header (Table 2).

After the data has been collected and the master header record (see Table 1) is appended to the snapshot, the console checks the VERIFY (or NOVERIFY) software switch to determine if it should reload and verify the contents of the VAX8600/8650 RAMs. If the switch is set, the console will reload all the RAMs and verify their contents. Errors detected during the verify stage will be recorded in the master header record. If the switch is not set, the console will skip the reload and verify step and proceed as follows.

The console will write the snapshot information to a snap file on the RL02 (SNAP1.DAT or SNAP2.DAT). It will then attempt to reboot the system.

NOTE
If both SNAP1.DAT and SNAP2.DAT files are already valid (two snapshots have been taken without VMS transferring them to ERRSNAP.LOG), neither SNAP1.DAT nor SNAP2.DAT will be written.

If the system reboot is successful, the SNAP file will be transferred to the VMS side of the system and written to SYS$SYSROOT:[SYSERR]ERRSNAP.LOG;n. The version number (n) is incremented each time a new SNAP file is written.

If the console is unable to reboot the system, it will build a second SNAPSHOT file (SNAP2.DAT) and then loop attempting to reboot the system.

SNAP FILE EXAMPLES

Following the tables there are two example printouts. The first example was produced by VSRBLD. VSRBLD runs under VMS and is located in the SYS$SYSROOT:[SYSEXE] directory. VSRBLD translates ERRSNAP.LOG into a more readable, self-explanatory format. To save space, only a sample of the SDB Visibility translation is included.

The second example is generated using the Console command "SHOW SNAP". This example is highlighted to identify the data headers and other key information described in Tables 1 through 12. The purpose of this example is to help you quickly locate information when you are working with a SNAPSHOT file in this format.

Finally, there is a list of key SDB visibility signals (Tables 14 and 15). The text preceding the list explains that the state of these signals can be used to manually evaluate the state of a hung or stalled system.

Figure D-1 flow chart (Part 1 of 2), boxes in reading order:

EBOX (CPU keep alive): EXECUTE INSTRUCTION; IRD TIME; PROCESS INTERRUPTS; CPU ALIVE; MCSR2<06>.
CONSOLE (T11, real time): 1 MS INTERRUPT; KAF TIMER ENABLED?; INCREMENT KAF TIMER; MCSR2<06>?; CLEAR MCSR2<06>; KAF TIME = 300?
SNAP: STOP CPU CLOCK; READ ALL 5 MICRO PCs; PRINT KAF IN PROCESS AND REASON (see MHR, Table 1, byte 14); ASSERT UNHANG RESET; BURST CPU CLOCK; DEASSERT UNHANG RESET; START CSM AT 100D; CSM STATUS -> MHR; CLOCK STATUS -> MHR; KAF REASON CODE -> R0; RETURN.
LOOKUP: SNAP2.DAT FULL?; SNAP1.DAT FULL?; PRINT "NO SNAP AVAIL"; RENAME SNAP 2 TO 1; SELECT SNAP1.DAT; SELECT SNAP2.DAT.
GETSDB: SET SDB VIS CHAN = 0; KEEP TRACK OF VISIBILITY CHANS; SDB ID -> SDB DATA HDR; CDF FILE NAME -> SDB DATA HDR; RECORD LENGTH -> SDB DATA HDR; READ SDB VIS CHAN; MARK SDB DATA HDR VALID; INCREMENT SDB VIS CHAN.
GETCSL: BUILD CSL DATA HDR; ALL 1s -> CSL DATA BUF; CSL REGs -> CSL DATA BUF; MARK CSL DATA HDR VALID.
GETEMM: BUILD EMM DATA HDR; ALL 1s -> EMM DATA BUF; EMM REGs -> EMM DATA BUF; ERROR READING EMM REGs?; MARK EMM DATA HDR VALID.
DAT e MARK MHR AND ALL DATA HDRS INVALID (FF) SDB VIS CHAN 1 N SDB VIS SET SDB VIS CHAN = 20 i VAX PROGRAM NAME — MHR W‘ | MCC SHIFT PATH —~ MHR Y CHAN = 16 1 | DECREMENT R3 ® SAR LY B S0Y Figure D-1 VAX8600/8650 Keep Alive Flow D-3 (Part 1 of 2) VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW GETISP: AND SNAP FILE DESCRIPTION BUILD ISP MARK MHR DATA HDR VALID W I AlLL 1s ISP DATA ] ALIGNMENT uPC TRACE upc ASSERT UNHANG RESET p— CLOCK LWDS FROM INT STACK READ AND SAVE MCC SHIFT PATH I DEASSERT START CSM AT 100D 1 ERROR Y AND READING N INT §TK RESTART CSM m REBOOT | SYSTEM MARK ISP DATA HDR VALID ENABLE fc ] l ABORT vMS RUNNING BUILD SBIO I AND SBI1 SNAP1.DAT SYS$SYSROOT: |SYSERR] ERRSNAP.LOG:n DATA HDR BUILD ESC ! DATA HDR ALL 1s | SBIO B S8BN SNAP2 DAT SYS$SYSROGT. DATA BUF ALL Ts ESC DATA BUF ;| i ESC 0 -255 ESC DATA BUFF SBIO AND ISYSERR] NEXUS REGs ERRSNAP.LOG:n N ERROR READING SBIO/NEXUS AHCGTH MARK SBIO DATA HOR VALID N «» N MARK ESCDATA SNAP2 DAT it i i ERROR EADING ESC y Y VALID SBIO DATA BUF i HDR VALID AN | GETCPU: BUILD CPU SBI1 AND DATA HDR ALL 1s j UNHANG RESET ISP DATA BUF UNHANG gy — BURST CPU HEX), AND LAST 64 RECORD — GETESC m FIRST 64, MIDDLE 25 (STARTING FROM 58 | - WHITE SNAP n DAT BUF CHECK CLOCK NEXUS REGs SBI1 DATA BUF » CPU DATA BUF I v i ERROR READING CPU IPRs SBI1/NEXUS * CPU DATA BUF MARK i SBI1 DATA ERROR HOR VALID READING CPU IPRs ] Y VERIFY N SWITCH SET MARK CPUDATA HDR VALID . GETPAM: ECS i - RAM 1D BUILD PAMM - DATA HDR TM 5 . LOAD RAM {(RAM 1D} ALL 1s PAMM DATA BUFF VERIFY RAM ! 
LOCATIONS PAMM CONTENTS 1 PAMM DATA BUFF ERROR COUNT ~ MHR l ERROR READING PAMM MARK PAMM DATA HDR VALID RAM ID | HCODE LEVELS ~ MHR T O Figure INCREMENT D-1 VAX8600/8650 | Keep D-4 Alive Flow (Part 2 of 2) o VAX8600/8650 KEEP ALIVE FAIL (KAF) MASTER HEADER RECORD | pATA HDR SDB VISIBILITY FOR FBA | DATA HDR SDB VISIBILITY FOR FBM | DATA HDR SDB VISIBILITY FOR MCD | DATA HDR SDB VISIBILITY FOR IBD | bATA HDR SDB VISIBILITY FOR IDP | DATA HDR SDB VISIBILITY FOR ICA | DATA HDR SDB visiBILITY FOR ICB | DATA HDR SDB VISIBILITY FOR CLK | DATA HDR SD8 VISIBILITY FOR EDP | DATA HDR SDB VISIBILITY FOR EBE l DATA HDR SDB VISIBILITY FOR MCC | DATA HDR SDB VISIBILITY FOR MAP FLOW AND SNAP FILE DESCRIPTION | | DATA HDR SDB VISIBILITY FOR EBD | DATA HDR SDB VISIBILITY FOR EBC | DATA HDR SDB VISIBILITY FOR CSB | DATA HDR SDB VISIBILITY FOR CSA | DATA HOR SDB VISIBILITY FOR MTM | DATA HDR CONSOLE C/S REGISTERS | DATA HDR EMM C/S REGISTERS | DATA HDR INTERNAL PROCESSOR REGISTERS | DATA HOR EBOX SCRATCH PAD CONTENTS | DATA HOR PAMM CONTENTS | DATA HDR INTERRUPT STACK 153 LONGWORDS | DATA HDR SBIO/NEXUS C/S REGISTERS | DATA HDR SBIT/NEXUS C/S REGISTERS i DATA HDR CLOCK ALIGNMENT AND uPC TRACE MRA-15058 Figure D-2 KAF Snap File Organization VAX8600/8650 KEEP ALIVE Table 1 | ID Status Code (KAF) FLOW AND SNAP FILE SNAPSHOT Master Header Record Field Name Header FAIL Byte Number Position of | Bytes 00 (1) 01 (1) - Format Description Contains "20" MASTER HEADER Contains one following FF 01 Record Length 02-03 (2) DESCRIPTION Total indicates of the values: = invalid file = valid file number of bytes the master header CPU Program 04-09. ) (6) | Name of the last in record. file loaded by CSL into CPU memory. indicates that VMS was running "VMB" at the time of the name is padded with it is less than 6 characters long. KAF. The NULLs if Note If this field is empty (all null? 
it indicates that the console was rebooted since LOAD and number of the entire CPU ERROR defined 0 in CSM HALT DEC to console codes STD could forced the bytes SNAPx.DAT not in file. as 032: be run by program the after KAF. Interrupt Stack not valid Non-Ebox double error Kernel mode HALT SCB vector with <1:0> SCB vector with <1:0> but WCS was not loaded. pending error on HALT W (1) Total n 0C (2) n o Code OA-0B W Status | in Bytes 0OJSU o CSM Length WP O File the is "lost. ou last CHMx CHMx with IS =1 vector <1:0> not 0 W the " information VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE DESCRIPTION Table 1 SNAPSHOT Master Header Record Format (Cont) N S W N A A TN A AN O AN W N S . T A - 0 AN SO GG A - KAF Reason Code Number of Bytes ~ oD N SN S (1) Description A S S S N TN SR SR W WOV S N .S N .A U AR OO AN .A NN I SR A G A A A S Contains one of the following | values: 18 " Byte Position Field Name Parity Error in both ESC A & B. Ebox is looping at UPC 20. 19 = EHM was in the process of handling an error when a second error trap occurred. Ebox is looping at address " Ebox detected a WBus 1A 1C = Non-correctable parity error. EHM is looping at uPC 24. .This will not happen after Console RL02 Rev. 3.0. 1B = CPU ERROR HALT encountered PE 1D = Power system failure. Indicates that the EMM sensed a DC LO condition without a preceding AC LO. lE = Unidentified KAF occurred l1F = MBox/SBIA DMA command error or NXM. Time Stamp Integer 0E-11 (4) VAX formatted 32-bit TOY integer Console PROM Revision 12 (1) Prom revision number. EMM PROM Revision 13 (1) Prom revision number. 
Minimum EBox IBox MBox HW Ucode Ucode Ucode Revision Revision Revision Revision 14-15 16-17 18-19 1A-1B (2) (2) (2) (2) Note: All 2-byte revs have the following format: | Byte 0 - Major Revision Byte 1 - Minor Revision CPR Ucode Revision 22-23 (2) FBoxA FBoxM ACCESS Ucode Revision Ucode Revision Ucode Revision MCF Ucode Revision CONTEXT Ucode Revision Ucode Release Revision 1C-1D 1E-1F 20-21 24-25 26-27 28-29 (2) (2) (2) (2) (2) (2) VAX8600/8650 KEEP ALIVE Table 1 FAIL (KAF) FLOW AND SNAP FILE SNAPSHOT Master Header Record Format | Byte Field Name EBox Position Number of Bytes DESCRIPTION (Cont) Descrlptlon Ucode Verification 2A (1) Note: MCF Ucode Verification 2B (1) will CONTEXT Ucode Verification 2C (1) of FBoxA Ucode Verification Ucode Verification 2D 2E (1) (1) 0 = None 7255 = Maximum FBoxM FDRAM Ucode Verification 2F (1) IBox IDRAM Ucode Ucode Verification Verification 30 31 (1) (1) MBox Ucode Verification 32 (1) CPR ACCESS Ucode Ucode Verification Verification Verification 33 34 35 (1) (1) (1) 36-40 (11) PAMM MCC Shift Channel Console | - Software Version | 41 (1) Saved 42-43 (2) ~ CLK.ST 44-45 (2) the state of the Software (Major, Minor). Clock MBox Shift and Data) Revision | State Rate number detected ? (Micro Address <15> bytes total errors Console CPU SRR verlflcatlon verification Path Spare All contain (Word) 1/5 (1) or full <14:08> <07:06> <05> <04> <03:00> Calculated CSPE 46-49 (4) size Mark stop Clock runnlng Bit<n> phase clock stopped CSPE syndrome. 
FFFFFFFF if no CSPE. Frequency in MHz, 1 if external. MBZ.

Total size = 74 bytes

Table 2  Standard SNAPSHOT Record Header Format

Field Name       Byte      Number    Description
                 Position  of Bytes

Header ID        00        (1)       Contains one of the following codes:

                                     21 = FBA    2A = EBE    33 = EMM
                                     22 = FBM    2B = MCC    34 = IPRs
                                     23 = MCD    2C = MAP    35 = ESC
                                     24 = IBD    2D = EBD    36 = PAMM
                                     25 = IDP    2E = EBC    37 = ISP
                                     26 = ICA    2F = CSB    38 = SBI0
                                     27 = ICB    30 = CSA    39 = SBI1
                                     28 = CLK    31 = MTM    3C = uPC Trace
                                     29 = EDP    32 = CSL

Status Code      01        (1)       Contains one of the following values:

                                     01 = Valid record data
                                     FF = Invalid record data. A record could
                                          be invalid if, for instance, CSM is
                                          not able to provide the console
                                          program with the data.

Record Length    02-03     (2)       Total number of bytes in this record
                                     (Header and Data)

CDF filename     04-09     (6)       Name of the .CDF file used to extract the
                                     SDB data contained in this record. CDF
                                     file names will always be 6 characters.
                                     For non-SDB records this field will
                                     contain an ASCII record name in the form
                                     'NAMrec'.

Total size = 10 bytes

Table 3  SNAPSHOT Record Format for FBA and FBM SDB-Visibility Channels

Field Name         Byte      Number    Description
                   Position  of Bytes

Record Header      00-09     (10)      See Standard SNAPSHOT Record Header (Table 2)
SDB Data Byte 0    0A-0B     (2)       Bits <11:0> of MUX SEL 0, ENA 0
SDB Data Byte 1    0C-0D     (2)       Bits <11:0> of MUX SEL 1, ENA 0
   .
   .
SDB Data Byte 31   48-49     (2)       Bits <11:0> of MUX SEL 7, ENA 3

Total size = 74 bytes

NOTE
Only bits <11:00> of each 2-byte set in the FBA and FBM records contain valid data. Bits <15:12> are zero and should be ignored.

Table 4  Standard SDB Visibility Data Format (excluding FBA and FBM)

Field Name         Byte      Number    Description
                   Position  of Bytes

Record Header      00-09     (10)      See Standard SNAPSHOT Record Header (Table 2)
SDB Data Byte 0    0A        (1)       Bits <7:0> of MUX SEL 0, ENA 0
SDB Data Byte 1    0B        (1)       Bits <7:0> of MUX SEL 1, ENA 0
SDB Data Byte 2    0C        (1)       Bits <7:0> of MUX SEL 2, ENA 0
SDB Data Byte 3    0D        (1)       Bits <7:0> of MUX SEL 3, ENA 0
SDB Data Byte 4    0E        (1)       Bits <7:0> of MUX SEL 4, ENA 0
SDB Data Byte 5    0F        (1)       Bits <7:0> of MUX SEL 5, ENA 0
SDB Data Byte 6    10        (1)       Bits <7:0> of MUX SEL 6, ENA 0
SDB Data Byte 7    11        (1)       Bits <7:0> of MUX SEL 7, ENA 0
SDB Data Byte 8    12        (1)       Bits <7:0> of MUX SEL 0, ENA 1
SDB Data Byte 9    13        (1)       Bits <7:0> of MUX SEL 1, ENA 1
   .
   .
SDB Data Byte 30   28        (1)       Bits <7:0> of MUX SEL 6, ENA 3
SDB Data Byte 31   29        (1)       Bits <7:0> of MUX SEL 7, ENA 3

Total size = 42 bytes

Table 5  SNAPSHOT Record Format for Console Registers

Field Name         Byte      Number    Description
                   Position  of Bytes

Record Header      00-09     (10)      See Standard SNAPSHOT Record Header (Table 2)
MCSR0              0A        (1)       State of Console Register
MCSR1              0B        (1)       State of Console Register
MCSR2              0C        (1)       State of Console Register
MCSR3              0D        (1)       State of Console Register
ERSR               0E        (1)       State of Console Register
LRSR               0F        (1)       State of Console Register
RRSR               10        (1)       State of Console Register
spare              11        (1)
QCSR0              12        (1)       State of Console Register
QCSR1              13        (1)       State of Console Register
QCSR2              14        (1)       State of Console Register
QCSR3              15        (1)       State of Console Register
SID0               16        (1)       State of Console Register
SID1               17        (1)       State of Console Register
SID2               18        (1)       State of Console Register
SID3               19        (1)       State of Console Register
RL CTRL Status     1A-1B     (2)       State of Console Load Device
RL Drive Status    1C-1D     (2)       State of Console Load Device

Total size = 30 bytes

Table 6  SNAPSHOT Record Format for EMM Registers

Field Name                Byte      Number    Description
                          Position  of Bytes

Record Header             00-09     (10)      See Standard SNAPSHOT Record Header (Table 2)
POWREG                    0A        (1)       Regulator On/Off State
MARGEN                    0B        (1)       Margen Enable Register
MARHILO                   0C        (1)       Margen High/Low Register
MODOK                     0D-0E     (2)       Module OK Register
MISREG                    0F        (1)       Misc. Hardware Status
SWREG                     10        (1)       Misc. Software Status
PROM revision             11        (1)       Prom Revision
REGULATOR_A VOLTAGE       12-13     (2)       Measurement of +5V
REGULATOR_B VOLTAGE       14-15     (2)       Measurement of +5V
REGULATOR_C VOLTAGE       16-17     (2)       Measurement of +5V
REGULATOR_D VOLTAGE       18-19     (2)       Measurement of -2V
REGULATOR_E VOLTAGE       1A-1B     (2)       Measurement of -2V
REGULATOR_F VOLTAGE       1C-1D     (2)       Measurement of -5.2V
REGULATOR_H VOLTAGE       1E-1F     (2)       Measurement of -5.2V
GND CURRENT VALUE         20-21     (2)       Measurement of Ground current
REGULATOR_L + VOLTAGE     22-23     (2)       Measurement of +12V
REGULATOR_L - VOLTAGE     24-25     (2)       Measurement of -12V
REGULATOR_K + VOLTAGE     26-27     (2)       Measurement of +15V
REGULATOR_K - VOLTAGE     28-29     (2)       Measurement of -15V
T1 TEMPERATURE VOLTAGE    2A-2B     (2)       Measurement of Input Thermistor
T2 TEMPERATURE VOLTAGE    2C-2D     (2)       Measurement of Output Thermistor
T3 TEMPERATURE VOLTAGE    2E-2F     (2)       Measurement of Output Thermistor
T4 TEMPERATURE VOLTAGE    30-31     (2)       Measurement of Output Thermistor

Total size = 50 bytes

Table 7  SNAPSHOT Record Format for EBox Scratch Pad Contents (Location 000 thru 255)

Field Name               Byte      Number    Description
                         Position  of Bytes

Record Header            00-09     (10)      See Standard SNAPSHOT Record Header (Table 2)
Escratch location 0      0A-0D     (4)       32-bit Ebox scratchpad ram data
Escratch location 1      0E-11     (4)       32-bit Ebox scratchpad ram data
   .
   .
Escratch location 254    402-405   (4)       32-bit Ebox scratchpad ram data
Escratch location 255    406-409   (4)       32-bit Ebox scratchpad ram data

Total size = 1034 bytes

Table 8  SNAPSHOT Record Format for IPR Register

Field Name       Byte      Number    Description
                 Position  of Bytes

Record Header    00-09     (10)      See Standard SNAPSHOT Record Header (Table 2)
KSP              0A-0D     (4)
ESP              0E-11     (4)
SSP              12-15     (4)
USP              16-19     (4)
ISP              1A-1D     (4)
P0BR             1E-21     (4)
P0LR             22-25     (4)
P1BR             26-29     (4)
P1LR             2A-2D     (4)
SBR              2E-31     (4)
SLR              32-35     (4)
PCBB             36-39     (4)
SCBB             3A-3D     (4)
IPL              3E-41     (4)
ASTLVL           42-45     (4)
SISR             46-49     (4)
ICCS             4A-4D     (4)
ICR              4E-51     (4)
TODR             52-55     (4)
RXCS             56-59     (4)
RXDB             5A-5D     (4)
TXCS             5E-61     (4)
ACCS             62-65     (4)
MAPEN            66-69     (4)
PME              6A-6D     (4)
SID              6E-71     (4)
PAMACC           72-75     (4)
PAMLOC           76-79     (4)
CSWP             7A-7D     (4)
MDECC            7E-81     (4)
MENA             82-85     (4)
MDCTL            86-89     (4)
MCCTL            8A-8D     (4)
MERG             8E-91     (4)
EHSR             92-95     (4)
STXCS            96-99     (4)
STXDB            9A-9D     (4)
VPCBITS          9E-A1     (4)
EDPSR            A2-A5     (4)
EBCS             A6-A9     (4)
CSLINT           AA-AD     (4)
EDMC             AE-B1     (4)
IBESR            B2-B5     (4)
EMD              B6-B9     (4)
IVASAV           BA-BD     (4)
VIBASAV          BE-C1     (4)
ESASAV           C2-C5     (4)
ISASAV           C6-C9     (4)
CPC              CA-CD     (4)
MSTAT1           CE-D1     (4)
MSTAT2           D2-D5     (4)
MEDR             D6-D9     (4)
MEAR             DA-DD     (4)
CSHCTL           DE-E1     (4)
CSES             E2-E5     (4)
IBGPR            E6-E9     (4)
PSL              EA-ED     (4)
SPADR            EE-F1     (4)
STATE            F2-F5     (4)
EVMQSAV          F6-F9     (4)

Entries are described as Internal Processor Registers, Internal Registers, or Miscellaneous Registers.

Total size = 250 bytes

Table 9  SNAPSHOT Record Format for PAMM

Field Name            Byte      Number    Description
                      Position  of Bytes

Record Header         00-09     (10)      See Standard SNAPSHOT Record Header (Table 2)
PAMM location 0       0A        (1)       Contents of PAMM Array
PAMM location 1       0B        (1)       Contents of PAMM Array
PAMM location 2       0C        (1)       Contents of PAMM Array
   .
   .
PAMM location 1022    408       (1)       Contents of PAMM Array
PAMM location 1023    409       (1)       Contents of PAMM Array

Total size = 1034 bytes

NOTE
Only the low-order byte of the PAMM data is included, as all other bits are irrelevant.

Table 10  SNAPSHOT Record Format for the Interrupt Stack

Field Name       Byte      Number    Description
                 Position  of Bytes

Record Header    00-09     (10)      See standard SNAPSHOT record header (Table 2)
000 (ISP)        0A-0D     (4)       1st longword on top of stack
004 (ISP)        0E-11     (4)       2nd longword on top of stack
   .
   .
248 (ISP)        102-105   (4)       63rd longword on top of stack
252 (ISP)        106-109   (4)       64th longword on top of stack
BYTCNT           10A-10D   (4)       BYTCNT (58) of MCHK stack frame, if any
EHMSTS           10E-111   (4)       2nd longword of MCHK stack frame
   .
   .
PC               166-169   (4)       24th longword of MCHK stack frame
PSL              16A-16D   (4)       25th longword of MCHK stack frame
348 (ISP)        16E-171   (4)       64th longword on bottom of stack
352 (ISP)        172-175   (4)       63rd longword on bottom of stack
   .
   .
604 (ISP)        266-269   (4)       2nd longword on bottom of stack
608 (ISP)        26A-26D   (4)       1st longword on bottom of stack

Total size = 622 bytes

NOTE
The snapshot for release 3.0 of the Console RL02 pack now contains 622 bytes. Following the last 64 longwords placed on the interrupt stack (top of stack) are 25 locations for the stack frame, then the first 64 longwords pushed on the interrupt stack.

First, 64 longwords are popped off the top of the interrupt stack and placed in the ISP record. Then, a search begins for 00000058, the stack frame byte count. If it is found, the 25 longword stack frame is placed in the ISP record. If 00000058 is not found, these 25 longwords are left at FFFFFFFF. Finally, the 64 longwords on the bottom of the interrupt stack are placed in the ISP record.

Keep in mind that the search for 00000058 does not take place until the top of the stack is already in the ISP record. Therefore, the middle 25 longwords cannot be duplicated in the first 64 longwords in the ISP record, but they may be duplicated in the last 64 longwords.

If there are 64 or fewer longwords of data on the interrupt stack, the 25 longwords for the machine check and the first 64 longwords pushed on the stack will be FFFFFFFF. Only the top of the stack is valid.
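The three regions of the ISP record described in Table 10 and its note can be separated mechanically. The following Python lines are a minimal sketch, not part of the console software or VSR: the function name `split_isp_record` is hypothetical, and the sketch assumes the 622-byte record has already been extracted from the snap file and that longwords are stored low byte first (VAX byte order).

```python
import struct

HEADER_LEN = 10           # standard SNAPSHOT record header (Table 2)
TOP_LONGWORDS = 64        # longwords popped off the top of the interrupt stack
FRAME_LONGWORDS = 25      # machine check stack frame, if one was found
BOTTOM_LONGWORDS = 64     # longwords taken from the bottom of the interrupt stack
TOTAL_LONGWORDS = TOP_LONGWORDS + FRAME_LONGWORDS + BOTTOM_LONGWORDS
RECORD_LEN = HEADER_LEN + 4 * TOTAL_LONGWORDS   # 622 bytes per Table 10
MCHK_BYTE_COUNT = 0x00000058                    # BYTCNT that marks a stack frame


def split_isp_record(record: bytes):
    """Split an ISP record into (top, frame, bottom) lists of longwords.

    frame is None when the 25-longword region does not start with the
    00000058 byte count (for example, when it was left at FFFFFFFF).
    """
    if len(record) != RECORD_LEN:
        raise ValueError(f"expected {RECORD_LEN} bytes, got {len(record)}")
    # "<" selects little-endian, standard-size (4-byte) unsigned longwords.
    longwords = struct.unpack_from(f"<{TOTAL_LONGWORDS}L", record, HEADER_LEN)
    top = list(longwords[:TOP_LONGWORDS])
    frame = list(longwords[TOP_LONGWORDS:TOP_LONGWORDS + FRAME_LONGWORDS])
    # Stored from the 64th longword on the bottom down to the 1st (Table 10).
    bottom = list(longwords[TOP_LONGWORDS + FRAME_LONGWORDS:])
    if frame[0] != MCHK_BYTE_COUNT:
        frame = None
    return top, frame, bottom
```

Because the search for 00000058 starts only after the top 64 longwords are stored, a frame reported here may also appear inside `bottom`, but never inside `top`.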
Only the top of the stack is D-15 VAX8600/8650 KEEP ALIVE Table 11 Byte Name Record Position Header 00-09 | Configuration Control/Status Error (KAF) FLOW AND SNAP SNAPSHOT Record * Field FAIL ~ Summary Format FILE for ABus Adapters Number of Bytes (10) 0OA-0D (4) 0E-11 (4) See Stadard Record (4) DMAA Cmd/Addr/1ID Cmd/Addr/ID 1A-21 (8): DMAB Cmd/Addr/1D 22-29 (8) 2A-31 32-39 (8) DMAC 3A-7A (64) SBI 7B-7E (4) 7F-82 SBI (4) Error. SBI Time-out SBI Silo SBI Error SBI Time-out SBI Fault SBI Silo | 2x08 0020 2x08 0028 2x08 0030 Silo | 2x08 Address Status 0034 2x08 0038 (4) 93-96 TR1 (4) NEXUS 97-9A 9B-9E Error (4) Status Status NEXUS Summary Error Fault Status NEXUS Error Summary Error Status NEXUS Summary Error Status '~ Comparator CSR n000 Status 2x0n n034 2x0n n03C 2x0n n000 (4) 'TR2 A7-AA Error (4) CSR Summary AB-AE (4) Error Fault | AF-B2 —B3=RB6 (4) TR3 -{4) NEXUS (4) Fault BF-C2 (4) C3-Cé6 TR4 (4) Error C7-CA (4) Error (4) Fault CF-D2 (4) D3-D6 TR5 (4) Error (4) DB-DE (4) Fault (4) TR6 (4) Error E7-EA (4) EB-EE (4) D-16 | | | | CSR Summary Status | CSR Summary Error | - Status NEXUS CSR Summary Error Fault ~ Status NEXUS D7-DA - - Status NEXUS CB=CE DF-E2 - NEXUS CSR Error Summary Error (4) 0044 n0O08 (4) BB-BE 0040 2x08 2x0n 9F-A2 B7-BA 003C 2x0n Error Fault 2x08 2x08 Summary A3-A6 E3-E6 - 2x08 0010 - 2x08 0018 (4x16.) 8F-92 Error Fault ID ID NEXUS Fault TR6 and and Maintenance. Errex Summary Error | Error Cmd/Addr Cmd/Addr SBI NEXUS Fault DMAB (4) Fault TR5 ID ID 8B-8E Error 000C and Maintanence ~ 0008 and Fault Summary 2x08 2x08 Cmd/Addr Silo NEXUS 0004 Cmd/Addr SBI (4) 0000 2x08 DMAI SBI Status 2x08 DMAA (4) Error TR4 . 
Control (4) Fault TR3 Summary 83-86 Comparator Error Summary Error TR2 Error Diagnostic 87-8A SBI TR1 Address Status (8) Header 2) Control/Status (4) Cmd/Addr/ID SNAPSHOT Configuration 12-15 DMAC | (Table 16-19 DMAI !513 Description Control Diag DESCRIPTION Status | | 2x0n n0O08 2x0n no034 2x0n n03C - 2X0n no00oO 2x0n n008 2x0n n034 2x0n n03C 2x0n n000 2x0n n008 2x0n n034 2x0n n03C 2x0n n00O0 2x0n no0O0S8 2x0n nO034 2x0n n03C 2x0n n000 2x0n n008 2x0n n034 2x0n n03C VAX8600/8650 KEEP ALIVE FAIL Table Number Position of Bytes TR7 NEXUS CSR Fault Status FB~-FE (4) Fault Status TR8 NEXUS Error Summary Error Fault Status FF-102 103-106 107-10A 10B-10E (4) (4) (4) (4) TR8 NEXUS CSR Error Summary Error Fault Status F3-F6 F7-FA (4) (4) Error Summary Error NEXUS 10F-112 (4) TR9 Error Summary Error Fault Status 113-116 117-11A 11B-11E (4) (4) (4) Error Summary Error Fault Status TR10 TR10 CSR 11F-122 (4) 123-126 127-12A 12B-12E (4) (4) (4) NEXUS CSR Error Summary Error Fault Status TR11 NEXUS Error Summary Error Fault Status 12F-132 133-136 137-13A 13B-13E (4) (4) (4) (4) TR11 NEXUS CSR Error Summary Error | Fault Status TR12 NEXUS NEXUS Summary | Status Error Error Fault 13F-142 (4) TR12 143-146 147-14A 14B-14E (4) (4) (4) Error Error Fault TR13 NEXUS Error Summary 14F-152 153-156 (4) (4) TR13 NEXUS CSR Error Summary Error 157-15A (4) Error 15B-15E (4) Fault Status Error Error Fault ~Fault TR14 ‘Error NEXUS Summary Status Status NEXUS Summary Error Fault Status NEXUS CSR Summary Status 15F-162 (4) TR14 NEXUS CSR 163-166 (4) Error Summary 167-16A (4) Error 16B-16E (4) Fault Status TR15 NEXUS 16F-172 (4) TR15 NEXUS CSR Error Summary 173-176 (4) Error Summary Status 177-17A 17B-17E (4) (4) Error Fault Status Error Fault Total size = (Cont) Description (4) TR9 | Byte EF-F2 Error Summary Error - FLOW AND SNAP FILE DESCRIPTION 11 SNAPSHOT Record Format for ABus Adapters Field Name TR7 NEXUS (KAF) 382 bytes NOTE X = the ABus Adapter Number nn = the Transfer Request (SBIAO0 or 
SBIA1) Level for the corresponding NEXUS

Table 12  SNAPSHOT Record Format for Clock Alignment and uPC Trace

Field Name                Byte      Number    Description
                          Position  of Bytes

Record Header             00-09     (10)      See standard SNAPSHOT record header (Table 2)
CLK.ST                    0A-0B     (2)       CPU Clock State (Word). See CLK.ST, Table 1
CLK Alignment, Phase 0    0C-0D     (2)       Clock alignment word, phase 0
CLK Alignment, Phase 1    0E-0F     (2)       Clock alignment word, phase 1
CLK Alignment, Phase 2    10-11     (2)       Clock alignment word, phase 2
CLK Alignment, Phase 3    12-13     (2)       Clock alignment word, phase 3
uPC Trace, Step 1         14-1D     (10)      uPC Trace, Step 1
uPC Trace, Step 2         1E-27     (10)      uPC Trace, Step 2
uPC Trace, Step 3         28-31     (10)      uPC Trace, Step 3
uPC Trace, Step 4         32-3B     (10)      uPC Trace, Step 4
uPC Trace, Step 5         3C-45     (10)      uPC Trace, Step 5
uPC Trace, Step 6         46-4F     (10)      uPC Trace, Step 6
uPC Trace, Step 7         50-59     (10)      uPC Trace, Step 7
uPC Trace, Step 8         5A-63     (10)      uPC Trace, Step 8
uPC Trace, Step 9         64-6D     (10)      uPC Trace, Step 9
uPC Trace, Step 10        6E-77     (10)      uPC Trace, Step 10
uPC Trace, Step 11        78-81     (10)      uPC Trace, Step 11
uPC Trace, Step 12        82-8B     (10)      uPC Trace, Step 12
uPC Trace, Step 13        8C-95     (10)      uPC Trace, Step 13
uPC Trace, Step 14        96-9F     (10)      uPC Trace, Step 14
uPC Trace, Step 15        A0-A9     (10)      uPC Trace, Step 15
uPC Trace, Step 16        AA-B3     (10)      uPC Trace, Step 16
uPC Trace, Step 17        B4-BD     (10)      uPC Trace, Step 17
uPC Trace, Step 18        BE-C7     (10)      uPC Trace, Step 18
uPC Trace, Step 19        C8-D1     (10)      uPC Trace, Step 19
uPC Trace, Step 20        D2-DB     (10)      uPC Trace, Step 20
uPC Trace, Step 21        DC-E5     (10)      uPC Trace, Step 21
uPC Trace, Step 22        E6-EF     (10)      uPC Trace, Step 22
uPC Trace, Step 23        F0-F9     (10)      uPC Trace, Step 23
uPC Trace, Step 24        FA-103    (10)      uPC Trace, Step 24
uPC Trace, Step 25        104-10D   (10)      uPC Trace, Step 25

Total size = 270 bytes

NOTE

1. The CPU clock status word is the same as CLK.ST in the master header.
The clock signals in Table 13 are checked during each of the four «clock phases. If the signal is incorrect, a bit is set in the clock alignment word for that phase. The bit that will be set can be determined from Table 13. D-18 i CRIPTION VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE DES 3. The steps, micro uPCs, will read the cons ole then generate 24 reading the uPC after each micro step. FBA, MBox, , The uPCs are read in order; EBox, IBox be 5 words (10 will e ther Therefore, FBM. then byte s) for block in The 1480. For instance, part of line 030 OOSF 9 (Example 2) reads: 0006 0004 0018 FBA, each uPC. uPCs, from left to right are: MBox, IBox, and EBox. Clock State Table Table 13 Expected State oO COQO et =t OO = k= CLK6 PHASE TOA H CLK6 PHASE TOB H PHASE TOC DLY H CLK6 PHASE TOD H -0 it O00O0 el s OO0 O0 o O ot et oo CPU CLK STOP H COO0O0 e e H et B H CLK3 TOA CLK6 PHASE TOB H CLK3 TOC C CLK®6 PHASE TOD H D-19 OO0 H TOD H P TOA H H ot et ot et PHASE TOB A TOC B PHASE COOO CLK®6 CLK3 CLK3 CLK®6 COHO ENWBUS CLK3 TOC H EVPAR CLK3 TOD H ©000O0 C CLK3 TOA H CLK®6 PHASE TOB H 0000 WBUS - TOD H Bit Set A O000 TOA H TOB H TOC H O H i H H S -0 H T Y DLY B H DLY PW B H SN A et DLY B H DLY PW B H W oo mmwmm mmmwmmmmmmmmmwmmwmmm W b et et et Signal Name Phase 301 ot et in FMB, VAX8600/8650 KEEP’ALIVE Table FAIL (KAF) FLOW AND SNAP 13 Clock State Table FILE (Cont) EBEG CLK6 PHASE TOA H EBEG CLK6 0100 H H 0100 0100 H 0100 9 10 EBEG CLK6 PHASE TOB PHASE TOC EBEG CLK6 PHASE TOD 9 9 9 -MCCK CLK3 Tl1A A H -MCCK CLK3 1101 TIB B MCCK H CLK3 T1C A H 1101 -MCCK CLK3 TID C 0010 H 10 1101 10 11 MAPN LD CLK3 T1A C H -MAPN LD CLK3 TIB 0010 -MAPN A H LD CLK3 TI1C B -MAPN H 1101 CLK6 PHASE T1D H 10 11 1101 11 1101 11 EBDF CLK6 PHASE TOA EBDF H CLK6 PHASE TOB 0100 H 12 0100 12 EBDF CLK6 PHASE EBDF TOC CLK6 H PHASE TOD H EBCG CLK6 PHASE EBCG TOA H CLK6 PHASE TOB 0100 H 13 0100 13 0100 0100 12 12 EBCG CLK6é PHASE TOC EBCG H CLK6 PHASE TOD 0100 H 13 0100 13 1011 14 -CSBT CLK6 -CSBT 
PHASE CLK6 TOA H PHASE -CSBT TOB CLK6 H PHASE -CSBT TOD CLK6é H PHASE TOC H 1011 14 1011 14 1011 14 1011 15 -CSAT CLK6 PHASE -CSAT TOA H CLK6 PHASE -CSAT TOB H CLK6 PHASE -CSAT TOC 1011 CLK6 H PHASE TOD H 1011 1011 D-20 15 15 15 DESCRIPTION VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE DESCRIPTION EXample 1 - This is what a typical ERRSNAP.LOG looks like after it has been processed by VSR. file illustrated in Example 2 -- Master Header CSM status - §5 KAF reason - 1B = -- Non-EBox = CPU In contrast, (SHOW SNAP). double error ERROR HALT Time stamp 29-JAN-1987 11:48:20.59 Console PROM revision - 37 EMM PROM revision Minimum HW revision Ebox Ucode revision Ibox Ucode revision Mbox Ucode revision FboxA Ucode revision FboxM Ucode revision ACCESS Ucode revision CPR Ucode revision MCF Ucode revision CONTEXT Ucode revision Ucode ReLease revision Ebox Ucode verification MCF Ucode verification CONTEXT Ucode verification FBOXA Ucode verification FBOXM Ucode verification FDRAM Ucode verification IBOX Ucode verification IDRAM Ucode verification Mbox Ucode verification CPR Ucode verification ACCESS Ucode verification PAMM verification MCC shift channel (below) -- 84 02 05 CSL in bytes 20 e MCSRO ‘MCSR1 MCSR2 MCSR3 ERSR LRSR RRSR spare QCSRO QCSR1 QCSR2 QCSR3 SIDO SID1 SID2 SID3 RL RL CTLR STATUS DRIVE STATU AA 00 52 A8 80 61 10 this is the same VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE DESCRIPTION -- EMM data -- POWREG 3F MARGEN 00 MARHILO 00 MODOK 8FFF MISREG 40 SWREG 02 PROM revision 42 REGULATOR A VOLTAGE 5.051 REGULATOR B VOLTAGE 5.078 REGULATOR C VOLTAGE 5.051 REGULATOR D VOLTAGE REGULATOR_E VOLTAGE REGULATOR_F REGULATOR -2.015 -2.015 VOLTAGE ~-5.244 H VOLTAGE -5.244 GND CURRENT VALUE 70.000 REGULATOR L + VOLTAGE 12.192 REGULATOR L - VOLTAGE -11.954 REGULATOR K - VOLTAGE REGULATOR_K + VOLTAGE 15.316 - Tl TEMPERATURE VOLTAGE T2 TEMPERATURE T3 T4 TEMPERATURE VOLTAGE TEMPERATURE VOLTAGE -- IPRs in VOLTAGE Longwords =15.,409 24.991 26.323 25.324 28.321 -- 
KSP 80000C40 ESP 00000000 SSP 00000000 USP 00000000 ISP 8056956C POBR POLR P1BR 80000000 00000000 7F802000 P1LR 00200000 SBR 027DC000 SLR 00009000 PCBB | 0270CFAQ % SCBB 027D9400 IPL 0000001F ASTLVL SISR ICCS 00000004 00000000 800000C1 ICR FFFFE3BE TODR 1EAC436A RXCS 000E0040 RXDB 0001008D 00000FCO TXCS ACCS 00008001 MAPEN 00000001 PME 00000000 0483F097 SID PAMACC 26000000 PAMLOC 26000000 CSWP 00000003 D-22 VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE DESCRIPTION -- IPRs in Longwords MDECC MENA MDCTL MCCTL MERG EHSR STXCS STXDB VPCBITS (Cont) 00060000 0OFBOFF9 00000000 00000000 00000100 EDPSR EBCS CSLINT EDMC IBESR EMD IVASAV VIBASAV ESASAV ISASAV CPC MSTAT1 MSTAT2 MEDR MEAR CSHCTL 00180060 010000C4 000000DD 00000004 00000000 100000004 040C0116 041F4404 00000000 FFFFFFO1 80569598 80328B08 80328AF5 80328B00 80328B00 84000004 00000202 FFFFFFO1 0000007C 00000003 IBGPR PSL SPADR STATE EVMQSAV 0000000C 041F0004 00000006 00000004 8056956C CSES -- 00000000 00000000 00000000 80569594 - FFFFFFFF E box General RO: R4: R8: AP: -- R1l: R5: R9: FP: Scratchpad -- Purpose Registers 00000000 00000000 00000000 00000000 R2: R6é: R10: SP: 00000000 00000000 00000000 8056956C Temporaries: 10: 11: 12: 13: 14: 15: 16: 17: 00200000 00000000 0000400C 00000001 00000000 OOFCOOF9 00000200 00000058 18: 60001800 19: 00000000 1A: 00202000 | SFBYCT EHMSTS EVMQSAV EBCS D-23 R3: R7: R1l1l: PC: 00000000 00000000 00000000 00000000 VAX8600/8650 KEEP ALIVE FAIL 1B: 00000000 EDPSR 1C: 04180000 CSLINT 1D: 1E: 02006403 00000000 IBESR EBXWD1 1F: 00000000 EBXWD2 20: 21: 80569598 80328B08 IVASAV VIBASAV 22: 80328AF5 ESASAV 23: 24: 25: 80328B00 80328B00 84003004 CPC MSTATI 26: 00000F00 MSTAT2 27: 00060000 MDECC 28: 00000100 MERG 29: 2A: 2B: 00000003 0000007C 0000001F CSHCTL MEAR 2C: 2D: 2E: FFFFFFFF FFFFFFFF 80328AF5 FBXERR CSES PC 2F: 041F0004 PSL 30: 80329544 31: 32: 33: 34: 00000008 00000000 35: 36: 37: FLOW AND SNAP FILE DESCRIPTION ISASAV MEDR 00000000 - 00000000 00000075 00000116 00000000 
00000000 00000000 38: 39: 3A: 000000OF 3B: 00000010 3C: FFFFFFFF 3D: 00000000 AAAAAAAA 3E: 3F: (KAF) 66666666 40: 00000000 FF: 00000000 - PAM =—-=- 00 00000000 00000000 00000000 20 03030303 00000000 04040404 02020202 1F1F1F1F 02020202 02020202 40 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1FlF1F1F 1F1F1F1F 60 80 1F1F1F1lF 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F AQ0 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1lF l1F1F1F1F 1F1F1F1F 1F1F1F1F 100 120 140 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1lF1F1F1F 1F1F1F1F 1F1F1F1F CO E0 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1FlF1F]1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1lF1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 160 180 1A0 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1lF 1F1F1F1F 1F1F1F1F 1F1F1F1lF 1F1F1F1F 1F1F1F1lF 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1FlF 1F1F1F1F l1F1F1F1F 1FlF1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F D-24 1F1F1F1F 1F1F1F1F 1F1lF1lF1F 02020202 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1Fl1F1F1lF 1F1F1F1F l1F1F1F1F 1F1F1F1lF VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE DESCRIPTION -- PAM (Cont) -- 1F 1F1F1F1F 1F1F1F1lF 1CO0 1F1F1F1F 1F1F1FlF 1F1F1F1F 1FlF1FlF 1F1F1F1F 1F1F1F 1F 1F1F1F1F 1lFlF1lFlF 1F1F1F 1F 1F1F1F 1F 1FlFlF 1E0 1F1F1F1F 1F1F1F1F 1F1F1F1F 18 18181818 18181818 181818 18 181818 18 181818 18 200 18181818 18181818 181818 220 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1F1F1F 1F1lF1F1lF (Rest of PAM is all 1F) -- Interrupt Stack -- Longword Longword Longword Longword Longword 0 1 2 3 4 00000000 00000000 00000000 00000000 00000000 Longword - Longword Longword Longword 95 96 97 98 FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF -- SBI0 Longwords -- Longword Longword Longword Longword Longword Longword Longword Longword Longword Longword Longword Longword Longword 
Longword Longword Longword Longword Longword Longword Longword Longword Longword Longword Longword 0 1 2 3 4 5 6 7 8 9 A B C D E F 10 11 12 13 14 15 16 17 02800010 EE000000 1C000008 00000000 409C9BFC 0000000E 28007246 00000010 BO9E27EA 0000000E BOSE7864 0000000E 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 D-25 VAX8600/8650 KEEP ALIVE -—- SBI0 Longwords - Longword 18 Longword 19 1A 1B 1C 1D 1E 1F Longword Longword Longword Longword Longword Longword - Longword Longword FAIL (Cont) 00000000 00000002 1C000000 00000000 00000000 0802000E 040F0000 00000000 00000000 1C800000 Longword SNAP FILE DESCRIPTION TR1 00001000 00000000 Longword Longword FFFFFFFF Longword TR2 1C800000 Longword Longword 00001000 00000000 00000028 1C000000" 00001000 00000000 Longword Longword Longword Longword " 0w TR3 uwu Longword 59 Longword FFFFFFFF Longword 5A 5B Longword 5C 1C800000 00001000 00000000 -— FLOW AND -- FFFFFFFF Longword (KAF) SBIl1 = W -O Longword Longwords Longword Longword Longword Longword LU TR15 -- FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF "W Longword 59 Longword FFFFFFFF Longword 5A SB FFFFFFFF Longword FFFFFFFF 5C FFFFFFFF —= Dumping unknown record Longword 00004810 00000000 14800000 Longword Longword Longword 00180000 00060004 005F1481 Longword Longword nu ¥ o TR15 (3C) in longwords. This is record Longword 3F 40 reality clock the snapshot alignment and uPC trace. See Table 12 and 13. VSR doesn’'t know about it at the present time. i oW Longword Longword in for 08E20006 0018005F 00060004 D-26 VAX8600/8650 KEEP ALIVE FAIL Example 2 - This example (KAF) FLOW AND SNAP FILE DESCRIPTION illustrates what a typical snap file will 1look like when displayed on the console terminal via the "SHOW SNAP" command. The data is first translated into blocks (0 thru 9). 
The data is shown in hexadecimal, output by virtual block. The gibberish to the right is an ASCII translation of the data, to be used as a reference point if manual analysis is necessary. Every once in a while, mixed in with the gibberish, you will see a string of six ASCII characters that make sense. For example, on line number 090 of block 0, you will see "FBMC01". This is the CDF file or ASCII record name described in Table 2 (byte position 04 thru 09). Since this is the last piece of information in the data header record, it can be used to identify the beginning of the data that corresponds to that record. In this case, it is the data for the FBM SDB visibility channel.

Again using line 090 as an example, you will see the following string of characters in the data field:

3130 434D 4246

When read right to left (standard PDP-11 convention), these characters are the hex equivalent of the ASCII string:

FBMC01

The bytes to the right of this field correspond to the record length (004A), which equals 74 bytes decimal; the status code (01), which indicates that the record is valid; and the header ID (22), which indicates the FBM SDB visibility channel. See Table 2.

This example was also designed to serve as an aid should manual snap file analysis become necessary. The individual records in the virtual blocks have been boxed in with a solid line. The headers for each record have also been boxed in. Thus, since the "SHOW SNAP" format is standard, if you need to analyze a snap file manually, you can use this example as a map to locate the header and beginning of each record.
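The right-to-left reading described above can be tried out with a few lines of Python. This is only an illustrative sketch: `dump_words_to_bytes` is a hypothetical helper, and it assumes the 16-bit words are typed in exactly as they appear on a "SHOW SNAP" line, left to right, with the rightmost word at the lowest address.

```python
def dump_words_to_bytes(words_as_printed: str) -> bytes:
    """Convert 16-bit hex words, listed left to right as printed on a
    dump line, into bytes in ascending address order."""
    out = bytearray()
    # The rightmost word on the line sits at the lowest address,
    # so walk the words from right to left.
    for word in reversed(words_as_printed.split()):
        value = int(word, 16)
        out.append(value & 0xFF)          # low byte is at the lower address
        out.append((value >> 8) & 0xFF)   # high byte follows it
    return bytes(out)


# The name field from line 090 plus the record length word to its right.
data = dump_words_to_bytes("3130 434D 4246 004A")
record_length = int.from_bytes(data[0:2], "little")  # header bytes 02-03
record_name = data[2:8].decode("ascii")              # header bytes 04-09
print(record_length, record_name)                    # prints: 74 FBMC01
```

The same helper can be applied to the word that follows the record length on the dump line; its low byte is the header ID and its high byte is the status code (Table 2, bytes 00 and 01).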
1In addition, the following EBox scratch pad register 1locations have been boxed in with a dotted line: ESC Location NG O A A A 0 Virtual Block # mmmmmmm Contents Reade R B BB R R R 17-2F 0CO 0D8 ODA 3 4 5 5 Possible Partial CSM.STATUS EHM.SP EHSR O0EO OE1l 0OE2 5 5 5 KSP ESP SSP OE3 OE4 5 5 USP ISP R D-27 R R R R R Y T R Machine Check Stack g ————— Frama VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE DESCRIPTION NOTE - If the Error Handling Microcode (EHM) was in the proceés Check Stack Frame when the KAF of building a Machine stack frame in occurred then you might find a partial If the EHM built the stack frame, ESC Locations 17-2F. and called the VMS stack pushed it on the interrupt might find a partial Machine you Check Handler then 17-2F and a full Machine Check Stack Frame in in ESC stack frame on the Interrupt Stack (Virtual Block 6). SNAP1.DAT block 0. 41CB 1B05[132A 5845 01D0 0000 0205 024A 030E 0349 0000 0000 0902 20AA 0052 A880 2E42 020A 0100 6110 4246|004A|Q1PR1]| FFFF FFFF 09D6é OBF3 OFF3 OFOC OE9D OE33 039F 038F 030F 0000 0000 0000 0516 OF03 030F 0000 OE96 OE0O5 0707 0000 0228[3130]434D 0B15 0408 0168 0983 0000 052B 4D56 0004 0100 0000 4810 004A 4225 0100 0000 0709 0120 1EAC 0100 0000 8584 OD9E (3130 4241 OE4F OFB6 0D87 030A 0D4D OD1D 0000 06Cl1 03C7 42461004A 1012210000 0000 0818 0828 0618 0028 0418 0941 0B55 0AS51 0147 0AC7 04EA 018B 00EC 048A 0429 0010 0943 042E 01j23} 0000 0000 0000 0000 0000 0000 0000 8000 0010 2880 29203430 4444 434D[002A 0000 0000 0000 0000 0004 C204 C1CA 4C8C 4644 4249]/002A[ 012410000 0000 C 0000f 0 000 w@Al]3530 E166 D171 4321 0003 040D 2082 6240 8720 0214 A99C D4D6 CD60 CFAOQ 4545 6C74 CFE6]3230 4650 4449]002AIQ1 2 030D 8EF1 AFC3 FEF1 EF67 EEEF EC2F E18D 002A| 01R6 | F7F7 E7FD FBF1 1F5A E CD2B CEBE C089 34E8 B4F2 133C| 3230 4349 AQ00 24F0 8080 4223 8E32 99BO 1B1A 3130 4642 4349]002A] 01R7]F060 0 20A0 1791 4F45 7292 15A2 B8CC 1D37 0000 0000 0000 0000 88CC 4343 6363|3130 0000 0000 0000 0000 4445/ 0022101290000 0000 
0000 CODO CO070 C4A0 2028 0000 0000 0000 1 0000 Example 2 0000 SHOW SNAP 4049 210D[3230 0000 0000 Command - Sample D-28 [uaJ'VMB.Ex*nunwA] [QUQBOUDQItiOJOi.] lotbon-‘tboutootl] [onaa.u'autRouu'.] lovto-HoovoongEB] [&BOlnnonon«oonuu} [‘QUQOQG#3.‘G‘QO‘ ] [ QQMQ‘.Q..O...O‘U ] [ ] & & 0 0 0 85 & " 8 e P H SN [0‘.‘ +J.FBMCO1 (. ] [ ('Qfl(‘.‘l‘“’.] [ ttG Q UuA Caath ] [)ucynwautaon+cnu ] [O‘O..‘O..“U.'#fi ] [*wMCDDO4u)¢(00Q0 ] [oLunal‘onootuohu ] [eeeeeceee$.*. IBDF] {050&00u0u0¢cq g l [EE0¢ .wuaawoc«@b] [..%.*.IDPF02..t1] [auuu/cqfigtwecuatl [+0b0205uo.0«&§*o] [ICAH02<OOU‘4QQOO] [awoqnan#Boalsnfl] [+.0.7."'.*.ICBF01] [0¢.;7annqur800w] [.onbnmwutuctaavol [(.*.CLKEQlccCC..] t«u.u.anuuonoo.o] [Ql.i."“.) *QED} [PCOZUUI@('.CP..O] [@.“Q.OQCOCQ.QCC] Printout (Virtual Block 000 010 020 Master Header 030 040 050 060 - 070 080 090 OAO 0BO 0co ODO0 FBA FBM OEO OFO0 100 110 120 130 140 150 MCD 160 170 180 190 1A0 1BO 1CO -1D0 1E0 1F0 ICA 0) IBD IDP ICB CLK EDP VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE DESCRIPTION SNAP1l.DAT block 1. 4200|3230 2245 4245]002a]012A{0000 0000 0000 0000 0400 0000 0001 4404 09CO A012 01@?!0000 0000 0000 0000 0000 0000 0000 14921 4000 4111 BAAO [....*.*.EBEBOQ.B] 1E05 1E26[3130 4B43 434D|002A 4A00 0083 024C 000C 0214 0202 C{ 0000 0000 0000 0000 4450 414D J002A]01j2 008A 1006 2622[3230 0 4003 0508 41C1 0002 0000 0002 C100 0000 0000 0000 | 43E1 70A4 F113 70C2 7571 0002 0100 FO68 0000 0000 0101 002A] 012E 10000 0D60] 3530 4343 1 4844 0A00 281, 0000 0000 0000 0000 0000 070E OEOO 13230 0804 0000 0206 0000 4242 001D 0000 0409 0000 53431002AIO;EE10000 0000 0912 090D 1010 0000 4241 5: AB01 0000 130 ; 0000 | 0085 0483 FFOO OO3F) 85BE 8449 2000 0000 CFAQ 0000 0000 0000 0040 1EAC Example 2 SHOW SNAP Command - 000 010 020 **’] ['0‘0'0‘0‘&0.‘ 030 Il u»e KOI&wG [*«MCC 040 [Q”‘H‘QLQOOQJQflUA] [00&0*!0"0 ‘MAPD] 050 060 [02"&cee.@..A...0] 070 [‘0»00”0‘0#"0"1 080 [..-«*.EBDDO02,.p.C] 090 {oqu«pafluuhuuawn} OAQ 0BO (..] [EBCCO5~.1.DH. 0CoO !3‘&&1&'*0‘&#&*0. 
[~¢a¢«¢/«*mCSB§Q21w 0DO OEO [Hnwnfi«nnouoouvu' OF0 [QUOQ‘U“'.O'O‘UU} 100 {OU*OCSMQZQouuaa] 110 [0‘\.“‘00““0’0.‘ 120 [.a»&anuuwmlw * .MT] 130 [MBOIuw.(D&uu&oua] 140 [6&»&»0.00&.«&»&0] 150 C,.]) [¢eee2ee.CSLrE 160 fiuuhl [woouuu(aoflbv 170 [¢e3.2.EMMrec?...] 180 [n@annmwmoIqutu] 190 {oawwonfitauuttwwal 1A0 [e.%.4...IPRrece@.] 1BO [uuwuc&omtuwtonlql 1CO [Vaotuu«nwbutunwo) 1D0 au] [n«wulquvmtapu 1E0 [lmwmuuutmuuaauwt 1F0 [uuwotijuugwnfiuu] Sample Printout (Virtual Block 1) D-29 EBE [“fl‘flDU.h‘flfiQ"‘ MCC EBD EBC CSB CSA CSL EMM IPR VAX8600/8650 SNAP1.DAT 0000 0000 0003 2600 KEEP ALIVE block FAIL mmmmmm mmmmm mm baaac 0000 0000 FFFF - 0000 0004 0008 0o0ocC OOFF 0003 2 SHOW SNAP OFCO 0001 FO097 0000 0000 0100 0004 0000 0000 0000 4404 040C 8B08 8056 0004 8032 0003 0006 5345 0000 041F 040A 0000 0000 0000 0000 0000 0000 8056 956C 0000 400C 00580000 0200 mmmmmm Example FLOW AND SNAP FILE DESCRIPTION 2. 0001 0000 8001 0000 0000 2600 001F 0483 0000 0000 0000 O0F8 OFF9 0006 00DD 0100 00C4 0018 0060 0000 0116 0000 0004 0000 0000 0000 9598 FFFF FFO1 0000 0000 041F 8B0O0 8032 8B00 8032 8AF5 8032 007C FFFF 0000 0202 8400 0004 0000 FFFF FFFF 0000 Q000 0004 0000 0135] 8056 0000 0000 0000[6365 7243 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 9594 0000 0000 0000 0000 0000 0000 PR (KAF) AR T 0020 2000 0000 0000 0000 8B00 0000 001F 0000 8032 0000 0000 8AF5 OFO00 007C T T e A u— 0000 0000 0000 0008 0116 000F AAAA AAAA 0000 0002 0000 0006 0000 000A 0000 O00E 0000 0080 | [‘fi'fi.“flfil“‘i.‘ ] [o'oottncw&otu&UQ] [00.'.0'0.**“‘.0 ] [Ub“i“..iifliifiil [0nu.oooau'htawu‘] [ouauno-tbuuoqnnl [VOGUZ0.0ZOOQZO.&} [20¢qvhctuattaa'«] [0“0..0"0‘0“.0] [00&.;00.001¢V05¢] [.ESCrecCecesssccee] [‘.ifl..l".’fl.fl.b][unuunto»qcunooau] lqntwdtuuuuVOG&Gu} [10V0u«.ootouuno¢] [ Q@'tudqfiacotiwuwl ) [ -n.ax.woaou 000'] [ G....‘QDU'..‘dOU] ' [ wawowuuuccVo&oZu] 000 010 020 030 040 050 060 070 080 090 0AQ 100 110 120 {00200‘200'2‘*000] 130 [0*‘*0..0"0..‘00} 140 [ Ql.bdti'ii&i"'] 150 [ 
*a2uooooD020wo¢b] 160 [ ctu.uooou;nUUvuyl 170 [ Oifiltibtidlldbitl 180 [ ¢¢0¢0¢¢00¢00¢!"] 190 [oqfluffffwtomouuv] 1A0 [u«nuanuutudvomaa] 1BO [tl‘il.&*'lhb..ll] 1CO [Qt&dtflvu.dtuuvual 1D0 [*fi.‘.‘ifl"""..] 1EOQ [00!00.?0.&&@."@]‘ 1F0 Command - Sample Printout ESC OBO 0CO 0DO OEO OFO0 MCK Stack Frame (Virtual Block 2) i, D-30 0 ALIVE FAIL (KAF) FLOW AND SNAP FILE DESCRIPTION KEEP VAX8600/865 SNAP1.DAT block 0000 0000 10000 0000 OFOF 0000 0000 3030 OOFF 3020 0000 0000 0000 0000 0000 0000 0800 0000 8000 43DE 2002 “FFFF 0000 0000 0000 0000 0056 001D 0400 O00AO0 0000 0000 0000 0000 OFOF 0000 0013 FFFF 0060 2008 3030 0300 FFFF 0000 FFOO FCIF 0420 0000 OO3F 0400 O7FF CO000 3C00 0001 0021 0000 O001F OOOF 0000 0000 0090 0000 FEOO 0000 1D01 FOF0 3. 0016 7FFF OOFO0 002D OO1lF 0000 0000 0000 0000 0000 2000 0000 0000 COOO 0000 0000 0000 0000 0000 0000 0000 0000 0000 7FF0 0040 002B 3EC1 8000 0000 0000 0000 0000 OOFF FFFF 0000 0000 0000 0000 0070 FFF0 0080 0000 OOEF FFFF 0035 0000 0000 0000 3F80 0000 0033 0047 OOEO 3E8F 4180 0000 0000 0000 0000 0000 3FC3 0000 0000 0000 0000 0000 0050 OFFF 0000 OO7F 2000 OlFF FFFO 0041 FFOO 0038 0100 0085 001C 2020 0030 C051 C200 407E 3FFF 0000 0000 0000 00C5 0000 0000 0000 0000 FFFF 0000 0000 027F 0000 FFO0l1 041F 0080 0000 0000 0000 FEO02 2600 0000'0000 0004 0007 0000 0000 0000 0000 6666 [.@cceeeeceecsVeee] [eccccecccccececses] [ecee@ececccceess] [eoeetece=ceceess] 004A| [JeeePecePececececs] OlFF| [cccecsescescssse) 0023 | [#eeecevceeces eae] [eecececsessas0000] OOAl| 3000| [«0cccceecececces] 0015 [unncunouom&omuwfl] FFEO| [eecceeee5ececeeces] 4020| [.@ceAceceeeeec?.e.] FFOO| [esceecccccececeeces] 004D| [MeeeBeeeoeeee<..] 
0380 [seeececse?eceecses] 0055| [Uceccoccocesececes] 0000| [ecececece3eeecees] 2D20| [e=¢ceeeeGeeseees] 000 010 020 030 040 050 060 070 080 090 OAD OBO 0CO ODO OEO OF0 100 110 0003 3003| [.0..Q.ff.>.....C] 130 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 FFFF 0000 2000 0000 0000 0000 03CO 0000 0400| 7F80| 0020| 0017| 0000 OOBO| [eceeOcccececcese) 8000 0000 0000 0000 0000 0000 0000| 4080 0062| 0000| 00O0| 0000| [ececeece?eAece>.e] {{o@n»u@uunw»uauw] [Decec?ecccecceses] [eccececcescececees] [ecececccccescesses] [ececceccccccecess] 0000 0000 0000 0000 0000 0201 0006| 0000| 0000| 0000 [¢ececeeecccsssss] [ececceeccecceseces] [eececescecsececees] [ecceeee&ecacesss] louutvnaonmuumuunl 8056 956C (0000,0005,0000 0000 0000 0000| 0000 0000 {0000 0000 0000 [ceceveeeesssleVe] 120 140 150 160 170 180 190 1A0 CSM.STS 1BO 1CO 1DO 1EO IFO ‘ Example 2 SHOW SNAP Command - Sample Printout (Virtual Block 3) D-31 VAX8600/8650 KEEP ALIVE SNAP1.DAT block FAIL (KAF) FLOW AND SNAP FILE DESCRIPTION 4. 7FFA 1840L§056 956CJ0000 0000 0000) 0000| 0000 0000 0000 8001 0000 0000 0018 0060 EHM.SP EHSR ' ' [QOQttowvgnVQuopt] 030 StaCk Yy [80000C40] 0483 F097 0000 0058 0000 0002 0000 8056 8056 10000 0000 4000 0005 95F8 95F0 0000 80000000 027D C000 0000 0000 0020 2000 0000 0003 0000 0000 0000 0000 0080 0000 0000 0000 0000 0000 0000 0058 0000 0000 027D 7F80 0000 0000 4000 0000 9400| 2000 0004 | OOO0O0| 1800| 0000] [eeleececceceseees] [au&atuunuu}au@at] [eceseceecescecees)l [eeececesseessaVe]l [eeeBXeeeeeeeaaVe] [eesseeesssseeeesl 724D 4150/ 040A(0136]10000 0000 0000 0000]| [eceeoeeeeb.s 0000 0000 0000 0000 0000 0202 0202 0202 0202 0202 0000 0202 0000|6365| 0202 0000| [€Ceceesccecseaeceses] [eeeseessssacssss] [X.n.voveo@uuvvar]l .PAMr] 020 040 . 
Pointers 050 | 060 070 080 090 | OAO0 OBO 0CO |1F1F 1F1F 1F1F 0404 0404 0303 0303 0202 [nuoontcoaubbtoual 0DO 1F1F 1F1F 1FlF 1F1F 1F1F 1F1lF 1F1lF 1lF1F| [eccesceccscecsseeses] OEO 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1lF 1F1F 1F1lF 1F1F 1F1F 1F1F 1F1F 1F1F 1FlF 1F1F 1F1F 1F1F 1F1lF 1F1F 1F1F 1F1F 1lF1lF 1FlF 1F1F 1F1F 1F1lF 1F1F 1F1F 1F1F 1lF1lF 1lF1F 1F1lF 1F1lF 1F1lF 1F1F 1F1F 1F1lF 1F1lF 1F1F 1F1lF 1F1lF 1F1lF 1F1F 1F1lF 1F1lF 1F1F 1F1lF 1F1F 1F1lF 1F1F 1F1F 1F1F 1F1lF 1F1F 1FlF| lFlF 1lF1F| 1F1F 1lFIF| 1F1F| 1F1F| OFO 1F1F 1F1F lFlF 1F1F 1F1F lFlF 1F1F 1F1F lFlF 1F1F 1F1F lFlF 1F1F 1F1F 1F1F 1F1F 1F1F lFlF 1F1F 1F1F 1F1F 1F1F| 1F1F| 1F1F [ntntutow.tct.uno] [ecococcccscccsccsl [Qtutyoontnthott‘} [ecccocooccosccscsl [fit.d».t.wvuwotub] [ececccsccoccacacs ] [ecceececccccsncos ] [ececccccccccconcss ] [ecceeeccccccceecsee] [¢eceeccececseess] {oouanwuuuwutunu.] 170 180 190 1F1F lFlF IFIF lFlF 1F1F lFlF lFlF lFlF lFlF lFlF 1F1F lFlF 1F1F,1F1F lFlF lFlF lFlF lFlF lFlF 1F1F lFlF lFlF lFlF lFlF lFIF lFlF lFlF lFlF lFlF 1F1F IFIF lFlF [iuaaun.oo.uuoa..] [tbaou'nuuvoatuiu] [awuwantutu&ttout] [uutu»mtanaou.&.n] IAO lDO 1F1F 1F1F 1F1F 1F1lF 1F1F 1F1lF 1F1F| [eccoccccceecesss] 1EO Example 2 1F1F SHOW SNAP Command - Sample D-32 - 000 010 0000 0000 '6270 CFAO 8056 9600 0000 Ooooréggpmpgpg. | [eeoseeseleV.@...] [mfilibfiitfibifiiiflfil Printout (Virtual Block Y PAMM . | . : 130 1C0 4) —~ ‘ T, ‘ T N wwwwwwwwwwwwwwwww KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE DESCRIPTION VAX8600/8650 SNAP1.DAT block 1F1F 1F1F 1F1F 1F1F {1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1818 1818 5. 
1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F l1F1F 1F1F 1F1F 1818 1818 1818 1818 1818 1818 1818 1818 1818 1818 1818 1818 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1818 1F1F 1F1F 1F1F 1F IF 1F1F 1F1F 1F1F|1818 1F1F 1F1F 1F1F 1F1F 1F1lF 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F |1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F Example 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F l1F1F [fiou«ut&»&umouuaatl [tuuuuuouoonuuwau] [ooaowuooowu»ouau] [fivnautwuuwonuudu] [fibi"flfi“‘dfib‘flfi] [«quaauwudnnvumnu] (uwuwuwauauuun.au] TR R R EE R N N [auuovtwuaunaouuo [0&40‘00004‘0"0*:0] [ombotmanouwaumuo] [tuommonauofiwaa«a] [Quuuvounoouatmum] [t'l‘.i&.‘flfl‘ib‘fi] [waauuuu*auoounou] [«outt«uatwiuanuv] [nuuauycmwuuanku] [qonooooa.auuunnn] [umuaaoaw;.untoou] [avnouw&uawanthtul [mfiuauoawmuntauwo] [nouatnaaioawbnual {Q'iflfl'itfl“‘b"h] 00.00"*.‘0‘.“’] 2 SHOW SNAP ‘Command - Sample Printout D-33 E [uw»uunouan«uuu»a] [uuacwoqw&qaumonu} [uouauanwwauunovb] [o»uaot»tufiwauwnu] [wouuuunuutou.pnol [wfln_wwflcunnouonuuun] [:@owr%. 
«y‘_;sovuouuwwu»w] [aauucwwn»nuuwauv} 000 010 020 030 040 050 060 070 080 090 0BO 0CoO 0DO OEO OF0 100 110 120 130 140 150 160 170 180 190 1A0 1BO 1CO 1D0 1E0 1FO (Virtual Block 5) SBIAO ADDRESS VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE SNAP1.DAT block 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 6. 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F l1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1iF 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F l1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F "1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F 1F1F l1F1F l1F1F 1F1F 7250 5349/ 026E[01]37] 1F1F 0000 0000 0000 0000 0000 0000 0000 0000 0000 1800 0000 0000 0003 0000 0000 0000 0020 2000 A0CC 0000 0000 8000 8B1F A0CS 8000 AO0CS8 8000 8B1F 0100 0006 0000 0000 0000 FFFF 0000 001F 0000 007C 8B1F 0418 0000 8000 AOCS8| 1F1F 1F1F 1F1F 0000 0000|6365 0000 0000 0000 8056 95F0 0000 8056 95F8 4000 0200 6003 0438 8000 AOD4 8000 8400 6004 8000 0000 0003 0000 FFFF FFFF FFFF 0000 0000 FFFF FFFF 8000 FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF Example 2 SHOW SNAP Command - Sample D-34 DESCRIPTION [000.&;&.0;0&0&00'] [aouuuv&ann*atuu.] [0‘00.00.00'.0000] [0&00"""..0'000] [100.5*..0.00'.0'] [a&na.au'n.iauuu'l [.000‘0006."0.00] [.ttuu.#fivtooct&.] [ocntvtnonuunu.ut] ['i*.bb.t.’iiii‘b] [00.‘00““0!.'.‘] Jee7.n.ISPrece...] [000‘00"&}0"“’] [‘C‘OOGG#OVCUOCOVC] [Q‘MU .x.OUUOQ@‘.V.] [ * B S 2 B8 B 2 9 9 W 8ee [ I ) "ol [ I ] [Qflibfi.fi"‘.'.i.fl] [ Teel [ UQl.Q&#QOQOO*'*.] 
[ 0!..00'.&*'0'.0] | ..00"‘0"0.'*"] [ 0.0.‘00!."’..'.] [ ] e B 9 ® LA B BN BN 2 R 8" "8 B O R BN R B N O N % @ IR [onptu-uo-nna'wni‘] [;.uu&o'uouwuonoo] [ouunooauouuot»un} [cntdno&uuoutcnli] [00&"’0"‘0!!.!.] [&00*!00‘0!0'0.!0] [0.;00&\'&00:000&*] [tuwoouun.vuo;tou] [a&.nounwwuuuncuo] Printout (Virtual 000 010 020 030 040 050 060 070 080 090 0AOQ 0BO 0Co 0DO OEO OFO0 100 110 120 130 140 150 160 170 180 190 1A0 1BO 1CO 1DO 1EO 1F0 Bloc6) k ISP VAX8600/8650 KEEP ALIVE SNAP1.DAT | FFFF FFFF block (KAF) FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0280 409C BO9E 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0010| 6365 FFFF 1C00 0000 2800 BOSE 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 1C00 0000 0000 040F 0000 1000 1C80 0000 1000 1C80 0000 0000 1C00 0000 1000 1C80 0000 0000 1C00 0008 "EEOO 7246 7864 0000 0000 0000 0000 0000 0000 0802 0000 000E 000E 0000 0000 0000 0002 000E FFFF FFFF 0028 FFFF 002A FFFF Example 2 FFFF FFFF 9BFC 27EA 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 FLOW AND SNAP FILE DESCRIPTION 7. 
FFFF FFFF FFFF | FFFF FFFF FFFF FFFF FAIL FFFF FFFF 0000 7230 0000 0010 O00E FFFF FFFF FFFF FFFF FFFF 4253 017E[01B8 SHOW SNAP Command 0000 0000 0000 0000 0000 0000 FFFF 0000 FFFF 0000 0000 0000 FFFF 0000 0000] - Sample D-35 [.fl“.i"'lfi"""] [oaafiobwuuuaunuan] [.000.10;'00."0'10] [-*oootiwnawomwoml [uu.uoatttutntuwnl {»00000&&»011»&*0] [Ofllfi.."bdl“fi'b} [&uoauunt.qwcmunwl [‘000‘0000*‘#.'0‘] Q'OUOQO&*UGOUOO‘] [00"0&‘0&#‘#*‘00] [cwouuutuwouuwfluao] [Iflfld'fl‘i"'!"‘l'“] [.fi*#l‘.’fl&.'&fifl‘} {000*00000.0&0000] [OO'Q0.0&.“.O"*] [t»ufiooethuwcttvl [0‘0*“0.0.0“'0%} [8e ¢ eSBOreCeceeeesol {*0‘00.'0000‘0@*0] [qura(a.uoo'wwwo] [&.dxCQDMOOOOIOOfi] [u*«wuoa.ufiaubuou] [tunflnooc&omuutun] [onbtu&uw*o»nttua] [ttmuuu:rww«anunw] Icooutuunaanwwuuul [iu&camauo»wwuan»] [owuuu«o.uowooo»o] [00&0**».«*00&&0«] [cnwmnaooawnuou*ml [atpwuu:.noututat] Printout (Virtual Block 000 010 020 030 040 050 060 070 080 090 0AO OBO 0CO 0DO OEO OFO0 100 110 120 130 140 150 - 160 170 180 190 1A0 1BO 1CO 1D0 1E0 1FO0 7) SBIAQ VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE block 8. 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 FFFF 0000 0000 0000 0038 0000 0000 FFFF 0000 FF]39] 0000 0000 0000 FFFF FFFF FFFF 0000 1000 0000 1000 0000 1000 0000 0000 0000 1000 FFFF [6365 7231 4253|017E FFFF FFFF FFFF FFFF FFFF SNAP1.DAT FFFF FFFF FFFF FFFF FFFF FFFF 1000 1000 1000 1000 1000 FFFF 0000 0000 0000 0000 0000 FFFF 1C80 0000 1C80 0000 1C80 0000 1C00 0000 1C80 0000 FFFF 1C80 1C80 1C80 1C80 1C80 FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0018 FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF |FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF Example 2 SHOW SNAP Command - FFFF FFFF DESCRIPTION [n-mtuoutawuuuauc] [-uobnou-umntoonal [uo.auouuunnutau&] vunaouuumuwuatou] [a.u.*oncaaoat.n&] [ctvonuotwouquumo] [0»*;&&*.00&0..0&] [.............«8.] 
[ontwaon:anev»ua»} [QO‘O&*UOQ‘.OC'QO] [+.SBlreCscecsccecs] [oauonoootv.uofiuq] [ow»uoou-nonuattu] laoct»uutuofitoauu] [uwuvu‘ubfinuuol-o] [u»utuanaquwouo&w} [uunpnfialnuo&baot] [qonnfiuottu.u»tuu} [nwunac«.nuore-uu] [Q.Q&tfi.it&.«t.&t] [»tbwnnuonuuuttme] [0~¢0~00¢0.0tn00¢} [0600&0*&00“'*,00] [.0«00»&00&#««.0»] [uuqotuutwtiotnui'] [uou&tv&».v»nu.wub] [0:0&::»&00&»00.0] {ouo-tao‘u»nuvnoo] [auuuuuur.ooaufiuu] [00001»&000.0;0.&*} [outu&»un.nttuato] [ootfi.&oocoanoowfw] 000 010 020 030 040 050 060 070 080 090 O0AOQ 0BO 0CO 0DO OEO OF0 100 110 120 130 140 15 160 170 180 190 1A0 1BO 1CO 1DO 1EOQ 1F0 Sample Printout (Virtual Block 8) -D-36 SBIAl VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE DESCRIPTION SNAP1.DAT block FFFF FFFF 9. FFFF FFFF FFFF 010E[{013C FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF 0000 0000 0000 0000 48106365 7243 5055 0018 005F 1481 0006 0004 0018 005F 1480 08E2 0006 0004 0018 00S5F 1482 0006 0004 0004 0018 O05F 1488 0006 0004 0018 O005F 005F 0D27 0006 0004 0018 005F OE67 0006 0006 0004 0018 005F 08D8 0006 0004 0018 0007 005F 1478 0006 0004 0018 00SF 04C2 147C 0006 0004 00D8 005F 147A 0006 0004 0004 0064 005F 1480 0006 0004 004A 005F 005F 1482 0006 0004 0018 005F 1481 0006 0006 0004 0017 O05F 08DA 0006 0004 0018 00A8 00SF 04C2 0006 0004 00EC OO0S5F 08EOQ 147A 0006 0004 00F3 005F 1478 0006 0004 0004 OOAF 00SF 147C 0006 0004 00AD 005F 005F 1481 0006 0004 009D 005F 1480 0006 0006 0004 0018 005F 1482 0006 0004 0018 0000 0000 0000 0006 0004 0018 005F 08E2 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 10000 0000 0000 Example 2 0000 0000 0000 0000 0000 0000 
0000 0000 SHOW SNAP Command - 0000 0000 0000 0000 000 [CU“.“'..O"‘U'] 010 [fl‘fl""‘.*.'(‘*‘] 020 [UPCreCeHeeceesossl 030 UQU] [fifl ® % @ @ & & 9 & @ 040 [«fitfiaa woccw»a*o] 050 .ouon] [ e 00 890088 060 u] ® 00 e e o o [«ag« 070 “] O‘.‘b @ B & & % & % & ‘O‘OD‘QXO l\‘.] - 080 A— R A——— - SR A Aa—— AR c} 'R QJOJOO‘C* ‘dflbl} leflav ¢ 8 &0 e 0 0@ t] [000»00.0 ouuowno] coel [.. —— ” A i d—-— - A ® 9 & 8 0 B 8 & & [Q‘T'R‘m. LR O‘TUZ‘] [ & 9 8 & & @ L G""] [&0&& A EEEEN N 0] [o«w&tunu utauuco] [ou uva.ocootouau] [Quanuuuauuuauuou] [Q.uuooububaouu«vo] [fiqfinuauutwuoounu] [ocuooafluuooutuuu] [000'&.'!'0'0‘0"1 [nquuqmuanwnuu.ntl [tuabounotuuvualol [ttfl‘fitb""fi‘!b!] [wbvotwouuannvuuul [0'00000"0‘0"0&] {nucutouow»unu&un] qutoo&wnnouo»uuu] [‘Q‘ll'i"*ib""] A, ao—— A A A W 090 OAO 0BO 0CO 0oDO0 OEO OF0 100 110 120 130 140 150 160 170 180 190 1A0 1BO 1CO 1D0 1E0 1F0 9)k Sample Printout (Virtual Bloc D-37 uPC Trace VAX8600/8650 KEEP ALIVE FAIL (KAF) Key FLOW AND SDB SNAP FILE Signals The following is a list of key SDB signals that are the SDB signals captured by the Console and put in list is used by VSABLD (Venus snap file analysis) contents of Although the can be very includes: Snap files list is useful up-line the a remote to state aid of a used a to automated hung 2. A Note column that provides some information c = constant (normal or console stimulus) g = goodstate (no error conditions) t = testable (dependent on system state) 3. The expected 4. 
The are Signal Name as identifed with Because these to at look state of are first SDB 12xxx the signal it appears astrisk the signals in () been 14 Key SDB g 12xxx g 12xxx 24xxx t t State Signal Name 0 0 you 0 0 ABUS CPU BUF ERROR H MEMORY LOCK ARRAY RD BUSY H ARRAY RD BUSY H H 15xxx g 0 15xxx 12xxx 12xxx 15xxx léxxx C C C c C 0 1 1 0 0 CL CL CL CL CL TxXxx 7XxXX C g 7xxx 7XxxX TXxXX 0 1 g Cc C 1 0 1 CLC3 MARK STOP COND H CLC4 CPU PHS 1 NXT H CLC4 CLK2 CLK3 CPU PHS 2 NXT H START H | CPU CLK STOP H 7TXxXx TxXXX TXXX Txxx C C C C 0 0 Txxx C CLK3 CLK3 CLK3 CLK3 EN EBOX MARK BIT H EN FIELD STOP H | EN IBOX MARK BIT H EN MBOX MARK BIT H 0 0 0 CL CPU PF INTR H MASTER RESET SDB CNTRL S1 SDB CNTRL S2 UNHANG RESET UNHANG RESET CLK3 FORCE MARK D-38 will extracted Signals ABUS the signal the print set. Signals that indicate an error condition. that have about (given a good machine) Expected Note it The list Number an thay analysis machine. The SDB Visability Channel to analyze File. The evaluate the Snap host. l. 
VC# * to intended analyzing Table * loaded primarily in DESCRIPTION H A H A H H H BIT H most and put likely in want Table 15 VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE DESCRIPTION Table 14 SDB VC# C Cc C C C 0 0 0 1 0 TXXX TXXX C g 0 1 lexxx c 0 Signal Name CLK3 CLK3 CLK3 CLK9 CLK9 SET VCO FREQ H STOP FBOX IN TO H STOP FBOX IN Tl H FBOX T3 CLK H STOP F IN PHS O H CLK9 STOP F IN PHS 1 H CLK9 WBUS T3 CLK H 0 CLKB VCO INIT H CSA CSB CLK DATA CHNL H C 0 CSA EBOX UMARK H 16xxx 17xxx l4xxx 14xxx 17xxx g C o) C C 0 0 0 0 0 CSA CSB CSB CSB CSB 17xxx 17xxx 14xxx 17xxx C g g o C 0 0 0 0 0 CSB FCT SEL 3A H CSB HOLD ERR LTH A H CSB USTK PAR ERR H CSB WRITE PULSE A H CSBR CNSL OPl1l FLAG H 16xxx 16xxx C C 0 0 CSBR CNSL OP2 FLAG H CSBR LOAD DIAG CNTR H 14xxx 14xxx C C 0 0 EBC EBD MAST RST DLY H EBC EBD UNHG RST DLY H 1lxxx C 0 0 0 0 0 EBC EBC EBC EBC 5xxx 14xxx 11xxx 12xxx 12xxx C C g C C 0 0 0 0 0 EBC IBOX MASTER RST H EBC INSERT DIAG ERR H 15xxx 15xxx 15xxx 15xxx g g g g 0 0 0 0 EBCG EBCG EBCG EBCG 15xxx 15xxx 4xxx 1lxxx 11xxx C g g g g 0 0 0 0 0 EBD EBD EBD EBD EBD * 11xxx 11xxx 1lxxx 1lxxx 11xxx 15xxx * * * * Cc ~ (Cont) 7XxXX 16xxx * Expected State TxxXx TxXxX TxXxx TXXX TXXX TxXXX * Note Key SDB Signals g C C o C C 0 0 PAR ERR H CLK DATA CHNL H CNSL OP H CSPE RESET A H CSPE RESET B H EBC CPU PF INTR LVL3 H EBC EBE MAST RST DLY H FLIP FLIP FLIP FLIP WBUS WBUS WBUS WBUS PAR PAR PAR PAR BO Bl B2 B3 H H H H EBC MBOX INTR LVL3 H EBC MBOX MASTER RST H EBC MBOX UNHANG RST H CLK6 CLK6 CLK6 CLK6 PHASE PHASE PHASE PHASE TOA TOB TOC TOD H H H H EBCH FLIP MCF RAM PAR H DIAG RST IBOX H EBOX ERR LST CYC H EBOX ERR TO IDP H ECS PE FLAG H ECS PE LST CYC H D-39 VAX8600/8650 KEEP ALIVE FAIL Table SDB * * * * * * * * * * * * * * * * R B 14 FLOW AND SNAP Key SDB Signals FILE DESCRIPTION (Cont) Expected VC# D (KAF) Note A R S D S 15xxx 11xxx llxxx 11xxx 1lxxx S State b B R g g g g g 0 0 0 0 0 11xxx g 0 ldxxx 14xxx 14xxx 14xxx g g g g -0 0 5xxx BXXX 4xxx 10xxx 11lxxx g g g g g 0 0 0 
0 0 1lxxx g 0 1lxxx 11xxx g g 0 0 14xxx 0 0 g 0 R Signal R Name mmmmmwmmmMmmmwwm mmwmmmmmm“mmumm EBD EBD EBD EBD EBD EDP PE FLAG A H EDP PE FLAG H EMCR PE FLAG H MBOX FE FLAG H USTK PE FLAG H EBD WBUS EBDF EBDF CLKé CLK6 EBDF CLK6é EBDF CLK6 PE FLAG H PHASE TOA PHASE TOB PHASE TOC PHASE TOD H H H H EBE IBOX ERR LTH A H EBE IBOX ERR LTH B H EBE IBOX ERR LTH C H - EBE IBOX ERR LTH D H EBEG CLK6 PHASE TOA H EBEG CLK6 EBEG CLK6é EBEG CLK6 PHASE TOB H PHASE TOC H PHASE TOD H EDP OPR PAR ERR H 10xxx g 0 10xxx 10xxx g g 10xxx 10xxx 0 0 10xxx g C o) 0 0 0 EDPI DISA BYTE 10xxx 10xxx o) c 0 0 EDPI EDPI DISA BYTE 32 PAR H DISA SCE AR BUS H FLIP WREG PAR H EDPG CLK6 PHASE TOA H EDPG CLK6é PHASE TOB H EDPG CLK6 PHASE TOD H EDPH PHASE TOC DLY H EDPI DISA ALU AR BUS H 10 PAR H 10xxx C 0 EDPI XXX g 0 FAl4 CLK8 XXX T13 g DLY B 0 FAl4 CLK8 T13 DLY PW 14xxx 12xxx 10xxx XXX XXX g g g o) g 0 0 0 0 0 FBA FBA FBA FBM FBM XXX g 1xxx 0 g 1xxx 0 g FM13 0 FM13 * * 6XXX 6xxx g g 5xxx * * * 3xxx 3xxx 3xxx g 3xxx g g g g 0 0 0 0 0 0 0 H B H FBOX PROBLEM B H FBOX WRITE PROB H FWBUS ABORT H CLK 141 RESET H CS PAR ERROR H FBM FDRAM PAR ERROR H CLK8 T13 CLK8 T13 DLY B H DLY PW B H IBD BUF DRAM PE H IBD BUF IBUF PE H IBD DECODE ERROR H IBD4 DRAM PE H IBD5 IBUF PE H IBDA ERROR SAV H IBDF CLK6 PHASE D-40 TOA H VAX8600/8650 KEEP ALIVE FAIL (KAF) FLOW AND SNAP FILE DESCRIPTION Table 14 SDB VC# Note Expected State Key SDB Signals Signal Name C C 0 0 0 0 0 IBDF CLK6 PHASE TOB H IBDF CLK6 PHASE TOC H IBDF CLK6 PHASE TOD H C C 0 0 ICA FORCE GPR PE H ICA FORCE RLOG PE H 3Xxx 3xxXXx 3XxX g g g 4xxXX 6XXX 1lxxx 6xXXX 5xXXX g C C 0 0 0 ICA ICS PE H ICA MAS RES H ICAl SDB LD H 5XXX 5xxx 5XxXXx C C C 0 0 0 ICAS DIAG UNSTALL H ICAS5 IBOX UMARK H ICA7 DIAG RES 1lLAT H 5xxx 5xxx C C. 
0 0 ICA7 FLUSH MAS RES 1 ICA7 FLUSH MAS RES 3 * * 5xxx 5XxXx 5xxx 1lxxx 11xxx g C g g g 0 0 0 0 0 ICA7 ICS PAR ERR H ICA7 UPC MAS RES H ICB ERR STALL H ICB IBUF PE H ICB IDRAM PE H * 11xxx g 4 XXX 4xxx * * * 6xxXx 6XXX 6xXxX 6XXX 6XXX (Cont) ICA FORCE AMUX OPAR H ICA FORCE BMUX OPAR H C g C C 0 0 0 0 0 ICB RLOG PE H C 0 ICB8 MAS 0 0 IDP IAMUX PE H IDP IBMUX PE H ICB6 ICB6 ICB8 ICB8 FRC RLOG PE LAT RLOG PE 3LAT H FLUSH MAS RES 1 FLUSH MAS RES 3 RES 1 H * 1lxxx * 11lxxx g g 0 0 4xxx g g g g g 0 0 0 0 0 MAPL TAG PERR H MAPL TAG W PERR H MAPN LD CLK3 Tl1A CH MCC MBOX CS PE H 15xXX 2XXX 12xxx 12xXX 12xxx g C g g t 0 0 0 0 0 MCC MBOX INTR H MCC1l MBOX MAST RESET H MCC1l STACK ERR H MCC7 MBOX FTL ERR H MCCJ ARRY TIMER BSY H 12xxx 2XxXX 13xxx 2xxx 12xxx g g 0 0 g g g 0 0 0 MCCM MCCM MCCM MCCM MCD3 * * 11xxx 11xxxX * * 12xxXx 12xxx 13xxx 11lxxx * * * * * * * * g g IDP IAMUX ERR CODE 0 IDP IAMUX ERR CODE 1 IDPD CLK6 PHASE TOB H CYC ERR SUM H DIS MD DRIVERS H HLD ERR ADR REG H HLD ERR DAT REG H ABUS DAT PERR H D-41 - VAX8600/8650 KEEP ALIVE FAIL Table 14 (KAF) FLOW AND SNAP FILE DESCRIPTION Key SDB Signals (Cont) Expected R COOOC O ot fd ot ek O Pt bt ot et ot el ol S S b e ot fud i e Signal W O G SR W Name O S NN D SAD ECC NN DD SOh ae CORR U SORN e G ERROR N T Y AR H ECC ERROR ANY A H ECC ERROR FATAL H CLR A H VCO CLR B I NIT H H -CLK RESET 141 A H -CLK RESET 141 B H -CLK RESET RESET 141 141 C H C H RESET RESET RESET -CLK -CSAT CLK6 -CSAT CLK6 141 D 141 D 141 E PHASE PHASE H H H TOA TOB H H -CSAT CLK6 PHASE TOC H -CSAT CLK6 PHASE TOD H -CLK -CLK -CLK -CSB CS -CSB ECS -CSBR PAR OK A PAR FLIP H ERR USTK H PAR H -CSBT CLK6 PHASE -CSBT CLK6 PHASE TOB CLK6 PHASE TOC -EBC CLK6 PHASE MCF RAM PAR TOD ERR -EDP RESULT PAR -EDPI FLIP GPRA H GPRB H -CSBT -CSBT -EDPI FLIP TOA ERR H -FBA FBOX A H -ICAl ENA AMUX OPAR H BMUX OPAR H PROBLEM -ICAl ENA -ICAl ENA GPR PE H -ICAl ENA RLOG -ICAl -ICA7 INJ ERR FLUSH -ICA7 -ICAJ MAS RES 1 H CLK3 TOB A H PE H 1LAT H MAS RES 1 H -ICAJ CLK3 TOC -ICAJ 
-ICAJ CLK6 CLK6 PHASE TOA H PHASE TOD H -ICBD CLK3 TOA B H TOC C H PHASE TOB -ICBD CLK3 et N -ICBD CLK6 Pt b b ot pd et et e et bt pt OO ot Qo fQQuUuuuu o0o0a0 Y ST O VR Qoo A e ofe oie ofie ol o S O QU N QOO0 S 0000 W uuauaao AR State LU R Note -ICBD CLK6é -IDPD B H H PHASE TOD H ENWBUS CLK3 TOC H D-42 R SR S e TN W VAX8600/8650 KEEP ALIVE Table SDB 14 (KAF) FLOW AND SNAP FILE Key SDB Signals (Cont) Signal A N W G SN G R SR I ek ot ot ot o b et ot -MCCJ bt bt ot e ok et e vauuauuaua -MAPR TB (- AN NS SN N S SO SN TSI WO N G AR SN W AR EVPAR CLK3 TOD H -IDPD WBUS C CLK3 TOA H -MAPN CLK6é PHASE TI1D H -IDPD U A QU W QU . — State | | i | I | | [ Expected Note N FAIL -MAPN -MAPN LD CLK3 LD CLK3 T1B A H T1C B H PERR H TOT CYC ERR H CLK3 Tl1A A H -MCCK CLK3 T1B B H -MCCK CLK3 T1D C H -MCCK -MCCM BYTWR CACH PERR H -MCCM DMA ERR H -MCCM SET BAD DAT CHK H -MCDT CLK3 -MCDT CLK3 Tl1A B H T1B B H -MCDT CLK3 T1C B H CLK3 T1D B -STOP ON PHS 3 -MCDT D-43 H WSRO Y O W W S DESCRIPTION VAX8600/8650 KEEP ALIVE FAIL Table SDB ~ (KAF) FLOW AND SNAP FILE DESCRIPTION 15 Key SDB Error Signals Expected ~ , Note State * * * * * 12xxx 15xxx 16xxx 17xxx 14xxx g g g g g 0 0 0 0 ABUS CPU BUF ERROR H CL CPU PF INTR H CSA PAR ERR H CSB HOLD ERR LTH A H o . o, 0 CSB USTK PAR ERR H . * * * * * 11xxx 11xxx 15xxx 4xxx 11xxx g g g g g 0 0 0 0 0 EBC EBC EBD EBD EBD CPU PF INTR LVL3 H MBOX INTR LVL3 H EBOX ERR LST CYC H EBOX ERR TO IDP H ECS PE FLAG H . % . 
*   11xxx   g   0   EBD ECS PE LST CYC H
*   15xxx   g   0   EBD EDP PE FLAG A H
*   11xxx   g   0   EBD EDP PE FLAG H
*   11xxx   g   0   EBD EMCR PE FLAG H
*   11xxx   g   0   EBD MBOX FE FLAG H
*   11xxx   g   0   EBD USTK PE FLAG H
*   11xxx   g   0   EBD WBUS PE FLAG H
*   5xxx    g   0   EBE IBOX ERR LTH A H
*   6xxx    g   0   EBE IBOX ERR LTH B H
*   4xxx    g   0   EBE IBOX ERR LTH C H
*   10xxx   g   0   EBE IBOX ERR LTH D H
*   14xxx   g   0   EDP OPR PAR ERR H
*   14xxx   g   0   FBA FBOX PROBLEM B H
*   12xxx   g   0   FBA FBOX WRITE PROB H
*   10xxx   g   0   FBA FWBUS ABORT H
*   xxx     g   0   FBM CS PAR ERROR H
*   xxx     g   0   FBM FDRAM PAR ERROR H
*   6xxx    g   0   IBD BUF DRAM PE H
*   6xxx    g   0   IBD BUF IBUF PE H
*   3xxx    g   0   IBD4 DRAM PE H
*   3xxx    g   0   IBD5 IBUF PE H
*   3xxx    g   0   IBDA ERROR SAV H
*   11xxx   g   0   ICA ICS PE H
*   5xxx    g   0   ICA7 ICS PAR ERR H
*   11xxx   g   0   ICB IBUF PE H
*   11xxx   g   0   ICB IDRAM PE H
*   11xxx   g   0   ICB RLOG PE H
*   6xxx    g   0   ICB6 RLOG PE 3LAT H
*   11xxx   g   0   IDP IAMUX ERR CODE 0 H
*   11xxx   g   0   IDP IAMUX ERR CODE 1 H
*   11xxx   g   0   IDP IAMUX PE H
*   11xxx   g   0   IDP IBMUX PE H
*   12xxx   g   0   MAPL TAG PERR H
*   12xxx   g   0   MAPL TAG W PERR H
*   11xxx   g   0   MCC MBOX CS PE H

Table 15  Key SDB Error Signals (cont)

(The SDB VC#, Note, and Expected State columns for the entries below are not legible in this copy.)

*   MCC MBOX INTR H
*   MCC1 STACK ERR H
*   MCC7 MBOX FTL ERR H
*   MCCM CYC ERR SUM H
*   MCCM HLD ERR ADR REG H
*   MCCM HLD ERR DAT REG H
*   MCD3 ABUS DAT PERR H
*   MCDM ECC CORR ERROR H
*   MCDM ECC ERROR ANY A H
*   MCDM ECC ERROR FATAL H
*   -CSB ECS PAR ERR H
*   -EBC MCF RAM PAR ERR H
*   -EDP RESULT PAR ERR H
*   -FBA FBOX PROBLEM A H
*   -MAPR TB PERR H
*   -MCCJ TOT CYC ERR H
*   -MCCM BYTWR CACH PERR H
*   -MCCM DMA ERR H
*   -MCCM SET BAD DAT CHK H

APPENDIX E
EBOX ERROR ARBITRATION NETWORK

This Appendix contains a Table and a Functional Block Diagram. The Table lists the Microcode Vector Addresses and Relative Priorities of all Interrupt and Exception Traps. The Functional Block Diagram illustrates the logic in the EBox that arbitrates, prioritizes, and generates the Microcode Vectors for VAX8600/8650 error conditions. This is a key block diagram because, with the exception of EBox and MBox Control Store Parity Errors, every Error Detection Network in the VAX8600/8650 (listed below) converges here as input.

Appendix F - EBox Error Detection Networks
Appendix G - FBox Error Detection Networks
Appendix H - IBox Error Detection Networks
Appendix I - MBox Error Detection Networks
Appendix J - SBIA Error Detection Networks

Thus, using these Appendices as a set, you can figure out the exact conditions that cause each type of error in the VAX8600/8650; determine how that type of error is reported to the EBox; and see how the EBox arbitrates the error and generates a Microtrap Vector address. From there, you can go to the Error Handling Microcode (EHM) Flows (Appendix A) and see how it handles that type of error. Then from the EHM Flows you can go to the VMS Machine Check Flows (Appendix C) and see how VMS responds to the error. And finally, you can go to Chapter 5 and see what the error looks like after it has been translated into an ASCII Report.
And finally, you can go to 1like after it has Dbeen EBOX ERROR ARBITRATION Table E-1 NETWORK EHM Trap Vector Addresses Exception Priority Main Sub Priorities Micro Vector 0 0 0 0 og* 0g* and And Microcode Relative Vectors Exception Type MBox EBox EBox Fatal Fatal Mem Error Error Req in Prog and EB Port Stat - - - 0 1 0 - og* 8 - - TB - 0 0 0 1 1 1 0A 0B oc 9 - - - oD J 0 1 1 0E OF 1 0 10* - - 1 1 1 1 1 1 1 1 - 1 1 1 1 1 1 1 1 0 10* 11 12 13 14 15 16 17 - Normal 8 9 A B C D E F - TB PE : FBox Write Problem TB Miss Access Violation M Bit Not Set I/0 Physical Address Write Access Page Boundaries Unaligned Reference 2 2 0 0 1E* - - - 3 3 1 2 Reserved 01 g2* Interger 09 Normal (no D - E Reserved IB OP~-PORT-WRT-Stall and IB OP~-PORT~-WRT-Stall and OP IBox Error IBox Error (no FBox IBox Exceptions Reserved MBox (MEAR) Full Console Halt Pending Internal (MBox) Interrupt External (I/0) Interrupt Trace Pending 4 0 18* Fork Stall and IBox Fork Stall and IB Normal (no problem) Stat - - 1 18* 8§ - - TB - - 4 4 4 1 1 1 9 - 1A 1B 1C A - TB Miss B - Access Violation C - Reserved D - 1I/0 Physical Address 1D n/a S 0 1F 10* 0 - - Write - Unaligned Across Page Boundaries Reference Read IMD and Read IMD and 1Box OP Error Port - - - 5 S 5 S 1 1 1 1 0 -~ 10* 11 12 13 8§ 9 A B - TB PE - FBox Write Problem - TB Miss - Access Violation 1 1 14 15 Normal (no Stat <3:0> C - M Bit Not Set D - I/0 Physical Address 1 1 6 0 18* EBox ID Read 7 0 18* EBox Read String and IBox EBox Read String and OP E F - Write Across Page Boundaries - Unaligned Reference and IBox Error Error Port Stat - - - 7 i 0 - Normal 7 7 7 7 7 7 18# 1 1 1 1 1 1 8 - 19 1A 1B 1C 1D 1E TB 9 A B C D E - FBox Write Problem - TB Miss - Access Violation - M Bit Not Set - 1/0 Physical Address - Write Across Page Boundaries 7 1 iF F - See Unaligned marked with an asterisk call Appendix A. All other vector in this manual. 
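The Port Stat <3:0> codes that recur throughout Table E-1 can be sketched as a small lookup. This is only an illustration of how the legible entries appear to line up. The condition names come from the table; the function name `decode_port_stat`, the `PORT_STAT` dictionary, and the rule that a micro vector equals the group's base vector plus (code - 8) are assumptions read off the runs 08..0F, 10..17, and 18..1F, not something the manual states. At least one group in the table appears to mark a code as Reserved, which this sketch ignores.

```python
# Port Stat <3:0> codes as they read in Table E-1.
# Code 0 is listed as "Normal (no problem)"; codes 1-7 are not listed.
PORT_STAT = {
    0x8: "TB PE",
    0x9: "FBox Write Problem",
    0xA: "TB Miss",
    0xB: "Access Violation",
    0xC: "M Bit Not Set",
    0xD: "I/O Physical Address",
    0xE: "Write Across Page Boundaries",
    0xF: "Unaligned Reference",
}


def decode_port_stat(code, base_vector):
    """Return (condition name, micro vector or None) for a 4-bit port status.

    base_vector is the first vector of the group being decoded, for example
    0x10 for the OP port groups (hypothetical parameter, see lead-in).
    """
    if not 0 <= code <= 0xF:
        raise ValueError("port status is a 4-bit field")
    if code == 0:
        return ("Normal (no problem)", None)
    if code not in PORT_STAT:
        raise ValueError("code %X is not listed in the table" % code)
    # Assumed rule: codes 8..F map onto eight consecutive vectors.
    return (PORT_STAT[code], base_vector + (code - 0x8))


name, vector = decode_port_stat(0xA, 0x10)
print(name, format(vector, "02X"))  # prints: TB Miss 12
```

The printed pair matches the table row that pairs vector 12 with "A - TB Miss" in the OP port group.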
(no <3:0> = problem) PE Vectors discussed = problem) 5 5 16 17 = PE E EBox <3:0> Reserved F EBox 5 ) IRD time) Error Port 4 1 ' S E - . Overflow Problem - E Stat (handled at 04* 05 06* 06 07 - Port (during IBox CPC Sync) (during RLog Unwind) - W Error problem) 4 5 6A 6B 7 - = - - 4 <3:0> ' - Write Across Page Boundaries Unaligned Reference F 3 3 3 3 3 0 = Reserved Miscellaneous 4 <3:0> problem) PE A - TB Miss B - Access Violation C - Modify Bit Not Set 1F* - Priorities the s E-2 Reference Error call Handling micro Microcode routines that (EHM) . are not 10 vi-TINdH1T = Hd3VYHLN Ei b 11 #81SDN 018¥13Sd YXO18-W 4234 OVIOO—18HONId1vV1HSLN£ LJ (e AelLIHOIN)TT ALLIIHHOOIINYGd))gA (L a © S3 4/e4 a83oI dvuiINg { H1vY 33IddOX4NN8WlABSIH Li ~ dVHLNONIMNAN\ oGNimnN3 ae3 83 MY dVHLN WX08W53831710410083»WwXdo[1dgI0O-W2m]1DfHXiIM0u8»1N4uIS19D0D0dAHdDd-DLDTGSI-LBXN0\8S¥4Y3gO0O43dv4dd¥,VHHiLnNe»wEoEmd&voul {ALI¥O9IYd! MHX0»183! W3vbda! 41.385410,N mDdAvHI1{inCY-—dQO4v1W0Hw¥V1I—NAm 5N140uI9OMd83 4/4 g83 ALIHOIMG! (1 . Q! 01 SN840 - EBOX ERROR ARBITRATION » 1ALIHO9Hd! J18A-1D dDXv0H38in41N QT/JYMe3I1H4 i €1 fmdv1.iSmnd._. NETWORK XogdJ0agUOTRIJTQqAYHAOMISBN3Iaed)Z30(Z vDA3TAdHVJYHIN C 41 ATHY3ZD23AdvHiN J H1T —fdvHinHO1DJ3AZ 3 ot ai e YT YY) Q @ DRIO o Mg Wl 0 m ERRREN — < ey — i= Oe N ML 48 o0 TV VY - DGO e No TURTE ] m 4]o W 0A23TAdHIVYL3N asg3 3 O O O W3TBOHdONO S140d Nivis Q83 OO M - o W DO ¢18HV1OdS d|1VH—10Sd d|1—VH01OSd 1|84¥01!OSd |—H 180HV—Ol1dS l £ 9 13S I 8 l 1 H O d 1 V 1 S 0 = 3—Y 3 dVYH1N o0 as3 idd DWW Hiw L BSO5 e l a -|Y3dVHLN lSa3ga? 
[Functional Block Diagram of the EBox Error Arbitration Network (continued); diagram content not recoverable from the scan.]

APPENDIX F
EBOX ERROR DETECTION NETWORKS

This Appendix contains a set of Functional Block Diagrams that describe the EBox Error Detection Networks. The outputs of these networks go to the EBox Error Arbitration Network, where they are prioritized and used to generate an Error Handling Microcode Trap Vector Address.

Table 1 (below) lists both the error conditions and the Figures that describe the networks that are used to detect the error conditions.

Table 1  EBox Error Detection Networks

Figure   Error Detection Networks
F-1      EBox AMux and BMux Parity Generation
F-2      EBox Operand Parity Error Detection Network
F-3      EBox Result Parity Error Detection Network and Late Parity Error Last Cycle
F-4      EBox WBus Parity Error Detection Network
F-5      EBox Micro-Stack Parity Error Detection Network

[Figure F-1 EBox AMux and BMux Parity Generation: block diagram; content not recoverable from the scan.]
310N {1 3LA8 O Ald Q10H~ 2 DWA av01 H0AXN1lAdY XNWE ALd 0 HL1 z£ VSIO-31LA8ZE€ H¥vVd SN"3OJH31O4AL38Q 4ad i AVSOWA 2 8 1 £ v — 3 0 i 4 S |9(sn0am/snado/ay9ao) H0XANIlWHOd1WAT z dad A|ld 0Y z o2lL 0XNWO £XNWD At DETECTION HdD V P+ga1 0FJHLLONON|IAVZSIITOY-NO31IAW8NO0I1SHVd H L T — f M V A l d H I 1 0 3 1 4 8 | T w N O I S DIWSsANS8nXuda4dNdmO0&EXDID ~ov 1v3dDWAAld0 AVSDWA 3 8) 310N {1 31A8 G Ald ERROR YHdO AVSOWA EBOX NETWORKS — i R L g i iy, it ey i TM o _H0VdM 1 " e oMb, _— nv g TM o EBOX ERROR DETECTION NETWORKS 3H1O8-HVYN3 “H110 Ald XNWE~ AL Sooonsa0 4 303148XNWB N y 170aLdxNiNE ALd8HdD Loil)~ 0488udD -t W48l S13D SNELO = Z1 m.mlwh ¥vd40 am1 Wit | Ll da3 40 -y dQd H1T L o148 € 4O | OW3Z S i SNYHOHEL4VMIOA 1 £ d0d _Ilc_ SNEM + Y¥dO AXNlWdY 1J1034d ] B Lo I £ HY3E vivd...,A 2anb1g£-3xogdI[nsoyAjtaeqa0amgUuOT3ID83eg¥IOMISON 8o]9s3H1M8130DWI3AN1TL4E0141¥1S£—Q43H1PDl3um%3301/00N3'y€JuH'SS3€NNHG3\VVBIa1SAA8NClLl3addN7{0HdL_LAJ07L0AdEd1HDQ—IHLMYOJ3310UiMvIaN3I0<O0N'38>6x4/d4_7wmo45m0_41{34D1IVHM30O4M1HYd .l@vOIHMH3I S 40d - Ald 0 3LAS AVSOWA HOAXNTl1WdTWA 10 AVSOWA Lo g 31AO4NIl|VHdSD9 ¥y205N8H17 £D-LNWidANI ILN'dJENOIN 1LOHNd0tN a{vDoW}A G <FDJ38OH0ONM>T LAVSDEINA 4Qd) DIHM d4Qd |300N530 1LMH4O=HYISH O v N Y~ D30IU0NM3 £73D030E9NM3 g¥Hli1v£0SN8 SSvd OS5 H3L1S193H ALlHYd iNdNi L FHOTH8Y3NI LLA4HIKHMSS 1LM4V347310T 1 == 1L1S434IIvHHdSSMNMYYVO1ALNHH)OOM1HHEL4IH5—£ -& =| L341I8HSINNNHIONY31X4N8EV/A 1S437 d -I L - I S o -‘3O03dyHgIN0M3d @HAXOl1Wd8{.Z2 Q3NO3IHD pue ERROR J4HS QI3SN N£ZAld ke~ 91 TM 1SON LINS3IH HH3 EBOX £9°10 N4HSOI3VSNAgLDdS{HAl-dH1Z4Qd LAIHS~ DNA JATE 1HS)L4 1OWA g 1 4 H S M Y 1 4 3 1 1 = ¥1INS3HYmmyv‘%dyxuSymwN8m3&Hm1mmbx80Zm8x3mnmnuaaD3HM30IN3R16INS3Y bXeNWmSeAlddL1TIHNLTS3Y938M300N36¥-968g9y 348L}YSWNHIOVSINAGJDY¥1d4~I/H+S{INOD \=N3A3 ae3. L& AVSOIWNA0. 1437&JAT0 HOuH3 41 Gv0T OWA ey Ty DETECTION NETWORKS i, TM i, MWW»&\ g, dQ0d LBON% €1 dNvH—31 ot O Z4LAD WOITH WOITH g TR TR oy 9 O s R ' iy, Ay g i, . 
[Figure F-5: EBox Micro-Stack Parity Error Detection Network]

APPENDIX G
FBOX ERROR DETECTION NETWORKS

This appendix contains a functional block diagram that describes the FBox Error Detection Network. The network produces three outputs:

FBox Problem - This output goes to the EBox Error Arbitration Network, where it is prioritized and used to generate an Error Handling Microcode Trap Vector Address.

FBox Write Problem - This output goes to the MBox and prevents the MBox from writing the result into memory (Cache).

FBox WBus Abort - This output goes to the EBox and prevents the EBox from writing the result in a GPR.

Table 1  FBox Error Detection Networks

Figure   Error Detection Network
G-1      FBox Problem Detection, FBox Write Problem, and FBox WBus Abort

[Figure G-1: FBox Problem Detection. Legible signal labels include GPR ERROR, FBA CS ERROR, FBM CS ERROR, SELF TEST ERROR, DIV BY ZERO, DENORM PROBLEM, FBOX ENABLE, FBOX PROBLEM, FBOX WBUS ABORT, and FBOX WRITE PROB.]

APPENDIX H
IBOX ERROR DETECTION NETWORKS

This appendix contains a set of functional block diagrams that describe the IBox Error Detection Networks. The outputs of these networks go to the EBox Error Arbitration Network, where they are prioritized and used to generate an Error Handling Microcode Trap Vector Address.

Table 1 (below) lists the error conditions and the figures that describe the networks used to detect them.

Table 1  IBox Error Detection Networks

Figure   Error Detection Network
H-1      IBox Instruction Buffer Parity Error Detection Network
H-2      IBox AMux Parity Error Detection Network
H-3      IBox BMux Parity Error Detection Network
H-4      IBox AMux Error Code Generation
H-5      IBox AMux WBus Data Generation
H-6      IBox Control Store Parity Error Detection Network
H-7      IBox Dispatch RAM Parity Error Detection Network
H-8      IBox OPBus Longword Parity Generation
H-9      IBox RLog Parity Error Detection Network

[Figures H-1 through H-7: IBox error detection network block diagrams]
[Figures H-8 and H-9: IBox error detection network block diagrams]

APPENDIX I
MBOX ERROR DETECTION NETWORKS

This appendix contains a set of functional block diagrams that describe the MBox Error Detection Networks. The outputs of these networks go to the EBox Error Arbitration Network, where they are prioritized and used to generate an Error Handling Microcode Trap Vector Address.

Table 1 (below) lists the error conditions and the figures that describe the networks used to detect them.

Table 1  MBox Error Detection Networks

Figure   Error Detection Network
I-1      Fatal Error, Error Reg Full, and Interrupt Generation
I-2      Cycle Error Sum Generation
I-3      Internal Error Sum 28 Generation
I-4      Internal Error Sum 58 Generation
I-5      Multiple Error and Held Error Generation
I-6      ECC Error Detection Network
I-7      Cache Tag Parity Error Detection Network
I-8      Cache Tag WBit Parity Error Detection Network
I-9      Cache Data Correction, Cache Data Parity Error and Byte Write CP Error Detection Network
I-10     CP Write and Write Data Parity Error Detection Network
I-11     Write Byte and Byte Parity Error Detection Network
I-12     CP NXM Error Detection Network
I-13     CP Buffer Error Detection Network
I-14     Control Store Parity Error Detection Network
I-15     CPR Parity Error Detection Network
I-16     ABus Address Parity Error Detection Network
I-17     ABus Control and Mask Parity Error Detection Network
I-18     ABus Data Parity Error Detection Network
I-19     ABus Bad Data Error Detection Network

Error Handling Flow Charts

Figure   Chart Title
I-20     Array Read Data Error Handling Flow Chart
I-21     Cache Read Data Error Handling Flow Chart
I-22     Cache Writeback Error Handling Flow Chart

[Figures I-1 through I-21: MBox error detection network block diagrams and error handling flow charts]

[Flow chart: CACHE WRITEBACK. Legible step labels, in reading order: for each of WD1 through WD4, READ CACHE, CHECK PARITY, GEN ECC, WRITE ARRAY; on CACHE DATA PE: INH. CACHE PARITY DET., SET HLD ERR ADR REG, READ CACHE, CHECK ECC, SET MBOX INTER.; ECC result OK / SBE / DBE: CORRECT, SET WORD BDF, GEN ECC, WRITE ARRAY; LAST WORD; INVALIDATE CACHE SET LRU; ENA. CACHE PARITY DET.; EXIT.]
Figure I-22  MBox Cache Writeback Error Handling Flow Chart

APPENDIX J
SBIA ERROR DETECTION NETWORKS

This appendix contains a set of functional block diagrams that describe the SBIA Error Detection Networks. The outputs of these networks go to the EBox Interrupt Arbitration Network, where they are prioritized and used to generate a System Control Block Vector Address. Other than handling the interrupt, the EBox does not get involved with I/O errors. I/O errors are processed by the VMS Machine Check Handler and the VMS I/O Driver Routines. There is one exception, and that is IO BUF ERROR. IO BUF ERROR goes to the MBox, which will, in turn, generate an MBox CP IO BUF Fatal Error.

Table 1 (below) lists the error conditions and the figures that describe the networks used to detect them.

Table 1  SBIA Error Detection Networks

Figure   Error Detection Network
J-1      SBIA DMAI Transaction Buffer Error Lock Circuit
J-2      SBIA DMAA Transaction Buffer Error Lock Circuit
J-3      SBIA DMAB Transaction Buffer Error Lock Circuit
J-4      SBIA DMAC Transaction Buffer Error Lock Circuit
J-5      SBIA Fault and Local Error Detection Network
J-6      SBIA Multiple CPU Error and CPU Buf Error Circuit
J-7      SBIA Control and Address Parity Error Detection Network

[Figures J-1 through J-7: SBIA error detection network block diagrams]

APPENDIX K
MACHINE CHECK STACK FRAME BIT DESCRIPTIONS

This appendix contains bit and field definitions for the registers that make up a machine check stack frame. It also contains bit and field definitions for the error handling status register (EHSR) and the console support microcode status register (CSM.STATUS).
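The register descriptions in this appendix identify fields with <hi:lo> bit notation. As an illustration only (it is not part of the original manual), the following Python sketch shows one way to extract such fields from a 32-bit longword. It uses a few EHMSTS field positions taken from this appendix: <31>, <30>, <29>, <28>, <17>, <15:08>, and <03:00>. The helper names field and decode_ehmsts, the dictionary keys, and the sample value are hypothetical.

```python
# Primary error codes as listed for EHMSTS <03:00> in this appendix.
PRIMARY_ERROR_BOX = {
    0b001: "FBox",
    0b010: "EBox",
    0b011: "IBox",
    0b100: "MBox (Fatal Error)",
}


def field(value, hi, lo=None):
    """Return bits <hi:lo> (inclusive) of a 32-bit longword.

    With a single bit number, returns that one bit (0 or 1).
    """
    if lo is None:
        lo = hi
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def decode_ehmsts(word):
    """Decode a subset of EHMSTS fields (hypothetical helper)."""
    return {
        "mbox_service_request": bool(field(word, 31)),
        "ehm_entered": bool(field(word, 30)),
        "vms_entered": bool(field(word, 29)),
        "fbox_service_request": bool(field(word, 28)),
        "process_abort": bool(field(word, 17)),
        "micro_trap_vector": field(word, 15, 8),
        "primary_error_box": PRIMARY_ERROR_BOX.get(field(word, 3, 0), "unknown"),
    }


# Made-up sample: EHM ENTERED and VMS ENTERED set, vector 08, primary code 010.
sample = 0x60000802
decoded = decode_ehmsts(sample)
print(decoded)
```

The low-order fields of EHMSTS are supplied by VMS, so a decoder of this kind would only be meaningful for frames already processed by the Machine Check Handler.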
Register     ESC      Register     ESC      Register     ESC
SFBCNT       17       IVASAV       20       CSHCTL       29
EHMSTS       18       VIBASAV      21       MEAR         2A
EVMQSAV      19       ESASAV       22       MEDR         2B
EBCS         1A       ISASAV       23       FBXERR       2C
EDPSR        1B       CPC          24       CSES         2D
CSLINT       1C       MSTAT1       25       PC           2E
IBESR        1D       MSTAT2       26       PSL          2F
EBXWD1       1E       MDECC        27       EHSR         DA
EBXWD2       1F       MERG         28       CSM.STATUS   C0

EBox Scratch Pad (ESC) Locations

00 - 0F    GENERAL PURPOSE REGISTERS
10 - 2F    TEMPORARIES
30 - 3F    MISCELLANEOUS
40 - 7F    CONSTANTS
80 - BF    RESERVED
C0 - CE    CSM REGISTERS
CF - DE    VAX 8600 STATUS REGISTERS
DF - EF    VAX ARCHITECTURAL REGISTERS
F0 - FF    SCRATCHPAD STACK

SFBCNT - STACK FRAME BYTE COUNT (ESCRATCH LOC: 17 - T07)

<31:08>    RESERVED
<07:00>    NUMBER OF BYTES (HEX) IN STACK FRAME, CURRENTLY 58 (DETERMINED BY EHM)

This longword is built by the Error Handling Microcode (EHM). It contains the number of bytes (in hex) that the EHM will assemble in the EBox Scratch Pad RAMs (location 18 through 1D). When the EHM is finished building the Machine Check Stack Frame, it calls the Interrupt Exception Microcode. The microcode pushes the contents of the Scratch Pad RAMs onto the Interrupt Stack and calls the VMS Machine Check (MCHK) Handler. The MCHK Handler will use the contents of this longword to process the stack, but the SFBCNT will not appear in the Stack Frame Record written to ERRLOG.SYS.

<31:08>  RESERVED

<07:00>  BYTE COUNT
Indicates the number of bytes, currently 58 hex (88 decimal), that the EHM has pushed onto the Interrupt Stack.
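The way a handler might use SFBCNT to walk the frame can be sketched as follows. This is an illustration under stated assumptions, not the VMS implementation: it assumes the frame is available as a little-endian byte string with the SFBCNT longword first, followed by the counted bytes. The function name split_frame and the sample data are hypothetical.

```python
import struct


def split_frame(raw):
    """Split a raw frame into (byte_count, longwords).

    Assumed layout: one SFBCNT longword, then byte_count bytes of
    frame data stored as little-endian 32-bit longwords.
    """
    (sfbcnt,) = struct.unpack_from("<I", raw, 0)
    count = sfbcnt & 0xFF  # <07:00> BYTE COUNT; <31:08> is reserved
    if count % 4 != 0 or len(raw) < 4 + count:
        raise ValueError("truncated or misaligned stack frame")
    longwords = list(struct.unpack_from("<%dI" % (count // 4), raw, 4))
    return count, longwords


# Made-up sample: a count of 58 hex followed by 22 placeholder longwords.
sample = struct.pack("<I", 0x58) + struct.pack("<22I", *range(22))
count, longwords = split_frame(sample)
print(count, len(longwords))
```

With the current count of 58 hex, the counted portion is 88 bytes, which is 22 longwords.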
EHMSTS - ERROR HANDLING MICROCODE STATUS WORD (ESCRATCH LOC: 18 - T08)

<31>       MBOX SERVICE REQUEST
<30>       EHM ENTERED
<29>       VMS ENTERED
<28>       FBOX SERVICE REQUEST
<27:19>    EHM HAS STARTED PARITY ERROR CORRECTION PROCESS
<18>       MEAR SAVED
<17>       PROCESS ABORT
<16>       RESERVED
<15:08>    MICRO-TRAP VECTOR ADDRESS
<07:00>    REASON CODE (SUPPLIED BY VMS)

This status word contains a modified copy of the Error Handling Status Register (EHSR), which is stored in Scratch Pad location 0DA. The Error Handling Microcode uses EHSR to keep track of status during the error handling process. In addition, two other microcode routines use the EHSR to pass status to the EHM Routine. The Interrupt Handling Microcode Routine uses bit 31 to report MBox error interrupts, and the FBox Problem Handling Routine uses bit 28 to report FBox hardware errors.

The EHM copies the contents of EHSR into EBox Scratch Pad location 18 just before it sets VMS ENTERED and clears EHM ENTERED. The EHM then calls the Interrupt Exception Microcode to push the stack frame onto the interrupt stack and call the Machine Check Handler.

<31>  MBOX SERVICE REQUEST
If the Interrupt Handling Microcode determines that it was called to handle an MBox Error Interrupt, it will set this flag and call the EHM. The EHM will test this flag and, if it is set, will process the MBox Error.

<30>  EHM ENTERED
This flag is used by the EHM to detect a double error trap condition, that is, a case in which a second error trap occurs before the EHM Routine is able to finish processing the first error and pass control to VMS. This flag is checked each time the EHM routine is entered.
If the flag is clear (which is the expected case), then the EHM Routine will set this flag and process the error in the normal manner. If the flag is set, however, indicating that the EHM Routine was in the process of handling an error when it was called to handle a second error, then one of two things will happen:

1. If the second error was detected by either the EBox or the MBox Fatal Error detection circuitry, then the EHM Routine will loop at UPC 21. This in turn will cause a Keep Alive Fail condition, and the Console will print the following message and Snap Shot the state of the system.

   "Attempting to save machine state due to"
   "MACHINE DOUBLE ERROR"

2. If the second error was detected by the IBox, then the EHM will put a code of 5 in CSM.STATUS (Scratch Pad location: C0) and call the CSM.ENTRY.DE Routine. This will result in a Keep Alive Fail condition, and the Console will print the following message and Snap Shot the state of the system.

   "Attempting to save machine state due to"
   "CPU ERROR HALT"

NOTE
Because the state of EHSR is saved in EScratch 18 before the EHM Routine clears EHM ENTERED, this flag will always be set in the copy of the Stack Frame saved by the VMS MCHK Handler.

<29>  VMS ENTERED
This flag is similar to the EHM ENTERED flag.
It is used to detect the case where the VMS Machine Check Handler is in the process of handling an error when a second error is detected. The EHM Routine sets this flag just before it calls the Machine Check Handler. The Machine Check Handler processes the errors and clears this flag just before it executes an REI to continue operation (or a BUGCHECK to halt the operation).

If a second error trap occurs while the Machine Check Handler is still processing the first error, then the EHM Routine will process the error in the normal manner. That is, it will build a Stack Frame, clear the error condition, roll back the PCs, and determine if it should call VMS. However, since the VMS ENTERED flag is set (indicating that VMS was processing an error when a second error was detected), the EHM will not call VMS. Instead it will put a code of 5 in CSM.STATUS (EBox Scratch Pad Location: C0) and call the CSM.ENTRY.DE Routine. This in turn will result in a Keep Alive Fail condition, and the Console will print the following message and Snap Shot the state of the system.

   "Attempting to save machine state due to"
   "CPU ERROR HALT"

<28>  FBOX SERVICE REQUEST
The micro-routine that handles FBox Problems will set this flag if it determines that the problem was caused by an FBox hardware error. The routine will then call the EHM Routine to process the error. The EHM will test this flag to determine if it was called to handle an FBox error.

<27>  FIX FBOX GENERAL PURPOSE REGISTER PARITY ERROR
Set by the EHM when it starts to correct an FBox GPR parity error.
<26>  FIX EBOX GENERAL PURPOSE REGISTER B PARITY ERROR
Set by the EHM when it starts to correct an EBox GPR B parity error.

<25>  FIX EBOX GENERAL PURPOSE REGISTER A PARITY ERROR
Set by the EHM when it starts to correct an EBox GPR A parity error.

<24>  FIX IBOX GENERAL PURPOSE REGISTER PARITY ERROR
Set by the EHM when it starts to correct an IBox GPR parity error.

<23>  FIX FBM CONTROL STORE PARITY ERROR
Set by the EHM when it starts to correct an FBM Control Store parity error.

<22>  FIX FBA CONTROL STORE PARITY ERROR
Set by the EHM when it starts to correct an FBA Control Store parity error.

<21>  FIX FBOX DRAM PARITY ERROR
Set by the EHM when it starts to correct an FBox Dispatch RAM parity error.

<20>  FIX IBOX DRAM PARITY ERROR
Set by the EHM when it starts to correct an IBox Dispatch RAM parity error.

<19>  FIX IBOX CONTROL STORE PARITY ERROR
Set by the EHM when it starts to correct an IBox Control Store parity error.

<18>  MEAR SAVED
Indicates that the MBox Error Address Register (MEAR) was saved (in EScratch Location: DB) as a result of the last MBox Error Register Full trap.

<17>  PROCESS ABORT
The EBox microcode detected a condition which prevents a retry of the faulted instruction. See EBCS <19:16> for the reason code.

<16>  RESERVED

<15:08>  MICRO TRAP VECTOR ADDRESS
Contains the vector address through which the EHM was entered.

VECTOR   REASON
2        FBox Error (Called by FBox Interrupt Handler)
4        EHM Detected a Process Abort condition during an MBox Error Register Full Micro-trap
6        MBox Error (Called by MBox Interrupt Handler)
8        EBox Error
8        MBox Fatal Error
8        TB PE (EBox Port Request only)
10       IBox Op-Port-Write and IBox Error
10       IBox Op-Port-Write and TB Parity Error
10       EBox IMD Read and IBox Error
10       EBox IMD Read and TB Parity Error
18       EBox Fork and IBox Error
18       EBox Fork and TB Parity Error
18       EBox String Read and IBox Error
18       EBox String Read and TB Parity Error
18       EBox ID Read and [remainder of entry illegible]
1E       IBox Sync Failure
1F       RLog Unwind Failure

<07:00>  REASON CODE
This field is supplied by the VMS Machine Check Handler. Therefore, this field will be valid only after the Machine Check has been processed by VMS. This field will not be valid for Stack Frames extracted directly from the EBox Scratch Pad RAMs or the Interrupt Stack. See <03:00> for the actual Reason Code.

<07>  MBOX INTERRUPT ENTRY VALID
Indicates that the Stack Frame was generated as a result of an MBox 1D interrupt.

<06>  SBI SUMMARY ENTRY VALID
Indicates that a copy of the SBIA Error Summary Register has been appended to the end of the stack frame.

<05>  SBIA FULL REPORT FOLLOWS
Indicates that a full SBIA Error Log entry follows this entry in the System Event File (ERRLOG.SYS).

<04>  RESOURCE TURNED OFF
Indicates that VMS disabled either the FBox or one of the Caches.

<03:00>  PRIMARY ERROR CODE
This field indicates which Box detected the error.

Code   Box Detecting Error
001    FBox
010    EBox
011    IBox
100    MBox (Fatal Error)

EVMQSAV - EBOX VIRTUAL ADDRESS MULTIPLIER QUOTIENT SAVE REGISTER (ESCRATCH LOC: 19 - T09)

<31:00>    EBOX VIRTUAL ADDRESS (EVA) FOR EBOX PORT REQUESTS

EVMQSAV (EBox Virtual Memory Address/Multiplier Quotient Save Register). During EBox port requests this register contains the virtual address that was acknowledged by the MBox (PA ACK). During normal operation, this register is used to temporarily store partial EBox results.

<31:00>
May contain an EBox Virtual Address or a partial calculated EBox result, depending on the operation of the EBox.

EBCS - EBOX CONTROL AND STATUS WORD (ESCRATCH LOC: 1A - T0A)

This register is made up from the EBCS register (see EBD2, EBD3 and EBE5) and other EBox status bits (see EBEB and EBEC).

<31:27>  CONSOLE ECC CORRECTION REQUESTS
The bits in this field are set by the EHM in order to interrupt the Console for Control Store or Dispatch RAM correction. When set, these bits will cause an immediate Console interrupt. The EHM will loop waiting for the Console to set "DONE" in RBUFC. When correction is complete, control is returned to the EBox by setting "DONE". The EHM will then clear the correction request EBCS <31:27> and thus release the Console interrupt.

<31>  FBM CONTROL STORE CORRECTION REQUEST - EBE5 FBOX FBM CS PE
The Console will attempt to correct the parity error and then return control to the EHM by setting the "DONE" bit <07> in
<30>     FBA CS CORRECTION REQUEST
         EBE5 FBOX FBA CS PE
         The Console will attempt to correct the parity error and then return control to the EHM by setting the "DONE" bit <07> in RBUFC.

<29>     FBOX DRAM CORRECTION REQUEST
         EBE5 FBOX DRAM PE
         The Console will reload the FBox Dispatch RAM and then return control to the EHM by setting the "DONE" bit <07> in RBUFC.

<28>     IBOX DRAM CORRECTION REQUEST
         EBE5 IBOX DRAM PE
         The Console will attempt to correct the parity error and then return control to the EHM by setting the "DONE" bit <07> in RBUFC.

<27>     IBOX CS CORRECTION REQUEST
         EBE5 IBOX CS PE
         The Console will attempt to correct the parity error and then return control to the EHM by setting the "DONE" bit <07> in RBUFC.

<26:24>  RESERVED

<23:21>  STACK FRAME REVISION
         The stack frame revision is written by EBox microcode to indicate how many times the error handling stack frame has been revised. The previous stack frame was revision 0, and the current stack frame is revision 1.

<20>     PERFORMANCE MONITOR ENABLE
         EBE5 PERFORM MONITOR ENABLE
         This is an architecturally defined bit which is controlled by System software. It is wired to a CPU backplane pin (Slot 09, Pin A07) and allows system performance to be monitored externally. See the PMR description for further information.

<19:16>  PROCESS ABORT REASON CODE
         If EHMSTS <17>, PROCESS ABORT, is set, EBCS <19:16> is the reason why the faulted instruction cannot be retried.

         CODE   REASON
         0001   Unrecoverable GPR parity error
         0010   EBox WBus parity error
         0011   All IBox PC's are invalid
         0100   EBox failed to detect OPBus byte parity error
         0101   CPR parity error that did not result in an MBox fatal error
         0110   RLog parity error

<15>     MBOX FATAL ERROR
         ERR6 MBOX FATAL ERROR
         Set via EBox Port Status Line 0. Indicates that the MBox detected one of the following Fatal Error conditions:
         ERR1 WR ERR2 TAG DAT ERR2 DMA PE MCCJ FTL CPR PE PE & & WRITE REG WBIT ERR ERR2 CP ERR6 CP BUFF ERR6 CP NXM IO PE ERR ERR

<14>     MBOX INTERRUPT PENDING
         EBC2 MBOX INTR LVL3
         The MBox generates an interrupt request when it detects an error of any kind (excluding TB parity errors). This bit is set by the EBox on the third occurrence of T3 after it receives the request, and the interrupt is usually handled at IRD time.

<13>     IBOX ERROR
         EBEA IBOX ERR LTH
         Set by the EBox when the IBox reports one of the following error conditions:
         ICBC IDRAM PE EBD7 RSV MODE ICB6 IDP6 IDP6 IBMUX PE IAMUX PE ICA7 ICBC ICS PE IBUF PE RLOG PE
         When written by EBox microcode, it will clear IBox errors and <13>.

<12>     EBOX MEMORY CONTROL FIELD RAM PARITY ERROR
         EBD2 EMCR PE FLAG
         Set when the EBox detected a parity error in the Memory Control Field (MCF) RAM.

<11>     EBOX CONTROL STORE PARITY ERROR
         EBD2 ECS PE FLAG
         Set when the EBox detected a Control Store Parity Error. Setting this bit results in an immediate trap to the Console for ECC Correction. When the Console finishes correcting the error it will force the EBox to trap to EHM (Vector 8).

<10>     EBOX MICRO-STACK PARITY ERROR
         EBD2 USTK PE FLAG
         Set when the EBox detected a parity error when it popped the last entry off the micro-stack.
<09>     EBOX DATA PATH PARITY ERROR
         EBD2 EDP PE FLAG
         Set when the EBox detected either an Operand parity error or a Result parity error.

<08>     EBOX WBUS PARITY ERROR
         EBD2 WBUS PE FLAG
         Set when the EBox detected a WBus Parity Error while it was not driving the bus.

         NOTE
         1. Writing 1 to this bit clears bits <15>, <12:09> and <4>.
         2. This bit has a different use in diagnostic mode.

<07:05>  RESERVED

<04:01>  VMS ABORT FLAGS
         VMS uses these flags, in part, to determine whether or not the error is recoverable.

<04>     EBOX ABORT
         EBD3 EB ABORT FLAG
         Set when the EBox detected an MCF RAM Parity Error or a Parity Error. Both cases are non-recoverable.

<03>     MACHINE STATE MODIFIED
         EBD3 STA MOD FLAG
         Set by Microcode via the UMISC Field. Indicates that the state of the machine has been modified such that, should an error occur while this bit is set, the operation currently executing in the EBox cannot be retried.

<02>     MEMORY OR GPR WRITE
         EBD3 MEM WRT FLAG
         Set when a Memory or GPR write operation was in progress when the error occurred.

<01>     IO READ
         EBD3 IO RD FLAG
         Set when the error involved an I/O register read. Since some I/O registers automatically clear after being read, the contents of the I/O register may have been lost.

<00>     RESERVED

EDPSR   EBOX DATA PATH STATUS WORD
(ESCRATCH LOC: 1B - T0B)

This register is made up from the nibble registers in the PDP MCA.

<31:28>  BMUX BYTE IN ERROR
         PDP4 BMX ERR BYTE <3:0> (EDPH)
         This field is only valid when either bit <07> or bit <00> is set. This field indicates the byte(s) that were associated with the BMux Parity Error.

<27:24>  AMUX BYTE IN ERROR
         PDP3 AMX ERR BYTE <3:0> (EDPH)
         This field is only valid when either bit <01> is set, or when bit <02> is set, or when bit <03> is set and bits <00:02> and <06:07> are reset. This field indicates the byte(s) that were associated with the AMux Parity Error.

<23:16>  EBOX GPR PARITY ERROR ADDRESS
         If EHMSTS <26> (EBox GPR B PE) or <25> (EBox GPR A PE) is set, these bits indicate the failing GPR address.

<15:12>  VMQ BYTE IN ERROR
         PDP5 VMQ ERR BYTE <3:0> (EDPH)
         This field is only valid when bit <05> is set and bit <08> is reset. It indicates the byte(s) that were associated with the Result parity error.

<11>     WREGISTER PARITY ERROR
         PDP6 WREG ERR (EDPH)
         Indicates that the parity at the input to the WReg Mux did not match the parity generated at the output of the WReg. The inputs to the WReg Mux are:
         o  The AMux for WReg Format Operations.
         o  The BMux for WReg Shift Operations.
         o  The ALU and BMux for post ALU WReg Shift Operations.

         NOTE
         This error will also cause a RESULT PE.

<10:09>  RESERVED

<08>     EDP MISCELLANEOUS PARITY ERROR
         PDP8 EDP MISC (EDPH)
         Indicates that the ALU was the input to the VMQMUX when a parity error was detected at the VMQMUX output. When this bit is set the contents of bits <12:15> are not valid. This error will also cause a RESULT PE.

<07>     BMUX WBUS PARITY ERROR
         PDP8 B WBUS (EDPH)
         Indicates that the WBus was the input to the BMux when a parity error was detected at the BMux output. Bits <28:31> indicate the byte(s) in error.

<06>     BMUX OPBUS PARITY ERROR
         PDP8 B OPBUS (EDPH)
         Indicates that the OPBus (which is protected by longword parity) was the input to the BMux when a BMux parity error was detected at the BMux output. When this bit is set the contents of bits <28:31> are not valid.

<05>     RESULT PARITY ERROR
         PDP8 RSLT CHK (EDPH)
         If neither WREG PE <11> nor EDP MISC <08> is set, this bit indicates that the VMQSAV Register was the input to the VMQMUX when a VMQMUX parity error was detected. Otherwise, this bit is the "or" of all three of those conditions.

<04>     RESERVED

<03>     OPERAND PARITY ERROR
         PDP8 OPER CHK (EDPH)
         If bits <02:00> and <07:06> are reset, then this bit indicates that the VMQSAV Register was the input to the AMux when an AMux Parity error was detected. Otherwise, this bit indicates that one of the following errors was detected:
            BMux GPR B PE
            AMux GPR A PE
            BMux WBus PE
            AMux WBus PE
            BMux OPBus PE

<02>     EBOX GENERAL PURPOSE REGISTER A PARITY ERROR
         PDP8 A RAM PE (EDPH)
         Indicates that Scratch Pad-A was the input to the AMux when a parity error was detected at the AMux output.

<01>     AMUX WBUS PARITY ERROR
         PDP8 A WBUS (EDPH)
         Indicates that the WBus was the input to the AMux when a parity error was detected at the AMux output.

<00>     EBOX GENERAL PURPOSE REGISTER B PARITY ERROR
         PDP8 B RAM (EDPH)
         Indicates that Scratch Pad-B was the input to the BMux when a parity error was detected at the output of the BMux.
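
The validity rules for the EDPSR byte-in-error fields are easy to misapply when reading a frame by hand, so the following Python sketch applies them mechanically. It is an illustration only: the flag names and the function names are invented here, and the mapping of each mask bit to a byte number (mask bit n = byte n) is an assumption based on the B3..B0 labels in the register diagram.

```python
# Sketch of an EDPSR decoder. Flag names are illustrative labels, not
# identifiers from the manual; bit positions follow the EDPSR descriptions.
EDPSR_FLAGS = {
    "WREG_PE": 11,
    "EDP_MISC_PE": 8,
    "BMUX_WBUS_PE": 7,
    "BMUX_OPBUS_PE": 6,
    "RESULT_PE": 5,
    "OPERAND_PE": 3,
    "GPR_A_PE": 2,
    "AMUX_WBUS_PE": 1,
    "GPR_B_PE": 0,
}


def bytes_in_error(mask):
    """Turn a 4-bit byte mask into byte numbers (assumes mask bit n = byte n)."""
    return [n for n in range(4) if mask & (1 << n)]


def decode_edpsr(word):
    """Split a 32-bit EDPSR value into flags and the fields that are valid."""
    def bit(n):
        return (word >> n) & 1

    def field(hi, lo):
        return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

    flags = {name: bool(bit(pos)) for name, pos in EDPSR_FLAGS.items()}

    # <31:28> is valid only when bit <07> or bit <00> is set.
    bmux_valid = bool(bit(7) or bit(0))
    # <27:24> is valid when <01> or <02> is set, or when <03> is set with
    # <02:00> and <07:06> all clear.
    amux_valid = bool(
        bit(1) or bit(2)
        or (bit(3) and field(2, 0) == 0 and field(7, 6) == 0)
    )
    # <15:12> is valid only when <05> is set and <08> is clear.
    vmq_valid = bool(bit(5) and not bit(8))

    return {
        "flags": flags,
        "bmux_bytes": bytes_in_error(field(31, 28)) if bmux_valid else None,
        "amux_bytes": bytes_in_error(field(27, 24)) if amux_valid else None,
        # Meaningful only when EHMSTS <26> or <25> is set (not checked here).
        "gpr_address": field(23, 16),
        "vmq_bytes": bytes_in_error(field(15, 12)) if vmq_valid else None,
    }
```

A field reported as None means the conditions that make it valid were not met, which is different from a valid field with no bytes flagged.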
CSLINT   CONSOLE AND INTERRUPT STATUS WORD
(ESCRATCH LOC: 1C - T0C)

CSLINT (Console and Interrupt Register). This is a two part register. Bits <31:16> reflect the System Interrupt Status and bits <15:00> reflect the CBus Status.

<31:30>  RESERVED

<29:23>  KA8600 INTERRUPTS
         Setting a bit in this field will result in a CPU interrupt.

<29>     CONSOLE HALT PENDING
         EBE3 CSL HP
         Posted by the console. Indicates that the console wants to interrupt the EBox and force it to execute a Console Halt. The EBox microcode will be forced to the CSM wait loop.

<28>     CPU POWER FAIL
         EBC2 CPU PF INTR LVL3
         Posted by the console. Indicates that the EMM has notified the console about an impending power failure. VMS has approximately 300 ms to shut down in an orderly manner.

<27>     MBOX ERROR
         EBC2 MBOX INTR LVL3
         Posted by the MBox. Indicates that the MBox detected an error of any type (excluding TB parity errors). The interrupt is handled by the EBox at IRD time. (Same as EBCS <14>.)

<26>     INTERVAL TIMER
         EBE9 TIME INTR LVL3
         Posted by the EBox. Indicates that the Interval Count Register has overflowed.

<25>     CONSOLE RL02
         EBC2 RL INTR LVL3
         Posted by the console. Indicates that the console wants to interrupt the EBox to transfer a byte to or from the RL02. During an RL02 read (SNAP files, etc.), the T11 assembles 512 bytes and then transfers them to the EBox a byte at a time via this interrupt mechanism.
         During an RL02 write, the T11 interrupts the EBox for a byte of data until it has assembled 512 bytes. It then transfers the data to the RL02.

<24>     CONSOLE RECEIVE
         EBC2 TRX INTR LVL3
         Posted by the console. Indicates that the console has acknowledged the last CPU transmission.

<23>     CONSOLE TRANSMIT
         EBC2 TTX INTR LVL3
         Posted by the console. Indicates that the console wants to transfer a byte of information to the CPU via the CBus.

<22:21>  IOA WITH HIGHEST INTERRUPT REQUEST
         EBC1 EXT INTR SRC 1:0
         The I/O Adapter (SBIA) with the highest IPR.

<20>     INTERRUPT SOURCE (0=EXT/1=INT)
         EBC3 INT INTR
         Set by hardware. When set, indicates that the source of the pending interrupt is internal. When clear, indicates that the source of the pending interrupt is external.

<19:16>  EBOX IPR STATUS <03:00>
         EBC3 ACTIVE IPR 3:0
         This field represents the least significant four bits of the highest active hardware (internal or external) interrupt request during the previous machine cycle. When this field is zero, there was no hardware interrupt.

<15:08>  CBUS DATA <D7:D0>
         EBE2 CBUS IN REG D7:D0
         Although this register is used primarily to receive data from the console, all data transfers between the console and the CPU are latched in this register.

<07>     CBUS CLOCK
         EBE2 CBUS CLOCK
         This bit represents the state of the CBus Clock Line.
         The CBus Clock must be asserted by writing a one to this bit and negated (in a subsequent microinstruction) by writing a zero to this bit in order to complete a CBus read or write operation.

<06>     CBUS R/W (0=RD CSL/1=WR CSL)
         EBE2 CBUS WRITE
         This bit determines whether a CPU read or write operation is to be performed over the CBus. The EBox is enabled to drive the CBus data lines when this bit is set. The console is enabled to drive the CBus data lines when this bit is reset.

<05:00>  CBUS ADDRESS <A5:A0>
         EBE2 CBUS A5:A0
         The address of the console location to be read or written is loaded into this register.

IBESR   IBOX ERROR STATUS WORD
(ESCRATCH LOC: 1D - T0D)

This register is a combination of the IBox Error register (IBE) and EBox Diagnostic Maintenance register (EDMS).

<31>     RESERVED

<30:29>  IBOX AMUX BYTE IN ERROR CODE
         EBEA IAMUX EC <1:0> LTH
         Indicates the most significant byte associated with the IBox AMux parity error.

         Code   Byte
         00     0
         01     1
         10     2
         11     3

<28>     IAMUX SOURCE (0=GPR/1=WBUS)
         EBEA IWBUS DATA LTH
         Indicates that the WBus was the input to the IBox AMux when an error was detected at the output of the IBox AMux.

<27>     RESERVED MODE DETECTED
         EBEA RSV MODE LTH
         Indicates that an operand specifier attempted to use an addressing mode that is not allowed in the situation in which it occurred.

<26>     IBOX BMUX PARITY ERROR
         EBEA IBMUX PE LTH
         Indicates that a parity error was detected at the output of the IBox BMux. The input to the BMux was either the IBuffer or the IMD Latch.

<25>     INST BUFFER PARITY ERROR
         EBEA IBUF PE LTH
         Indicates that a parity error was detected on either the OPCode bytes (Byte 0 or Byte 1 in the IBuffer), or on the byte selected by the RMode Finder (IBGPR) during optimization.

<24>     RLOG PARITY ERROR
         EBEA RLOG PE LTH
         Indicates that a parity error was detected while unwinding the RLog.

<23>     IBOX AMUX PARITY ERROR
         EBEA IAMUX PE LTH
         Indicates that a parity error was detected at the output of the IBox AMux. The input to the AMux was either the WBus or a GPR. See IBESR <30:29> and <28>.

<22>     IBOX DRAM PARITY ERROR
         EBEA IDRAM PE LTH
         Indicates that a parity error was detected on the DRAM data.

<21>     IBOX CS PARITY ERROR
         EBEA ICS PE LTH
         Indicates that a parity error was detected on the IBox Control Store data.

<20:16>  RESERVED

<15:08>  EDMS REGISTER BITS <D15:D08>

<15>     EBOX MICRO TRAP ENABLE
         EBD3 ETRAP LOGIC EN
         Enables the EBox microtrap mechanism. This in turn allows CPU error reporting.

<14>     RESERVED

<13:11>  MICRO TRAP PRIORITY LEVEL <2:0>
         EBDE UTRP PRIORITY
         This field indicates the priority of the last microtrap request.

         Priority   Microtrap Type
         0          EBox Read/Write Microtrap
         1          OP Write Microtrap
         2          IBox Error Microtrap
         3          Misc Microtrap
         4          Fork Microtrap
         5          IMD Read Microtrap
         6          ID Read Microtrap
         7          STRING Read Microtrap

<10>     OPBUS SOURCE (0=ID/1=IMD)
         EBD5 SRC IMD LVL3
         This bit is only valid when IBESR <09:08> = 3. When set, this bit indicates that the OPBus data came from the IMD register. When cleared, it indicates that the OPBus data came from the ID register.

<09:08>  UOPSEL <1:0>
         EBD5 UOPSEL <1:0>
         This field indicates the OPBus data source.

         Code   Data Source
         0      IBox Register Select
         1      Operand Source is EMD
         2      Operand Source is IBUFFER
         3      Operand Source is IMD or ID Registers (See Bit 10 above)

<07:03>  RESERVED

<02>     ESA VALID
         This bit is set by the EHM if the IBox ESA VALID bit is set.

<01>     ISA VALID
         This bit is set by the EHM if the IBox ISA VALID bit is set.

<00>     CPC VALID
         This bit is set by the EHM if the IBox CPC VALID bit is set.

EBXWD1   EBOX WORD 1
(ESCRATCH LOC: 1E - T0E)

<31:00>  LAST EBOX WORD WRITTEN
         LAST WORD THAT THE EBOX SENT TO MEMORY TO BE WRITTEN (TOP OF STACK POINTER)
         This register contains a copy of the last word that the EBox wrote to the MBox. This word is obtained by popping the last word off the Scratch Pad Stack (EScratch Location F0).

EBXWD2   EBOX WORD 2
(ESCRATCH LOC: 1F - T0F)

<31:00>  SECOND TO LAST EBOX WORD WRITTEN
         SECOND TO LAST WORD THAT THE EBOX SENT TO MEMORY TO BE WRITTEN (TOP OF SCRATCH PAD MINUS 1)
         This register contains a copy of the second to last word that the EBox wrote to the MBox. This word is obtained by popping the second to last word off the Scratch Pad Stack (EScratch Location F1).

IVASAV   VIRTUAL ADDRESS SAVE REGISTER
(ESCRATCH LOC: 20 - T10)

IVASAV (IBox Virtual Address Save) - IBox virtual address for operand (OP port) fetch and result storage. This register contains the last virtual address that was calculated by the Address Calculation Unit and acknowledged by the MBox with PA ACK. Therefore, IVASAV will contain the Virtual Address associated with either the current operand fetch cycle, or the current result store cycle.

<31:00>  Last Virtual Address used by the IBox for either an Operand Fetch or Result Store.

VIBASAV   VIRTUAL IBUFFER ADDRESS SAVE REGISTER
(ESCRATCH LOC: 21 - T11)

<31:00>  IBOX IBUF PORT VIRTUAL ADDRESS
         IBUFFER VIRTUAL ADDRESS FOR INSTRUCTION BUFFER (IBUF PORT) FETCHES
         This register contains the last IBuffer virtual address that was acknowledged by the MBox (PA ACK).

ESASAV   EBOX STARTING ADDRESS SAVE REGISTER
(ESCRATCH LOC: 22 - T12)

<31:00>  CURRENT PC FOR EXECUTION UNIT (EBOX)
         VIRTUAL ADDRESS OF INSTRUCTION CURRENTLY BEING PROCESSED BY EBOX
         This register contains the address (Macro PC) of the instruction that the EBox or FBox is currently processing.
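
Each register heading above pairs an EScratch location with a "T" label (for example 19 - T09 and 20 - T10). In every legible pair the T number is the location minus hex 10. The Python sketch below records the words described so far and derives the label from that observed pattern. The pattern is inferred from the headings and is not stated as a rule in the text, the function names are invented for illustration, and digits damaged in the scanned headings are restored on the assumption that the locations run consecutively.

```python
# EScratch locations (hex) of the stack-frame words described so far.
# Damaged heading digits are restored assuming consecutive locations 19..22.
FRAME_WORDS = {
    0x19: "EVMQSAV",
    0x1A: "EBCS",
    0x1B: "EDPSR",
    0x1C: "CSLINT",
    0x1D: "IBESR",
    0x1E: "EBXWD1",
    0x1F: "EBXWD2",
    0x20: "IVASAV",
    0x21: "VIBASAV",
    0x22: "ESASAV",
}


def t_label(escratch_loc):
    """Return the 'Txx' label that the headings pair with a location.

    Uses the observed offset of hex 10 between the two numbers.
    """
    return "T{:02X}".format(escratch_loc - 0x10)


def describe(escratch_loc):
    """Format one heading line, e.g. '22 - T12  ESASAV'."""
    name = FRAME_WORDS.get(escratch_loc, "(not covered here)")
    return "{:02X} - {}  {}".format(escratch_loc, t_label(escratch_loc), name)


if __name__ == "__main__":
    for loc in sorted(FRAME_WORDS):
        print(describe(loc))
```

A table like this is mainly useful as a cross-check when a heading digit is unreadable: the neighbouring entries fix what the missing value must be.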
ISASAV   IBOX STARTING ADDRESS SAVE REGISTER
(ESCRATCH LOC: 23 - T13)

<31:00>  CURRENT PC FOR ADDRESS CALCULATION UNIT
         PC OF THE INSTRUCTION THE IBOX ADDRESS CALCULATION UNIT IS CURRENTLY PROCESSING
         This register contains the address (Macro PC) of the instruction that the IBox Address Calculation Unit is currently processing.

CPC   CURRENT PROGRAM COUNTER
(ESCRATCH LOC: 24 - T14)

<31:00>  CURRENT PROGRAM COUNTER
         PC OF THE INSTRUCTION THE IBUFFER IS CURRENTLY PROCESSING
         This register contains the address (Macro PC) of the instruction that the IBox Address Calculation Unit will process next.

MSTAT1   MBOX STATUS WORD 1
(ESCRATCH LOC: 25 - T15)

This register is made up of the following MBox registers: 2C (byte 3), 28 (byte 2), 24 (byte 1), and 20 (byte 0).

<31:24>  MBOX REGISTER 2C (MCC STATUS 1)
         SOURCE:   MCCJ (CRA) MBOX REG 2C (MCC STATUS 1)
         ACCESS:   Read Only
         HELD AT:  MCCJ TOT CYC ERR + T2

<31:30>  MBOX DATA DESTINATION CODE
         MCCC LAST DEST CP <1:0>
         Indicates the destination code associated with the error.

         Code   Destination
         00     IBUF (IBox Fetch - Don't Load Tail Pointer)
         01     IMD (IBox Operand Fetch/Store)
         10     EMD (EBox Fetch/Store)
         11     IBUF (IBox Fetch - Load Tail Pointer)

<29:26>  MBOX CYCLE TYPE
         MCCD U CYC TYP <3:0>
         Indicates the microword cycle type associated with the error.

         CODE          CYCLE
         0000          NOP
         0001          READ REG
         0010          WRITE REG
         0011          WRITEBACK
         0100          ABUS ARRAY WRT
         0101          DATA CORRECTION
         0110          CLEAR CACHE
         0111          TB PROBE
         1000 - 1111   ABUS, ABUS, CP REFILL, INVAL TB, TB CYC, CP ARRAY WRT, CP WRITE, CP READ REFILL

<25:24>  LONGWORD COUNT
         MCCJ WD CNT <03:02>
         Indicates the longword that was being processed when the error was detected.

<23:16>  MBOX REGISTER 28 (MCC STATUS 3)
         SOURCE:   MCCJ (CRB) MBOX REG 28 (MCC STATUS 3)
         ACCESS:   Read Write
         HELD AT:  MCCM CYC ERR SUM + T0

<23>     CYCLE PARAMETER RAM B PARITY ERROR
         MCCC CPR PERR B
         Indicates that a Cycle Parameter RAM B parity error was detected. The MBox response is unpredictable.

<22>     CYCLE PARAMETER RAM A PARITY ERROR
         MCCC CPR PERR A
         Indicates that a Cycle Parameter RAM A parity error was detected. The MBox response is unpredictable.

<21>     ABUS DATA PARITY ERROR
         MCD3 ABUS DAT PERR
         Indicates that a longword parity error was detected on the ABus Data Field during either a CP I/O READ or DMA WRITE Cycle.
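
As a worked illustration of the MSTAT1 fields described so far (bits <31:21>), the Python sketch below extracts the destination code, cycle type, longword count and the three parity flags from a 32-bit value. The function and key names are invented for this example. Only cycle type codes 0000 through 0111 are named, because the assignments for 1000 through 1111 cannot be read reliably from the scanned table; those are reported numerically.

```python
# Cycle type names for codes 0000-0111 only; higher codes are left unnamed
# because their assignments are not legible in the source table.
CYCLE_TYPES = {
    0b0000: "NOP",
    0b0001: "READ REG",
    0b0010: "WRITE REG",
    0b0011: "WRITEBACK",
    0b0100: "ABUS ARRAY WRT",
    0b0101: "DATA CORRECTION",
    0b0110: "CLEAR CACHE",
    0b0111: "TB PROBE",
}


def decode_mstat1_high(word):
    """Decode MSTAT1 bits <31:21> of a 32-bit stack-frame word."""
    cycle = (word >> 26) & 0xF
    return {
        "destination_code": (word >> 30) & 0x3,      # <31:30>
        "cycle_type": cycle,                         # <29:26>
        "cycle_name": CYCLE_TYPES.get(cycle, "code {:04b}".format(cycle)),
        "longword_count": (word >> 24) & 0x3,        # <25:24>
        "cpr_b_pe": bool(word & (1 << 23)),          # <23>
        "cpr_a_pe": bool(word & (1 << 22)),          # <22>
        "abus_data_pe": bool(word & (1 << 21)),      # <21>
    }


if __name__ == "__main__":
    print(decode_mstat1_high(0x9F800000))
```

The longword count returned here is the same MSTAT1 <25:24> value that the memory array fault isolation procedure uses to pick the failing longword.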
<20>     ABUS COMMAND OR MASK PARITY ERROR
         MCC4 ABUS CNTL PERR
         If bit <18> is set, this bit indicates that a parity error was detected on the Command/Length field during an ABus Command Address cycle. If bit <18> is reset, then this bit indicates that a parity error was detected on the Mask/Status Field during an ABus Data Cycle.

<19>     ABUS ADDRESS PARITY ERROR
         MAP2 ABUS ADR PERR
         Indicates that a parity error was detected on the Address Field associated with an ABus Command Address Cycle.

<18>     ABUS COMMAND/ADDRESS CYCLE
         MCCJ ABUS LD CMD
         Indicates that the MBox was executing a DMA Command Address cycle when an error was detected.

<17:16>  IOA SELECT <01:00>
         MCC4 ABUS SEL <01:00>
         Indicates the ABus Adapter that was selected when the error was detected.

<15:08>  MBOX REGISTER 24 (ADDRESS STATUS)
         SOURCE:   MAPP (REG) MBOX REG 24 (ADDR STATUS)
         ACCESS:   Read Only
         HELD AT:  Self holding at T3

<15>     TB MISS
         MAP3 TB HIT
         Indicates that the TB Tag (indexed by VA <16:09>) did not match VA <30:17> or the valid bit was not set. The MBox will send a Port Status of "A" (TB Miss) back to the EBox.

<14>     BLOCK HIT
         MAPL BUF BLOCK HIT
         Indicates that either the Tag for Cache 0 or the Tag for Cache 1 matched PA <28:13>, at least one valid bit was set, and an I/O adapter was not selected.

<13>     CACHE 0 TAG MISS
         MAP3 C0 TAG MAT
         Indicates that the Tag in Cache 0 did not match PA <28:13>.

<12>     CACHE MISS
         MAPL BUF CACHE HIT
         Indicates that either PA <28:13> failed to match both the Cache 0 Tag and the Cache 1 Tag or, if a match did occur, the valid bit for the target word(s) was not set.
<11>     TB VALID PARITY ERROR
         MAPP TB VAL ERR
         Indicates that a parity error was detected on the valid bit when the TB was read. The MBox will send a Port Status of "8" (TB Error) back to the EBox.

<10>     TB PTE B PARITY ERROR
         MAP3 PTE B PAR ERR
         Indicates that a parity error was detected when the TB PTE RAM containing PA <24:09> was read. The MBox will send a Port Status of "8" (TB Error) back to the EBox.

<09>     TB PTE A PARITY ERROR
         MAP4 PTE A PAR ERR
         Indicates that a parity error was detected when the TB PTE RAM containing PA <29:25>, PROTECTION <D:A>, and MODIFY Bits was read. The MBox will send a Port Status of "8" (TB Error) back to the EBox.

<08>     TB TAG PARITY ERROR
         MAP3 TB TAG PAR ERR
         Indicates that a parity error was detected when the TB TAG RAMs were read. The MBox will send a Port Status of "8" (TB Error) back to the EBox.

<07:00>  MCDU REGISTER 20 (DATA STATUS)
         SOURCE:   MCDU REG 20 (DATA STATUS)
         ACCESS:   Read Only
         HELD AT:  MCCJ TOT CYC ERR + T3

<07:04>  CPU WRITE PE (BYTE IN ERROR)
         UFO5 WR BYT <3:0> PERR (MCDU)
         This field indicates that a parity error was detected on the data received from the CPU during a CPU write. Furthermore, this field indicates which byte(s) had the parity error.

<03>     CACHE DATA PARITY ERROR
         UFO5 CACHE BYT PERR (MCDU)
         Indicates that a byte parity error was detected on data read from cache during any Cache operation other than a CPU Byte Write Hit. Includes: CPU Read, DMA Read, Writeback, and DMA Masked Write.
<02>     CACHE 1 SELECT
         MAPL CACHE 1 DAT
         This bit indicates the Cache that was selected when the error was detected.

<01>     ARRAY READ
         MCCJ ANY REFILL
         Indicates that a Cache Refill was in progress when an error was detected.

<00>     CACHE DATA PARITY ERROR DURING BYTE WRITE
         UFO5 CACHE BWRT PERR (MCDU)
         Indicates that a Cache Data Parity Error was detected during a CP Byte-write Operation.

MSTAT2   MBOX STATUS WORD 2
(ESCRATCH LOC: 26 - T16)

This register is made up from the following MBox Registers: 54 (byte 2), 5C (byte 1), and 58 (byte 0).

<31:24>  VMS SUPPLIED
         This field is supplied by VMS and describes the array type that was selected at the time that the error occurred.

<31:28>  RESERVED

<27>     ARRAY TYPE CODE VALID
         Indicates that the Array Type Code is valid.

<26:24>  ARRAY TYPE CODE
         Indicates the Array type that was selected when the error occurred.

         CODE   ARRAY TYPE
         000    Reserved
         001    16 Mb Array
         010    04 Mb Array
         011    64 Mb Array
         100    Reserved
         101    Reserved
         110    Reserved
         111    Reserved

<23:21>  RESERVED

<20:16>  MBOX REGISTER 54 (PAMM)
         SOURCE:   MAP9 (RAM) MBOX REG 54

<20:16>  PAMM CONFIGURATION CODE <A:1>
         MAP9 PAMM CONF <A:1>
         The Error Handling Microcode uses MEAR <29:20> to address the PAMM and then loads the five-bit PAMM code into this field. Normally this field will indicate the Array Module or I/O Adapter selected when the error was detected. If, however, the error involved a CP to I/O transfer, then this field will not be valid.

         CODE      SELECTS
         00        Array Slot 0
         01        Array Slot 1
         02        Array Slot 2
         03        Array Slot 3
         04        Array Slot 4
         05        Array Slot 5
         06        Array Slot 6
         07        Array Slot 7
         08 - 17   Reserved
         18        I/O Adapter 0
         19        I/O Adapter 1
         1A        I/O Adapter 2 (Not used)
         1B        I/O Adapter 3 (Not used)
         1C        Reserved
         1D        Reserved
         1E        Reserved
         1F        Non-Existent Address

         If a 16 MB Array was selected (MSTAT2 <26:24> = 001), use the syndrome (MDECC <14:09>) to determine which bit is in error. Use MEAR <23:22> and the Longword Count (MSTAT1 <25:24>) to determine the failing longword. Now, identify the failing SMU from the following chart. If check bit C0 is at fault, MEAR <22> determines which of the two rows is to be used.

                                  FAILING LONGWORD
         DATA BIT IN ERROR        wd 0   wd 1   wd 2   wd 3
         0 - 15, C2, C4, C5       SMU1   SMU2   SMU3   SMU5
         C0 (if MEAR <22> = 0)
         C0 (if MEAR <22> = 1)

         If a 64 MB array was selected (MSTAT2 <26:24> = 011), use the longword count (MSTAT1 <25:24>) to determine the failing SMU according to the following chart.

         MSTAT1 <25:24>   FAILING DAUGHTERBOARD
         00               SMU1
         01               SMU2
         10               SMU3
         11               SMU4

<15:08>  MBOX REGISTER 5C (MCC STATUS 2)
         SOURCE:   MCCJ (CRA) MBOX REG 5C (MCC STATUS 2)

<15>     CP BYTE WRITE
         MCC7 CP BYTE WRITE
         Indicates that the MBox was servicing a byte write request.

<14>     ABUS BAD DATA CODE
         MCC4 AB BAD DAT ERR
         Indicates that the CP read data received from the IO Adapter was marked as bad data via the ABus Status Field.

<13:12>  DIAGNOSTIC ABUS LENGTH/STATUS FIELD
         MCC4 LABUS STAT <1:0>
         Indicates the state of the ABus Length and ABus Status lines during an I/O diagnostic operation.

<11:08>  DIAGNOSTIC ABUS COMMAND/MASK FIELD
         MCC4 LABUS MSK <3:0>
         Indicates the state of the ABus Command and Mask lines during an I/O diagnostic operation.

<07:00>  MBOX REGISTER 58 (MCC STATUS 4)
         SOURCE:   MCCJ (CRB) MBOX REG 58 (MCC STATUS 4)

<07>     MULTIPLE ERROR
         CRB5 MULT ERR (MCCJ)
         Indicates that a second MBox error was detected before the EBox could read and clear the first error. Most or all of the information associated with the second error will be lost.

<06>     CACHE TAG PARITY ERROR
         CRB5 CACHE TAG PERR (MCCJ)
         Indicates that a parity error was detected in the address and valid bit portion of the Cache tag.

<05>     CACHE WRITTEN BIT PARITY ERROR
         CRB5 TAG WRT PERR (MCCJ)
         Indicates that a parity error was detected in the Cache tag Written Bit Field. The MBox handles the request as though the written bit was a one.
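
The PAMM configuration code table and the 64 MB array chart given earlier for MSTAT2 can be combined into a small lookup, sketched below in Python. The function names are invented for illustration. Requiring MSTAT2 <27> (ARRAY TYPE CODE VALID) before trusting the array type is an assumption about how the two fields are meant to be used together. The 16 MB case is left out because its chart is only partly legible.

```python
def pamm_selects(code):
    """Translate a five-bit PAMM configuration code (MSTAT2 <20:16>)."""
    if not 0 <= code <= 0x1F:
        raise ValueError("PAMM code must fit in five bits")
    if code <= 0x07:
        return "Array Slot {}".format(code)
    if 0x18 <= code <= 0x1B:
        # Adapters 2 and 3 are marked "Not used" in the table.
        return "I/O Adapter {}".format(code - 0x18)
    if code == 0x1F:
        return "Non-Existent Address"
    return "Reserved"  # 08-17 and 1C-1E


# Failing daughterboard for a 64 MB array, indexed by MSTAT1 <25:24>.
SMU_64MB = {0b00: "SMU1", 0b01: "SMU2", 0b10: "SMU3", 0b11: "SMU4"}


def locate_64mb_fault(mstat1, mstat2):
    """Return (selected unit, failing SMU) for a 64 MB array, else None.

    Assumes MSTAT2 <27> must be set for the array type code to be trusted.
    """
    type_valid = bool(mstat2 & (1 << 27))
    array_type = (mstat2 >> 24) & 0x7
    if not type_valid or array_type != 0b011:
        return None
    selected = pamm_selects((mstat2 >> 16) & 0x1F)
    return selected, SMU_64MB[(mstat1 >> 24) & 0x3]


if __name__ == "__main__":
    print(locate_64mb_fault(0x02000000, 0x0B020000))
```

Note that the PAMM code is not valid when the error involved a CP to I/O transfer, so the selected unit returned here should be disregarded in that case.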
20 A 1, 0 19 18 17 16 v ':"“ C”““‘“f“‘“i’" W:E ! MULTIPLE ERROR Pt 4 | CP 0 CPNXM | BUFFER ERROR CACHE WRITTEN BIT SET CRB5 WRITTEN (MCCJ) Indicates that the cache written bit was set when an error was detected. <03> NON EXISTENT MEMORY CRB5 NXM (MCCJ) | - Indicates that either the memory request was to Non-Existent Memory or an I1/0 adapter attempted to address another I1/0 adapter. An MBox Fatal Error status code is sent to the EBox for CP initiated requests, and the DMA ERROR line is asserted to the I/0 Adapter for I/0 initiated requests. All writes are cancelled. <02> CP I/0 BUFFER ERROR CRB5 CP IO BUF ERR (MCCJ) Indicates that the selected ABus request. CPU a processing while <01> Adapter detected an error ABUS MEMORY LOCK CRBS LOCK LINE (MCCJ) when set, indicates that the Memory Lock Line was being driven on the ABus. This line may be driven by either the MBox or an ABus Adapter. <00> RESERVED K-33 MACHINE CHECK STACK FRAME MDECC BIT DESCRIPTIONS MDECC MBOX DATA ECC WORD (ESCRATCH LOC: 27 — T17) IPR: 43 27 EWERT BUS ONGWORD ARITY ECC DATA CORRECTION SYNDROME 32 16 i 08 i 4 b . 2 22 07 06 21 . 0 c , 19 05 04 03 .DIAGNOSTIC ECC CHECK BIT INVERT '}’/// 02 20 C6 c5 i 4 i €3 18 17 02 01 16 00 INVERT {ARRY BUS . C Ct LONGWORD | IpARITY MR- 13822 MDECC (MBox three MBox Data ECC <31:24> Reserved <23:16> Source: Held at: <23> <22> Status registers; MCDM ECC4 70 Register) (byte 2), - 60 This (byte (ECC) MBox Reg 70 ERR HLD and T2D register 1), (DATA and 50 ECC is made (byte 0). up of ERROR) | Reserved BAD DATA SYNDROME DETECTED BD ERR (MCDM) | | The bad data bit is XORed into the ECC check bit generation whenever "known" bad data is written into either the cache or the array. A read to that 1location will result in the detection of a Bad Data code and cause this bit to be set. Note that to read check bits from cache a cache byte parity ECC4 error <21> must have BIT ERROR SINGLE ECC4 SB been found to ERR (MCDM) data word read from cache (ECC correctable) error. 
The <20> the check bit read. | or main memory had | a single bit or main memory a double bit DOUBLE ECC4 The BIT ERROR DB ERR (MCDM) data word read error <19> invoke or a from cache detectable multiple bit had error. ADDRESS PE ECC4 AP ERR (MCDM) The word fetched from memory was either written or read from the wrong location. The ECC code indicates that the parity of the address written and the parity of the address read from are different. ECC 1is generated over the data bits and a parity bit computed over the physical address bits <29:04>, <18:16> 110 ECC2 DATA <33:32> (MCDM) These bits should always return as a this field position can be used location of MDECC in a Stack Frame. K-34 binary 110. Therefore, as a key to confirm the MACHINE CHECK STACK FRAME BIT DESCRIPTIONS MDECC MDECC MBOX DATA ECC WORD (ESCRATCH LOC: 27 — T17) IPR: 43 30 | 7/ ‘ NERT GW | 29 ' <15> 2 2 23 22 09 08 07 06 21 20 19 05 04 03 pousie | “V‘ | PE | ETECTEfl eRrOR |' ERroR iz ‘ Tz /4 00 | ECC DATA CORRECTION SYNDROME ARy | s . 16 <15:08> 27 28 24 08 Source: Held at: 04, 02 0 @, MCDM (ECC) MBox Reg 60 ECC4 | | | 17 02 01 oo L e o "DIAGNOSTIG ECC CHECK BIT INVERT 65 oo 18 6 00 NVERT nl &m jan (DATA SYNDROME) ERR HLD INVERT ABUS LONGWORD PARITY ECC3 LWP INVERT REG (MCDM) When this bit is set, and bit <01> in Register 10 (Data Control 2) is set the ECC MCA will generate bad ABus longword parity. <14:09> ECC DATA CORRECTION SYNDROME ECC3 SYN REG <32:1> (MCDM) ECC3 SYN REG <32:1> syndrome generated bit number. 
(MCDM) This field corresponds to the by the ECC chip and indicates the failing SYNDROME IN OCTAL (MSB) (LSB) 70 07 0421000 66666655555444443333322222111111 0000421 65432132654654326543265432654321 DATA BIT (MSB) BA CCCCCCC 33222222222211111111110000000000 (LSB) DP 0654321 10987654321098765432109876543210 <08> Reserved <07:00> Source: <07:01> DIAGNOSTIC CHECK BIT INVERT <CP:Cl> ECC3 <CP:Cl> INVERT REG (MCDM) When set the corresponding ECC check bits will be <00> MCDM (ECC) MBox Reg 50 (DATA CHECK INVERT) inverted. INVERT ARRY BUS LONGWORD PARITY ECC3 LWP INVERT REG (MCDM) When this bit is set, and bit <01> in Register 10 (Data Control 2) is set, then the longword parity generated on ARRAY BUS data will be inverted when the ECC MCA is the check mode. This bit is write only and is read via bit <15> in MDECC (MBox Register 60). » K-35 MACHINE CHECK STACK FRAME BIT DESCRIPTIONS MERG MERG MBOX ERROR GENERATION WORD (ESCRATCH LOC: 28 — T18) IPR 47 05 mman' CMO/MASK BADABUS k’%g figfi?“" neaues*rs geporTsSpe || 'N1040146 PROG | ENABLE Gmeme 04 03 2 01 00 GENERATE EVEN PARITY 78VIADIAG RAMS | TBTAG: , TBVALD, WBIT CACHE . TAG CACHE . ADDRESS PHISICAL % MR- 13023 MERG (MBox Error Generation Register) - This registers 18 (byte 1) and 14 (byte 0). <31:16> register is made up of MBox Reserved mm_mmmmmmmmMMmwmmwM‘wMmm-m mmm“mm““”MMWWM“““%““M“”““” <15:08> Source: Note: MCCJ This (CRB) field MBox is WWM”M* Reg set 18 by (MCC CONTROL loading CRB4. <15:13> <12> <11> Control Register on Reserved INHIBIT DMA REQUESTS DMA REQ MCCJ INH When set, GENERATE MCCJ the Mbox will BAD ABUS CMD BAD not CMD/MASK honor any DMA requests. PARITY PAR When set, ABus Command/Length transferred with even parity. <10> the 3) and ABus Mask/Status will be INHIBIT ARRY SBE REPORTS INH ARY CORR REP When set, this bit prevents the MBox correctable errors. The errors will usual. 
MCCJ <09> IOA DIAG MCCJ IN from still reporting ECC be corrected as PROG IOA DIAG IN PROG When set, the Abus command/mask field and length/status are driven from MCC CONTROL REGISTER 1C <03:00>. <08> MEMORY MGMT ENABLE MCCJ MEM MAN When set, D RN O IR D A R D Source: <07:06> Reserved | is enabled. translated by the TB. When will be treated as physical checking will <07:00> EN memory management are then references | lines be disabled. All virtual references this bit is cleared, all and all TB parity error e “w““”‘”n“mmw““““u““*"-”“"wm “““mm“mn”w”mfl"““w“““ MAPP (REG) MBox Reg 14 (ADDR CONTROL 2) K-36 MACHINE CHECK STACK FRAME BIT DESCRIPTIONS MERG MBOX ERROR GENERATION WORD MERG (ESCRATCH LGC 28 — T18) IPR: 47 22 ////////////////////////////// ,/////////////////////////////////////// 00 , 13 “50“53’3 %}Ws“ Reromns | 'WPROS | ENdaLe e A , ‘ - <05> ADDRESS TB RAM VIA DIAG MAPP ADR RAM DIAG VIADIAG | T TAG , TBVALD, WBIT . TAG , ADDRESS | CACHE CACHE PHISICAL L LB k] | The effect of this bit depends on the cycle type: Write TB - Causes the TB RAMs containing physical address bits <12:09> and selected by the IVA lines to be written instead of those selected by the EVA lines. Used for diagnostic testing of TB RAMs. Read TB Causes the <contents of the TB RAMs containing physical address bits <12:09> and selected by the IVA to be read instead of those selected by the EVA 1lines. Used for diagnostic testing of TB RAMs. Diagnostic Read Cache Tag Address Causes the written Dbit, written parity valid bits and cache tag parity to be read in place of the cache tag address bits. Used for diagnostic testing of <04:00> <04> GENERATE the EVEN cache PARITY When set, the bits generate even (bad) tag RAMs. GENERATE EVEN PARITY TB TAG MCCJ EVN TB TAG PAR When set, a write to the TB will written with even parity. A result in a TB TAG parity error. <03> GENERATE EVEN « in this field will cause parity for the corresponding the MBox function. 
to result in the TB tag being read to this TB location will PARITY TB VALID MAPP GEN TB VAL ERR . When set, a write to the TB will result in the TB Valid bit field being written with even parity. A read to this TB location will result in a TB Valid parity error. <02> GENERATE EVEN PARITY CACHE WBIT MAPP GEN EVN WBIT PAR When set, parity for the cache written bit will before being location will be complemented written in the cache. A read to this Cache result in a Cache WBit parity error. MACHINE CHECK STACK FRAME MERG BIT DESCRIPTIONS MERG MBOX ERROR GENERATION WORD (ESCRATCH LOC: 28 — T18) IPR: 47 /////////////////////////////////////////////////////////////% 12 neuuasr i E s CMOMASK eports | TMNPROG | ENaBLE INHIBIT GENEMTE 1aanagus | INHIBIT | 104 DIAG ARRY SBE | A2 ggggm / ' %g%;fig GENEMTE EVEI% mm VIADIAG | TBTAG , TBVALD, WBT . TAG . ADDRESS CACHE CACHE PHISICAL MR -1 35923 - <01> - <00> GENERATE EVEN MAPP EVN GEN When set, during a result in PARITY CACHE TAG PAR TAG the cache data address tag is stored with even parity cache write. A read to this Cache location will a Cache Tag parity error. GENERATE EVEN PARITY PHYSICAL ADDRESS EVN PA PAR The effect of this bit depends on the GEN MBox Cache is type performing. or Array Writes generated as complemented. part If of the When the write set, ECC was to the of address character the operation parity generation array this will the bit be condition will be detected when that array location is next read. If the write was to the Cache a Cache Byte parlty error will have to be forced by some other means in order to detect this condition. DMA - The MBox will detect an ABus Address CP to I/0 Transfers - The ABus Adapter will parity and set CP I1/0 Buffer Error. TB Writes - Bad parity will be forced K-38 Parity Error. detect bad address in both PTE A and PTE B. 
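The MDECC and MERG bit assignments described above can be checked by hand with a few shifts and masks. The following Python sketch is illustrative only: the function and table names are hypothetical (they are not part of VMS, SPEAR, or the console software), and the bit positions are simply transcribed from the descriptions in this appendix. It confirms the 110 key in MDECC <18:16>, lists the error bits that are set, and prints the correction syndrome in octal so it can be looked up in the syndrome table.

```python
# Illustrative sketch only. All names here are hypothetical; the bit
# positions are copied from the MDECC and MERG descriptions above.

MDECC_ERROR_BITS = {
    22: "BAD DATA SYNDROME DETECTED",
    21: "SINGLE BIT ERROR",
    20: "DOUBLE BIT ERROR",
    19: "ADDRESS PE",
}

MERG_BITS = {
    12: "INHIBIT DMA REQUESTS",
    11: "GENERATE BAD ABUS CMD/MASK PARITY",
    10: "INHIBIT ARRY SBE REPORTS",
    9: "IOA DIAG IN PROG",
    8: "MEMORY MGMT ENABLE",
    5: "ADDRESS TB RAM VIA DIAG",
    4: "GENERATE EVEN PARITY TB TAG",
    3: "GENERATE EVEN PARITY TB VALID",
    2: "GENERATE EVEN PARITY CACHE WBIT",
    1: "GENERATE EVEN PARITY CACHE TAG",
    0: "GENERATE EVEN PARITY PHYSICAL ADDRESS",
}


def field(word, hi, lo):
    """Return bits <hi:lo> of a longword as an integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def set_bits(word, names):
    """Return the names of the bits that are set, highest bit first."""
    return [names[b] for b in sorted(names, reverse=True) if (word >> b) & 1]


def decode_mdecc(word):
    return {
        # <18:16> should always read as binary 110.
        "key_ok": field(word, 18, 16) == 0b110,
        "errors": set_bits(word, MDECC_ERROR_BITS),
        # <14:09> is a 6-bit syndrome, i.e. two octal digits.
        "syndrome_octal": format(field(word, 14, 9), "02o"),
    }


def decode_merg(word):
    return set_bits(word, MERG_BITS)


if __name__ == "__main__":
    print(decode_mdecc(0x00264600))
    print(decode_merg(0x0100))
```

If key_ok comes back false, the longword being examined is probably not MDECC, and the stack frame offset should be rechecked before the error bits are trusted.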
CSHCTL   CACHE CONTROL (ESCRATCH LOC: 29 - T19)

This register is made up of MBOX Register 04 (byte 0).

<31:08> RESERVED

<07:00> SOURCE: MAPP (REG) - MBOX REG 04 (ADDR CONTROL 1)

<07:04> RESERVED

<03> FORCE CACHE MISS
REG4 FORCE CACHE MISS (MAPP)
When set, forces a cache miss (overrides CACHE 1 ENABLE and CACHE 0 ENABLE). This bit is used for forcing a cache refill during diagnostics.

<02> FORCE CACHE HIT
REG4 FORCE CYC (MAPP)
This bit works in conjunction with CACHE 1 ENABLE and CACHE 0 ENABLE as shown in the following table:

    Force   C1   C0
    Hit     En   En   Function
      0     0    0    Cache Miss
      0     0    1    Cache 0 determines Hit or Miss
      0     1    0    Cache 1 determines Hit or Miss
      0     1    1    Both Caches determine Hit or Miss
      1     0    0    Cache Miss
      1     0    1    Force Hit in Cache 0
      1     1    0    Force Hit in Cache 1
      1     1    1    Illegal, gives tag parity error with Written Bit = 1

The three register bits <02:00> reflect the following:

    Code 000 - Cache 0 and 1 Disabled
    Code 001 - Cache 1 Disabled
    Code 010 - Cache 0 Disabled
    Code 011 - Normal Running Code.
    Code 100 - Cache 0 and 1 Disabled
    Code 101 - Used to force a sweep of Cache 0 if Written Bit is set. Also used for diagnostic purposes to force a hit in a given cache.
    Code 110 - Used to force a sweep of Cache 1 if Written Bit is set. Also used for diagnostic purposes to force a hit in a given cache.
    Code 111 - This is an illegal code that covers the case of a hit in both caches. A tag parity error with W=1 is also forced if, during normal operation, both caches give indications of a hit. This is considered an MBOX FATAL ERROR.

<01> ENABLE CACHE 1
REG4 CACH 1 ON (MAPP)
When set, enables Cache 1 (allows Cache 1 to be read and written).

<00> ENABLE CACHE 0
REG4 CACH 0 ON (MAPP)
When set, enables Cache 0 (allows Cache 0 to be read and written).

MEAR   MEMORY ERROR ADDRESS REGISTER (ESCRATCH LOC: 2A - T1A)

Figure: PHYSICAL ADDRESS <29:02> IN PA LATCH WHEN MBOX ERROR WAS DETECTED

This is a copy of MAP1/3 (ADA/ADB) MBOX REG 7C (ERROR ADDR). It contains the physical address present at the output of the PA Mux when the MBox detected an error.

Interpreting this address depends on the MBox Data Destination Code (MSTAT1 <31:30>) and the MBox Cycle Type (MSTAT1 <29:26>).

1.  CP Initiated Read/Write Operations (Non I/O) - MEAR will contain the address of the longword that was associated with the error unless:

    CP REFILL or WRITEBACK - (The CP request caused either a Cache Refill Cycle, or a Cache Writeback Cycle). Then MSTAT1 <25:24> (Longword Count <A3:A2>) must be used to determine the address of the longword that was associated with the error. If MSTAT1 <25:24> equals MEAR <03:02>, then MEAR contains the address of the longword associated with the error. Otherwise, you must substitute MEAR <03:02> with MSTAT1 <25:24> to determine the address.

2.  CP Initiated I/O Read/Write Cycles - MEAR is not valid for any CP initiated I/O Cycles.

3.  ABus Initiated Read/Write Cycles - MEAR will contain the starting address of the data to be processed (i.e., longword, quadword, or octaword). MSTAT1 <25:24> (Longword Count <A3:A2>) must be used to determine the address of the longword that was associated with the error. If MSTAT1 <25:24> equals MEAR <03:02>, then MEAR contains the address of the longword associated with the error. Otherwise, you must substitute MEAR <03:02> with MSTAT1 <25:24> to determine the address.

MEDR   MEMORY ERROR DATA REGISTER (ESCRATCH LOC: 2B - T1B)

Figure: DATA WORD BEING PROCESSED BY MBOX WHEN THE MBOX ERROR WAS DETECTED

This is a copy of MCD1/3 (MDP) MBOX REG 78 (DATA ERROR). It is only valid (held) for ABus Parity Errors (DMA Data PEs and CPU Read PEs only), ABus BDC Errors, and CPU Write PEs. It contains the data word latched in the MCD MCAs when the error occurred.

FBXERR   FBOX ERROR REGISTER (ESCRATCH LOC: 2C - T1C)

The EHM builds this register in the EBox Scratch Pads by reading three FBox registers: 03, 02, and 01.

<31:26> RESERVED

<25:24> EXPONENT EXTENSION <01:00>
FBRD EXT <1:0> (FA01)
This field, which is an extension of the exponent field, indicates overflow and underflow conditions as follows:

    FIELD   CONDITION
     00     Normal
     01     Overflow
     10     Underflow
     11     Underflow

<23:22> RESERVED

<21> FBM CONTROL STORE PARITY ERROR
MPZ5 CS PAR ERR (FM07)
Set when the FBM module detected an FBM Control Store Parity Error. When set, the CS Address is latched in the FBM MSQ MCA.

<20> FBA CONTROL STORE PARITY ERROR
ACC4 RAM PERR (FA13)
Set when the FBA module detected an FBA Control Store Parity Error. When set, the Control Store Address is latched in the FBA MSQ MCA.

<19> FBOX DRAM PARITY ERROR
MCB3 FDRAM PAR ERR (FM11)
Set when the FBM module detected a Dispatch RAM Parity Error. Neither the DRAM address nor the DRAM data are latched.

<18> SELF TEST ERROR
FBR3 SELF TEST ERROR (FA01)
Set when the FBox detected an error while running the Self Test. Bit <02> will indicate whether the error was associated with the FBA or FBM Module.

<17> FBOX GENERAL PURPOSE REGISTER PARITY ERROR
FBR4 GPR ERROR (FA01)
Set when the FBA Module detected a GPR Parity Error. The specific byte in error cannot be identified.

<16> RESERVED OPERAND
FBR6 RESERVED OP (FA01)
This bit generally indicates a software problem and has no significance in the context of a hardware error. It indicates that one of the operands received by the FBox had a negative sign and an exponent of zero.

<15:14> INSTRUCTION FORMAT CODE <01:00>
FBR9 FORM <01:00> (FA01)
This field indicates the format of the FBox instruction that the FBox was executing when the status was saved in this register.

    CODE   FORMAT
     00    F
     01    G
     10    D

<13:12> FPX RESULT <06:05>
FXP6 FXRES <5:6> (FA01)
This field represents the exponent result bits and has no significance in the context of an error.

<08> DENORM RESULT
FBR1 DENORM (FA01)
This bit has no significance in the context of errors. It indicates that a rounding operation resulted in a "carry out".

<07:03> RESERVED

<02> IF SELF TEST ERROR (0=FA/FM=1)
FBR4 WBUS D00 (FA01)
This bit has a double meaning. During normal operation this bit indicates that the divisor was equal to zero. If SELF TEST ERROR is set, however, this bit indicates which FBox module detected the Self Test error (0=FA/FM=1).

<01> RESERVED

<00> FBOX PROBLEM
FBR1 FBOX PROBLEM (FA01)
Set when the FBox detected one of the following conditions:

    Exponent Extension Problem ....... Bits <25:24>
    GPR Parity Error ................. Bit <17>
    FBM Control Store Parity Error ... Bit <21>
    FBA Control Store Parity Error ... Bit <20>
    FDRAM Parity Error ............... Bit <19>
    Self Test Error .................. Bit <18>
    Divide by Zero ................... Bit <02>
    Denormalize Result ............... Bit <08>
    Reserved Operand ................. Bit <16>

CSES   CONSOLE CONTROL STORE ERROR STATUS WORD (ESCRATCH LOC: 2D - T1D)

The contents of this register are generated by the console during the Control Store and Dispatch RAM ECC correction process. The console sends this register to the CPU as part of the correction.

<31> ECC CORRECTION FAILURE
Indicates that the console was unable to correct the Control Store or Dispatch RAM identified in the CS/DRAM ID field <02:00>.

<30:29> RESERVED

<28:16> CONTROL STORE ADDRESS
This is the address the console used when correcting the Control Store or DRAM Parity Error.

<15:08> CONTROL STORE SYNDROME
This is the syndrome the console used to correct the bit in error.

<07:03> RESERVED

<02:00> CONTROL STORE/DRAM IDENTIFICATION CODE
This is the code passed to the console from the EBox to determine what Control Store to apply the correction to.

    CODE   MEANING
     0     No Error
     1     EBox Control Store Parity Error
     2     IBox Control Store Parity Error
     3     IBox DRAM Parity Error
     4     FBox DRAM Parity Error
     5     FBox Adder Module Control Store Parity Error
     6     FBox Multiplier Module Control Store Parity Error
     7     MBox Control Store Errors (Data, Address, or Micro-stack)

PC   PROGRAM COUNTER (ESCRATCH LOC: 2E - T1E)

<31:00> RETURN PC FOR REI (CALCULATED BY EHM)

This register contains the PC to be used by VMS if it determines that an REI is possible. The contents of this register are determined by the Error Handling Microcode. Depending on the instruction in the pipeline that was associated with the error, this register will reflect the CPC, the ISA, or the ESA.

PSL   PROCESSOR STATUS LONGWORD (ESCRATCH LOC: 2F - T1F)

<31> COMPATIBILITY MODE
When set, indicates that the processor is in PDP-11 compatibility mode. When clear, indicates that the processor is in native (VAX) mode.

<30> TRACE PENDING
This bit works in conjunction with Enable Trace bit <04>. If Enable Trace is set at the beginning of an instruction then the processor will automatically set this bit and thus cause a trap to the Trace Handler Routine. The Trace Handler Routine will gather the desired information about the state of the system and REI (leaving the Enable Trace and Trace Pending bits alone). This will allow the next instruction to be traced. When the Trace Handler Routine wants to discontinue tracing it will clear both bits (Enable Trace and Trace Pending).

<29:28> RESERVED

<27> FIRST PART DONE
This bit is set by long instructions that can be interrupted during execution (e.g., Move String). The REI following the interrupt will continue the interrupted instruction.

<26> INTERRUPT STACK
When set, the processor is executing on the interrupt stack. Any mechanism that sets this bit also clears current mode and raises the IPL above 0. If an REI attempts to restore a PSL with IS=1 and non-zero current mode or zero IPL, a reserved operand fault is taken. When clear, the processor is executing on the stack specified by current mode. At bootstrap time, IS is set.

<25:24> CURRENT ACCESS MODE
This field indicates the access mode of the currently executing process, as follows:

    0 - KERNEL
    1 - EXECUTIVE
    2 - SUPERVISOR
    3 - USER

<23:22> PREVIOUS ACCESS MODE
This field is loaded from Current Access Mode by exceptions and CHMx instructions. It is cleared by interrupts, and restored by REI's.

<21> RESERVED

<20:16> INTERRUPT PRIORITY LEVEL
Indicates the current processor priority, in the range 0 to 1F (Hex). The processor will accept interrupts only on levels greater than the current level. At bootstrap time, IPL is initialized to 1F (Hex).

<15:08> RESERVED

<07> DECIMAL OVERFLOW TRAP ENABLE
When set, it forces a decimal overflow trap after execution of an instruction that had a conversion error, or produced a result with a decimal overflow. When this bit is clear, no trap will occur; however, the condition code V bit will still set.

<06> FLOATING UNDERFLOW EXCEPTION ENABLE
When this bit is set, it forces a floating underflow exception after execution of the instruction that produced a result with an underflow (e.g., a result exponent, after normalization and rounding, less than the smallest representable exponent for the data type). When this bit is clear, no exception occurs.

<05> INTEGER OVERFLOW TRAP ENABLE
When this bit is set, it forces an integer overflow trap after execution of an instruction that produced an integer result that overflowed or had a conversion error. When this bit is clear, no integer overflow trap will occur; however, the condition code V bit will still set.

<04> TRACE ENABLE
When this bit is set at the beginning of an instruction, it will cause Trace Pending <30> to set. When Trace Pending is set at the end of an instruction, a trace fault is taken before the execution of the next instruction. When TP is clear, no trace exception occurs.

<03:00> CONDITION CODES: N, Z, V, C

    N BIT - When set, the N (negative) condition code bit indicates that the last instruction that affected this bit produced a negative result. If this bit is clear, the result was positive or zero.

    Z BIT - When set, the Z (zero) condition code bit indicates that the last instruction which affected this bit produced a result which was zero. When this bit is clear, the result was non-zero.

    V BIT - When set, the V (overflow) condition code bit indicates that the last instruction which affected this bit either had a conversion error or produced a result whose magnitude was too large to be properly represented in the operand which received the result. When this bit is clear, there was no conversion error or overflow.

    C BIT - When set, the C (carry) condition code bit indicates that the last instruction which affected this bit either had a carry out of the most significant bit of the result or a borrow into the most significant bit. When this bit is clear, there was no carry or borrow.

EHSR   ERROR HANDLING STATUS REGISTER IPR: 4A (ESC.DA)

<31:27> EHM IS FIXING
The EHM sets the appropriate bit in this field just before it sets the corresponding bit in EBCS <31:27>. Setting a bit in EBCS <31:27> causes the EBox to interrupt the Console for Control Store or Dispatch RAM correction. The register figure labels these bits, from <31> down to <27>: FBM CS, FBA CS, FBOX DRAM, IBOX DRAM, IBOX CS.

<26> MEAR SAVED
When the EHM is called to handle an MBox Error Register Full (ERF) micro trap, it saves MEAR in ESC: DB and sets this bit. Later, when the EHM is servicing the MBox interrupt associated with the ERF micro trap, it checks this bit to determine whether it should get MEAR from the MBox or ESC: DB.

<25> PROCESS ABORT
The EHM sets this bit if it determines that: 1) An MBox CPR Parity Error failed to result in an MBox Fatal Error, or 2) The EBox failed to detect bad data on the OPBus. If this bit is set, VMS will either Bugcheck the user or the system.

<24> MBOX TRAP TO 4
This bit indicates the occurrence of an MBox trap to 4. When running with the IPL above 1D this may be the only sign of an SBE. It is used by VMB to check for single bit errors.

<23:16> MICRO-TRAP VECTOR ADDRESS
The EHM saves the entry level trap vector address in this field. The vector addresses are:

    VECTOR   REASON
      2      FBox error (Called by FBox interrupt handler)
      4      EHM detected a process abort condition during a MBox ERF micro-trap
      6      MBox error (Called by MBox interrupt handler)
      8      EBox error
      8      MBox fatal error
      8      EBox IMD read and IBox error
     10      EBox IMD read and TB parity error
     10      TB parity error (EBox port request only)
     10      EBox Op-Port-Write and IBox error
     10      IBox Op-Port-Write and TB parity error
     18      EBox fork and IBox error
     18      EBox fork and TB parity error
     18      EBox ID read and IBox error
     18      EBox string read and IBox error
     18      EBox string read and TB parity error
     1E      EBox sync failure
     1F      RLog unwind parity error

<15:08> RESERVED

<07> MBOX SERVICE REQUEST
If the interrupt handling microcode determines that it was called to handle an MBox error interrupt, it will set this flag and call the EHM. The EHM will test this flag and, if set, will process the MBox error.

<06> EHM ENTERED
This flag is used by the EHM to detect a double error trap condition. That is, those cases when a second error trap occurs before the EHM routine is able to finish processing the first error and pass control to VMS.

This flag is checked each time EHM is entered. If the flag is clear (which is the expected case), then EHM will set this flag and process the error in the normal manner.
If the flag is set, however, indicating that EHM was in the process of handling an error when it was called to handle a second error, then one of two things will happen:

1.  If the second error was detected by either the EBox or the MBox fatal error detection circuitry, then EHM will loop at UPC 21. This in turn will cause a Keep Alive Fail condition and the Console will print the following message and capture (Snap Shot) the state of the system:

        "Attempting to save machine state due to"
        "MACHINE DOUBLE ERROR"

2.  If the second error was detected by the IBox then EHM will put a code of 5 in CSM.STATUS (ESC: C0) and call the CSM.ENTRY.DE routine. This will result in a Keep Alive Fail Condition and the Console will print the following message and Snap Shot the state of the system:

        "Attempting to save machine state due to"
        "CPU ERROR HALT"

NOTE
When EHM finishes processing the error, it will set the VMS ENTERED flag, clear this flag and call the VMS Machine Check Handler.

<05> VMS ENTERED
This flag is similar to the EHM ENTERED flag. It is used to detect the case where the VMS Machine Check Handler is in the process of handling an error when a second error is detected. EHM sets this flag just before it calls the Machine Check Handler. The Machine Check Handler processes the errors and clears this flag just before it executes an REI to continue the operation (or a BUGCHECK to halt the operation).

If a second error trap occurs while the Machine Check Handler is still processing the first error, then the EHM routine will process the error in the normal manner.
That is, build a Stack Frame, clear the error condition, roll back the PC's and determine if it should call VMS.

However, since the VMS ENTERED flag is set (indicating that VMS was processing an error when a second error was detected), EHM will not call VMS. Instead, it will put a code of 5 in CSM.STATUS (ESC: C0) and call the CSM.ENTRY.DE routine. This in turn will result in a Keep Alive Fail condition and the Console will print the following message and capture (Snap Shot) the state of the system:

        "Attempting to save machine state due to"
        "CPU ERROR HALT"

<04> FBOX SERVICE REQUEST
The micro-routine that handles FBox problems will set this flag if it determines that the problem was caused by an FBox hardware error. The routine will then call the EHM routine to process the error. The EHM will test this flag to determine if it was called to handle an FBox error.

<03> FIX FBOX GPR PE
Set by the EHM when it attempts to correct an FBox GPR parity error.

<02> FIX EBOX GPRB PE
Set by the EHM when it attempts to correct an EBox GPR B parity error.

<01> FIX EBOX GPRA PE
Set by the EHM when it attempts to correct an EBox GPR A parity error.

<00> FIX IBOX GPR PE
Set by the EHM when it attempts to correct an IBox GPR parity error.
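As a reading aid for the EHSR description above, the following Python sketch separates an EHSR longword into its fields and restates the double error handling that the EHM ENTERED and VMS ENTERED flags control. It is a sketch under stated assumptions, not the EHM microcode: every function name, and the detector labels 'EBOX', 'MBOX_FATAL' and 'IBOX', are hypothetical and exist only for this illustration. The bit positions and console messages are taken from the text.

```python
# Illustrative sketch only. Names are hypothetical; bit positions and
# console messages are transcribed from the EHSR description above.

EHSR_FLAGS = {
    26: "MEAR SAVED",
    25: "PROCESS ABORT",
    24: "MBOX TRAP TO 4",
    7: "MBOX SERVICE REQUEST",
    6: "EHM ENTERED",
    5: "VMS ENTERED",
    4: "FBOX SERVICE REQUEST",
    3: "FIX FBOX GPR PE",
    2: "FIX EBOX GPRB PE",
    1: "FIX EBOX GPRA PE",
    0: "FIX IBOX GPR PE",
}


def decode_ehsr(ehsr):
    """Split an EHSR longword into the fields described in the text."""
    return {
        "ehm_is_fixing": (ehsr >> 27) & 0x1F,            # <31:27>
        "micro_trap_vector": format((ehsr >> 16) & 0xFF, "X"),  # <23:16>, hex
        "flags": [
            name
            for bit, name in sorted(EHSR_FLAGS.items(), reverse=True)
            if (ehsr >> bit) & 1
        ],
    }


def double_error_outcome(ehm_entered, vms_entered, detected_by):
    """Console message expected when a second error trap occurs.

    detected_by is one of the illustrative labels 'EBOX', 'MBOX_FATAL'
    or 'IBOX'. Returns None for a case the text does not describe.
    """
    if ehm_entered:
        if detected_by in ("EBOX", "MBOX_FATAL"):
            return "MACHINE DOUBLE ERROR"   # EHM loops at UPC 21
        if detected_by == "IBOX":
            return "CPU ERROR HALT"         # code 5 placed in CSM.STATUS
        return None
    if vms_entered:
        return "CPU ERROR HALT"             # EHM does not call VMS again
    return "normal processing"


if __name__ == "__main__":
    print(decode_ehsr((0x18 << 16) | (1 << 26) | (1 << 5)))
    print(double_error_outcome(True, False, "IBOX"))
```

The micro-trap vector is returned as a hexadecimal string so that it can be compared directly with the VECTOR column of the table above.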
CSM.STATUS   CSM STATUS WORD (ESCRATCH LOC: 0C0)

CSM.STATUS (CSM Status Register)

This is the status word used to control the Console Support Microcode (CSM). It is located in ESC C0.

<31:16> DOUBLE ERROR STATUS
This field is normally equal to zero. If this field is non-zero then it indicates that a non-EBox double error condition has occurred.

<15:08> EXECUTION STATUS
If this field is equal to zero it indicates that CSM is executing in the EBox. If this field is not equal to zero (i.e., non-zero) then it indicates that the VAX ISP microcode is executing in the EBox.

<07:00> ENTRY CODE
This field contains a code that indicates the reason why CSM was started.

    Code   Reason
     00    CSM is not executing.
     04    Interrupt Stack not Valid (Software Error).
     05    A non-EBox double error was detected.
     06    Halt Instruction was decoded in Kernel Mode.
     07    SCB Vector Code <1:0> = 3 (Software Error).
     08    SCB Vector Code <1:0> = 2 (no WCS Micro code).
     09    Pending Error on Halt.
     0A    CHMx Instruction decoded and Interrupt Stack = 1.
     0B    CHMx Instruction decoded and Vector <1:0> not equal to 0.
     10    Micro-break was encountered which caused the console to start CSM at CSM.ENTRY.MICRO:.
     11    Console Halt Pending was set which caused CSM to start running on AFork at A.CSM.ENTRY.MICRO:.
     15    [...]ed CSM with CSM.ERROR: routine. The Console should never see this code. If it does then there is something very wrong in the CPU.
     16    The CPU was powered up and the Console start[...] at CSM.ENTRY.PO:. During the power up sequence this code is used by [...] interface to the FIND 64KB and FIND RPB procedures.
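The three CSM.STATUS fields described above can be separated as shown in the following Python sketch. The function name and the dictionary are hypothetical, introduced only for this illustration. Only the entry codes whose reasons are fully legible in the table are included, in shortened form; any other value is reported as unlisted rather than guessed.

```python
# Illustrative sketch only. Names are hypothetical; field boundaries and
# entry codes are transcribed from the CSM.STATUS description above.

ENTRY_CODES = {
    0x00: "CSM is not executing",
    0x04: "Interrupt Stack not Valid (Software Error)",
    0x05: "A non-EBox double error was detected",
    0x06: "Halt Instruction was decoded in Kernel Mode",
    0x07: "SCB Vector Code <1:0> = 3 (Software Error)",
    0x08: "SCB Vector Code <1:0> = 2 (no WCS Micro code)",
    0x09: "Pending Error on Halt",
    0x0A: "CHMx Instruction decoded and Interrupt Stack = 1",
    0x0B: "CHMx Instruction decoded and Vector <1:0> not equal to 0",
}


def decode_csm_status(word):
    """Split CSM.STATUS into its three fields."""
    double_error = (word >> 16) & 0xFFFF   # <31:16>
    execution = (word >> 8) & 0xFF         # <15:08>
    entry = word & 0xFF                    # <07:00>
    return {
        "non_ebox_double_error": double_error != 0,
        # Zero means CSM is executing; non-zero means VAX ISP microcode.
        "running": "CSM" if execution == 0 else "VAX ISP",
        "entry_code": format(entry, "02X"),
        "entry_reason": ENTRY_CODES.get(entry, "unlisted code"),
    }


if __name__ == "__main__":
    print(decode_csm_status(0x00000005))
```

An entry code of 05 is the value the EHM places in CSM.STATUS before calling CSM.ENTRY.DE, so it is the code to expect after a "CPU ERROR HALT" snapshot.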
APPENDIX L
SBIA STACK FRAME BIT DESCRIPTIONS

The VMS machine check handler will build an I/O adapter (IOA) stack frame (entry code 13) for SBI ALERT, SBI FAULT, and SBI error interrupts. In addition, the VMS machine check handler will build an IOA stack frame if it was called to handle a machine check (entry code 4) and it determines that the machine check is related to an I/O error (i.e., CP IO BUF ERROR was set during a CP read operation).

This appendix contains bit and field descriptions for the major control and status registers (listed below) that make up an IOA stack frame.

Register   Name
TOADR      SBIA SBI Time Out Address
SBIERR     SBIA SBI Error
MAINT      SBIA SBI Maintenance
SILOCOMP   SBIA SBI Silo Compare
SBISTS     SBIA SBI Fault/Status Register
CR         SBIA Configuration Register
CSR        SBIA Control and Status Register
ERRSUM     SBIA Error Summary
DIAGCS     SBIA Diagnostic Control and Status
DMAICA     SBIA DMAI Command/Address Register
DMAIID     SBIA DMAI ID Register
DMAACA     SBIA DMAA Command/Address Register
DMAAID     SBIA DMAA ID Register
DMABCA     SBIA DMAB Command/Address Register
DMABID     SBIA DMAB ID Register
DMACCA     SBIA DMAC Command/Address Register
DMACID     SBIA DMAC ID Register

SBIA SBI TIMEOUT ADDRESS (TOADR)
SBI TIMEOUT ADDRESS REGISTER   2008 0038, 2208 0038

Register layout: <31:28> read as zeros; <27:00> SBI LONGWORD PHYSICAL ADDRESS.
NOTE: ALL BITS READ ONLY. BITS <31:28> READ AS ZEROS.

The SBI timeout address register holds the address for all CPU transactions in the SBIA. When an error occurs on a CPU transaction, the timeout address register is locked.

<31:28>  MBZ (SS36)
These bits are forced to zero by the zero fill logic.

<27:00>  SBI LONGWORD PHYSICAL ADDRESS
BUS REG D <27:00> (SS27)
The Timeout Address Register is loaded with the physical address, for the CPU command, each time a command/address is transferred from the file data latch to the command/address latch. It will be locked if Error Summary Register bit <23> is set.

SBIA SBI ERROR REGISTER (SBIERR)
SBI ERROR REGISTER   2008 0034, 2208 0034

NOTE: BITS <31:16> READ ONLY (UNDEFINED). BITS <15:13>, <9>, AND <07:00> READ ONLY AS ZEROS.

The SBI error register stores information about CPU transactions which failed on the SBI due to timeout or error confirmation.

<31:16>  ZERO (SS36)
These bits are read as zeros provided by the zero fill logic.

<15:13>  ZERO (SS32)
These read only bits are forced to zero by hardware ground potential.

<12>  CP TIMEOUT
BUS REG D<12> (SS32)
This bit will be set when there is an SBI timeout on a CPU reference for one of the following reasons:

1. Unsuccessful access: the SBIA does not receive an acknowledge confirmation for a CPU command/address or write data within 512 SBI cycles (102.4 microseconds) from the time the SBIA first requests the SBI. Unsuccessful access can be caused by the following:
   a. SBIA is unable to win the SBI through bus arbitration
   b. Target NEXUS is always busy when accessed
   c. The address is for a non-existent device or address
   d. Combinations of the first two
2. Read data timeout: the SBIA does not receive the read data within 512 SBI cycles of the acknowledge for the command/address.

When this bit sets, Error Summary Register <23> will be set, locking Error Summary Register <31:26> (type of reference), SBI Error Register <11:10, 08>, and the Timeout Address Register (referenced address). This bit is reset when the CPU writes it to a one. This will also reset <11:10, 08>.

<11:10>  CP TIMEOUT STATUS <01:00>
BUS REG D <11:10> (SS32)
The timeout status bits are made up of two signals as follows:

1. Bit <01>: State machine is in the read pending state.
2. Bit <00>: SBI confirmation equals 01, busy.

Together these two signals indicate the type of SBI timeout.

a. 00: SBI NEXUS did not respond (No Response)
b. 01: Device was busy (Busy)
c. 10: Waiting for read data
d. 11: Cannot happen

These bits are locked by Error Summary Register <23>. These bits are reset when the CPU writes SBI Error Register <12>.

<09>  ZERO (SS32)
These read only bits are forced to zero by hardware ground potential.

<08>  CPU SBI ERROR CONFIRMATION
BUS REG D <08> (SS32)
This bit is set when the SBI state machine enters the abort state on a CPU read/write if the SBI NEXUS has returned an error confirmation to the command/address cycle. If this bit is set, Error Summary Register <23> will be set, locking the Timeout Address Register, Error Summary Register <31:26>, and bits <12:10> of this register. This bit is reset when the CPU writes bit <12>.

<07:00>  ZERO (SS32)
These read only bits are forced to zero by hardware ground potential.

SBIA SBI MAINT REG (MAINT)
SBI MAINTENANCE REGISTER   2008 0044, 2208 0044

NOTE: BITS <22:12> AND <10:09> AND <07:05> READ ONLY AS ZEROS.

This register is used as a diagnostic and maintenance tool. Operational software does not use this register.
<31>  FORCE P0 REVERSAL
SS24 FRC P0 REV ON SBI
When set, this bit will cause the SBIA to transmit bad P0 parity on the SBI for all SBIA to SBI transactions. This includes CPU read/write and DMA read data.

<30>  FORCE WRITE SEQUENCE FAULT
SS24 FORCE WSQ FAULT
When set, this bit will force SBI TAG <01> to a logic 1. When used with a CPU write to an SBI nexus register, it will force the write data tag to 111, the diagnostic tag. This will cause a write sequence fault because SBI devices are looking for a tag of 101, write data.

<29>  FORCE UNEXPECTED READ FAULT
SS24 FORCE UNEXP READ
When this bit is set, the maintenance ID, bits <27:23>, with a TAG of zero, will be repeatedly transmitted on the SBI (the data is undefined). When the nexus, as selected by the maintenance ID, receives read data (TAG = 0), it should assert BUS SBI FAULT because of the unexpected read data.

<28>  FORCE MULTIPLE TRANSMITTER FAULT
SS24 FORCE MULTI
Setting this bit forces a multiple transmitter fault in any selected nexus. The CPU will load the maintenance ID with the ID of the selected nexus, then read that nexus configuration register. On the cycle after the command/address is transmitted on the SBI, the SBIA will continually enable the SBI to transmit a TAG = 111 with the maintenance ID (data is undefined). When the nexus transmits the read data, the ID transmitted by the nexus is the SBIA's ID. It will be ORed with the maintenance ID, and as long as the bits are not masked, will cause the nexus to detect a multiple transmitter fault.
<27:23>  MAINTENANCE ID <04:00>
SS24 MAINT ID <04:00>
These bits are used to generate the maintenance ID in the following instances:

1. Generation of unexpected read fault
2. Generation of multiple transmitter fault
3. Used by the silo as the compare ID
4. Used to check ID logic

<22:12>  MBZ (SS33, SS36, SS32)
Must be zero.

<11>  FORCE P1 REVERSAL ON SBI
SS24 FRC P1 REV ON SBI
When set, this bit will cause the SBIA to transmit bad P1 parity on the SBI for all SBIA to SBI transactions. This includes CPU read/write and DMA read data.

<10:09>  MBZ (SS32)
Must be zero.

<08>  FORCE READ DATA TIMEOUT
SS24 FORCE TIMEOUT
This bit will preset the state machine timeout counter to all ones when the state machine enters the read wait start state. The timer will expire on the first count, generating a timeout condition while waiting for CPU read data.

<07:05>  MBZ (SS32)
Must be zero.

<04>  FORCE SBI INTERRUPT REQUEST
SS24 MAINT REQ ENA
When set, this bit will enable SBI Silo Comparator Register <07:04> and <03> to force interrupt requests and ALERT on the SBI.

<03>  FORCE TR SEQUENCE
SS24 FORCE TR SEQ
This bit will enable SBI Silo Comparator Register <15:00> to assert TR <15:00> on the SBI when a CPU Command/Address is transmitted. It is used in conjunction with a loop back read to test the SBIA and SBI nexus arbitration logic.

<02>  FORCE MAINTENANCE TR
SS24 FORCE MAINT TR
This bit will unconditionally assert the TR corresponding to SBI Silo Comparator Register <15:00>.

<01>  FORCE INTERRUPT SUMMARY READ DATA
SS24 FORCE ISR DATA
This bit is used to enable the SBIA to respond to an interrupt summary read to check out the circuitry that prioritizes the ISR data and generates the vectors. During the response cycle of an interrupt summary read, the SBIA will enable the write data latch to be transmitted on the SBI along with a TAG, MASK, and ID of zero.

<00>  USE MAINTENANCE ID
SS24 USE MAINT ID
This bit will enable the use of the maintenance ID, bits <27:23>, for diagnostic purposes.

SBIA SBI SILO COMPARE (SILCOMP)
SBI SILO COMPARE REGISTER   2008 0040, 2208 0040

NOTE: BITS <15:00> WRITE ONLY/READ AS ZEROS.

<31>  COMPARATOR SILO LOCK
SS21 CMP SILO LOCK
The Comparator Silo Lock bit is set when the silo counter has reached a count of F. When this bit sets, the CPU will be interrupted if bit <30> is set. This bit is cleared when the CPU loads the silo count field with a count other than F.

<30>  SILO LOCK INTERRUPT ENABLE
SS25 SILO LOCK INTR EN
The CPU sets this bit to enable an interrupt when bit <31>, CMP SILO LOCK, sets.

<29>  LOCK UNCONDITIONAL
SS25 LOCK UNCOND
When this bit is set, the silo counter will count on each SBI cycle. It will cause a silo lock within 16 SBI cycles, depending upon the count that had been previously loaded into the silo count field.

<28:27>  CONDITIONAL LOCK CODE <01:00>
SS25 COND LOCK CODE <01:00>
These two bits determine the comparisons that will enable counting the silo counter to achieve a silo lock. If the SBI data matches the silo comparator bits, for the enabled comparison, the counter is incremented. The conditions are as follows:

1. 00: No compare (no comparison is made)
2. 01: SBI ID
3. 10: SBI ID and SBI TAG
4. 11: SBI ID and SBI TAG and SBI COMMAND/MASK

<26:23>  COMPARATOR COMMAND/MASK
SS25 COMP CMD/MSK <03:00>
These bits provide the base for the silo comparison, when it is enabled to compare the command/mask. If the SBI tag is 011, command/address, then this field is compared with SBI B <31:28>, the SBI function. If the SBI tag is other than 011, this field is compared to the SBI mask bits.

<22:20>  COMPARATOR TAG
SS25 COMP TAG <02:00>
These bits provide the base for the silo comparison, when it is enabled to compare the tag. This field is compared with SBI TAG <02:00>.
<19:16>  COUNT FIELD
SS19 COUNT FIELD
The CPU loads the silo counter with the two's complement of the number of SBI cycles to be loaded into the silo after a comparison is made. When the count reaches F, the silo is locked. The counter is also enabled by the Lock Unconditional bit <29>.

<15:00>  MAINTENANCE TRANSFER REQUEST <15:00>
SS25 MAINT TR <15:00>
These bits provide the means to simulate SBI transfer requests, SBI interrupt requests, and SBI alert for diagnosing the interrupt logic and SBI priority arbitration logic. They also provide a means of testing the lower 16 bits of the silo. They are controlled by SBI Maintenance Register bits <04:02> as follows:

1. The asserted MAINT TR <07:04> bit will cause the corresponding SBI REQ <07:04> bit to be asserted if SBI Maintenance Register <04>, MAINT REQ ENA, is set.
2. If MAINT TR <03> and SBI Maintenance Register <04> are both set, SBI ALERT will be asserted.
3. If SBI Maintenance Register <03> is set, the asserted MAINT TR <15:00> will cause the corresponding SBI TR <15:00> to be asserted when a CPU command/address is transmitted on the SBI. See SBI Maintenance Register bit <03>.
4. MAINT TR <15:00> will cause the corresponding SBI TR <15:00> to be asserted if SBI Maintenance Register bit <02>, FORCE MAINT TR, is set.

SBIA SBI FAULT STATUS REGISTER (SBISTS)
SBI FAULT/STATUS REGISTER   2008 003C, 2208 003C

NOTE: BITS <31:20> AND <17:00> READ ONLY.

The SBI fault/status register saves information on SBI faults on transactions in which the SBIA may or may not have been involved. All SBI devices monitor the SBI every cycle and check for errors. If an error is detected, the Fault wire is asserted and a fault/status register (part of the configuration register for non-CPU devices) is locked.

<31>  SBI PARITY FAULT
SS11 FAULT REG B <31>
This bit will set if the SBIA detects an SBI parity error on received SBI information. Bits <23:22> will indicate whether it was an address/data or control parity error. The register is written every SBI cycle, and locked when SBI FAULT is asserted. This bit will be cleared when SBI FAULT is deasserted. This bit is only valid if bit <19> is set.

<30>  WRITE SEQUENCE FAULT
SS11 FAULT REG B <30>
This bit is set if the SBIA is expecting write data, and receives SBI information, with no parity error, but the tag does not indicate write data (101). This bit is also locked when SBI FAULT is asserted, and clears when SBI FAULT clears. This bit is only valid if bit <19> is set.

<29>  UNEXPECTED READ DATA
SS11 FAULT REG B <29>
This bit sets if the SBIA receives information with the SBIA ID (10000) with a read data tag (000), but the SBIA is not expecting read data (no read pending). This bit is locked when SBI FAULT is asserted, and clears when SBI FAULT clears. This bit is only valid if bit <19> is set.

<28>  INTERLOCK SEQUENCE FAULT
SS11 FAULT REG B <28>
This bit is set if the SBIA receives a valid command/address for an interlock write masked but the interlock flip-flop is not set (an interlock read has not occurred). This bit is also locked when SBI FAULT is asserted, and clears when SBI FAULT clears. This bit is only valid if bit <19> is set.

<27>  MULTIPLE TRANSMITTER FAULT
SS11 FAULT REG B <27>
This bit will set if the SBIA detects an ID that is not the same as the ID it transmitted on the SBI. This bit is also locked when SBI FAULT is asserted, and clears when SBI FAULT clears. This bit is only valid if bit <19> is set.

<26>  SBI TRANSMITTER DURING FAULT
SS11 FAULT REG B <26>
This bit sets when the SBIA was the nexus transmitting on the SBI. This bit is locked when SBI FAULT is asserted, and clears when SBI FAULT clears. This bit is only valid if bit <19> is set.

<25:24>  MBZ (SS33)
Must be zero.

<23>  SBI P1 PARITY ERROR
SS11 FAULT REG B <23>
This bit indicates that an SBI parity error was over SBI B <31:00>. It is only valid if bit <19> is set. It is locked when SBI FAULT is asserted, and will clear when SBI FAULT clears.

<22>  SBI P0 PARITY ERROR
SS11 FAULT REG B <22>
This bit indicates that an SBI parity error was over SBI TAG <02:00>, SBI ID <04:00>, and SBI MASK <03:00>. It, too, is only valid if bit <19> is set, and is locked when SBI FAULT is set. It will clear when SBI FAULT clears.

<21:20>  MBZ (SS33)
Must be zero.

<19>  FAULT LATCH
BUS REG D <19> (SS33)
If an SBI nexus, including the SBIA, detects an SBI fault, the nexus will assert SBI FAULT. The SBIA, upon reception of SBI FAULT, will set the fault latch, which will keep SBI FAULT asserted. It will remain asserted until the CPU clears the fault latch by writing a one to bit <19>. SBI FAULT will be asserted for the following SBI error conditions:

1. Interlock sequence fault
2. Unexpected read fault
3. Write sequence fault
4. Multiple transmitter fault
5. Parity fault

When this bit sets, Fault/Status Register <31:26> and <23:22> will be locked.

<18>  FAULT INTERRUPT ENABLE
SS21 FAULT INTR ENA
The CPU will set this bit by writing bit <17> to enable an SBI fault to generate an interrupt. The interrupt will be asserted if the fault latch, bit <19>, is set.

<17>  SBI FAULT WIRE
SS33 BUS REG D <17>
This bit indicates the state of the SBI FAULT signal.

<16>  FAULT SILO LOCK
SS33 BUS REG D <16>
This bit will be set when the silo locks due to an SBI fault. It will be reset when the CPU resets the fault latch, bit <19>.

<15:00>  MBZ (SS36)
Must be zero.

SBIA CONFIGURATION REGISTER
CONFIGURATION REGISTER

NOTE: BITS <07:00> READ ONLY.

<31:30>  MBZ (SS28)
Must be zero.

<29:20>  MEMORY SEPARATOR
SS28 MSR <27:18>
This field defines the memory address boundary and is equal to the number of megabytes of memory addressable over the ABus. If bit 29 is asserted, there are 512 MBytes of memory and bits <28:20> are disregarded when the hardware checks the DMA address. These bits are bits <29:20> in the Memory Separator register, but within the SBIA they are shifted right by two bits to match the physical address.

<19:08>  MBZ (SS36)
Must be zero.

<07:04>  ABUS ADAPTER TYPE
BUS REG D<07:04> (SS28)
These bits identify the type of ABus adapter, 0001 for the SBIA.

<03:00>  ABUS ADAPTER REVISION
BUS REG D<03:00> (SS28)
These bits identify the revision of the ABus adapter; these bits are hardwired.
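The configuration register fields can be pulled apart with a few shifts and masks. The Python sketch below is only an illustration of the field descriptions above: the function name and dictionary keys are hypothetical, and the handling of bit 29 simply follows the statement that 512 MBytes are assumed and bits <28:20> are disregarded when that bit is asserted.

```python
def decode_sbia_config(word):
    """Decode an SBIA configuration register value (illustrative sketch only)."""
    word &= 0xFFFFFFFF
    separator = (word >> 20) & 0x3FF        # <29:20> MEMORY SEPARATOR (10 bits)
    if separator & 0x200:                   # register bit 29 asserted
        memory_mbytes = 512                 # bits <28:20> are disregarded
    else:
        memory_mbytes = separator           # field equals the number of megabytes
    adapter_type = (word >> 4) & 0xF        # <07:04> ABUS ADAPTER TYPE
    revision = word & 0xF                   # <03:00> ABUS ADAPTER REVISION
    return {
        "memory_mbytes": memory_mbytes,
        "adapter_type": adapter_type,
        "is_sbia": adapter_type == 0b0001,  # 0001 identifies the SBIA
        "revision": revision,
    }


if __name__ == "__main__":
    # Hypothetical value: 64 MBytes, adapter type 0001, revision 3.
    print(decode_sbia_config((64 << 20) | (0b0001 << 4) | 3))
```

The MBZ fields <31:30> and <19:08> are ignored here; a stricter decoder might flag a non-zero value in those positions.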
SBIA CONTROL STATUS REGISTER (CSR)
CONTROL AND STATUS REGISTER   2008 0004, 2208 0004

NOTE: BITS <28:00> READ ONLY AS ZEROS.

<31>  MASTER INTERRUPT ENABLE
SS29 MSTR INTR ENA
When set, this bit will enable the SBA module to prioritize interrupts and generate the appropriate interrupt priority level for CPU polling.

<30>  ENA SBI CYCLES OUT
SS29 ENA SBI OUT
This bit must be set for normal operation. It enables the CPU to access SBI NEXUS registers. If the CPU attempts to access an SBI NEXUS register with this bit reset, it is an error condition, and ERROR SUMMARY register <20> and <19> will be set. See description of ERROR SUMMARY register.

<29>  ENA SBI CYCLES IN
SS29 ENA SBI IN
This bit must also be set for normal operation. It enables all DMA activity through the SBIA. If this bit is not set, the SBIA will not recognize SBI function codes and will not respond to SBI commands (SBI confirmation is 00, no response).

<28>  ZERO (SS33)
This read only bit will always be zero.

<27:24>  CPU TR SELECT <08:04>
CPU TR SEL <08:04> (SS07)
These bits provide backplane visibility of the jumpers used to select the SBI TR for CPU transactions. The contents of this field is the two's complement of the TR level.

<23:00>  ZERO (SS36)
These bits are always read as zero provided by the zero fill logic.

SBIA ERROR SUMMARY (ERRSUM)
ERROR SUMMARY REGISTER   2008 0008, 2208 0008

NOTE: BITS <31:24> AND <22:19> AND <17:16> READ ONLY AS ZEROS.

The error summary register contains most of the information about errors which have been detected by the SBIA on transactions involving the SBIA.

<31:28>  COMMAND <03:00>
BUS REG D<31:28> (SS26)
These bits represent the ABus command bits for a CPU I/O register read/write. They are loaded each time the command/address latch is loaded, and are latched by CPU ERROR LOCK, bit <23>.

<27:26>  LENGTH/STATUS <01:00>
BUS REG <27:26> (SS26)
These bits represent the ABus data length for a CPU I/O register read/write. They are also loaded each time the command/address latch is loaded, and are latched by CPU ERROR LOCK, bit <23>.

<25:24>  MBZ
Must be zero.

<23>  CPU BUFFER ERROR LOCK
SS37 CPU ERROR LOCK
This bit will be asserted for any of the following errors on a CPU I/O read/write.

1. A/D Parity Error (bit <22>)
2. Control Parity Error (bit <21>)
3. Address Error (bit <20>)
4. CPU read/write timeout on SBI (SBI Error Register bit <12>)
5. SBI Error (SBI Error Register <08>)

If this bit is set, Error Summary Register <31:26> and the Timeout Address Register are latched. If clear, these bits will represent the most recent transaction. Writing this bit will clear Error Summary Register <22:19,16>.

<22>  CPU A/D PARITY ERROR
SBAN A/D PTY BAD
This bit is set if a parity error is detected on the address/data bits of the command/address or write data for a CPU I/O read/write. If the error is detected on the command/address, bit <19> will also be set. Parity is checked on the output of the file data latch. If this bit sets, bit <23> is set. This bit is cleared when the CPU writes bit <23>.

<21>  CPU CONTROL PARITY ERROR
SBAN CNTRL PTY BAD
This bit is set if a parity error is detected on the control field of the command/address or write data for a CPU I/O read/write. If the error is detected on the command/address cycle, bit <19> will also be set. Parity is checked on the output of the file data latch. If this bit sets, bit <23> is also set. This bit is cleared when the CPU writes bit <23>.

<20>  CPU ADDRESS ERROR
SS38 LOCAL ADR ERR
This bit is set if the CPU accesses a nonexistent SBIA register or when an SBI NEXUS register is accessed when Control/Status Register <30> is clear (CPU access to the SBI is disabled). When it sets, it will set bit <23>, and it is cleared when the CPU writes bit <23>. This error will be detected when the command/address word is available, so bit <19> should also be set.

<19>  ERROR DETECTED ON COMMAND/ADDRESS CYCLE
SBAN ERR ON C/A
This read only bit is set if an address/data, control parity, or address error is detected on the command/address cycle. The setting of this bit will set bit <23>. This bit will be reset when the CPU writes a one to bit <23>.

<18>  STATE MACHINE PARITY ERROR
SBAO FORCE PARITY TRAP
This error bit will be set if the state machine microword does not contain the expected parity. The occurrence of this error will cause a CPU transaction to be aborted, if one is in progress, and generate an interrupt. A state machine parity error can occur even if no CPU transaction is in progress, so it will not set bit <23>. This bit is reset [...].

<17>  MBZ (SS33)
Must be zero.

<16>  MULTIPLE CPU ERROR
SBAN MULT CPU ERR
This bit can only be set if bit <23> is already set and a CPU addressing error is detected on the command/address cycle or there is an address/data or control parity error on the command/address or write data. This bit will not be set for a write data parity error for the transaction that sets bit <23>, but for a subsequent transaction. Bit <16> is reset when the CPU writes bit <23>.

<15>  MBZ (SS32)
Must be zero.

<14>  DMAC TRANSACTION BUFFER SBIA DETECTED A/D PARITY ERROR
SS30 DMAC A/D ERROR
This bit will be set for a data parity error when the read data is being transferred from transaction buffer C to the SBI during a DMA read. This bit cannot be set if bits <13> or <12> have been previously set. This bit is cleared by the CPU writing it. The DMAC Command/Address Register and DMAC ID Register will be locked if this bit sets. The setting of this bit will generate a local interrupt.

<13>  DMAC TRANSACTION BUFFER SBIA DETECTED CONTROL PARITY ERROR
SS30 DMAC CONTROL ERROR
This bit will be set for a control parity error when the read data is being transferred from transaction buffer C to the SBI during a DMA read. This bit cannot be set if bits <14> or <12> have been previously set. This bit is cleared by the CPU writing it. The DMAC Command/Address Register and DMAC ID Register will be locked if this bit sets. The setting of this bit will generate a local interrupt.

<12>
DMAC TRANSACTION BUFFER MBOX DETECTED ERROR
SS30 DMAC MBOX ERR
This bit will be set if the MBox detects a parity error or NXM on the transfer of command/address from the DMAC transaction buffer. This bit cannot be set if bits <14> or <13> have been previously set. This bit is cleared by the CPU writing it. The DMAC Command/Address Register and DMAC ID Register will be locked if this bit sets. The setting of this bit will generate a local interrupt.

<11>  MBZ
Must be zero.

<10>  DMAB TRANSACTION BUFFER SBIA DETECTED A/D PARITY ERROR
SS30 DMAB A/D ERROR
This bit will be set for a data parity error when the read data is being transferred from transaction buffer B to the SBI during a DMA read. This bit cannot be set if bits <09> or <08> have been previously set. This bit is cleared by the CPU writing it. The DMAB Command/Address Register and DMAB ID Register will be locked if this bit sets. The setting of this bit will generate a local interrupt.

<09>  DMAB TRANSACTION BUFFER SBIA DETECTED CONTROL PARITY ERROR
SS30 DMAB CONTROL ERROR
This bit is set for a control parity error when the read data is being transferred from transaction buffer B to the SBI during a DMA read. This bit cannot be set if bits <10> or <08> have been previously set. This bit is cleared by the CPU writing it. The DMAB Command/Address Register and DMAB ID Register will be locked if this bit sets. The setting of this bit will generate a local interrupt.

<08>  DMAB TRANSACTION BUFFER MBOX DETECTED ERROR
SS30 DMAB MBOX ERR
This bit is set if the MBox detects a parity error or NXM on the transfer of command/address from the DMAB transaction buffer. This bit cannot be set if bits <10> or <09> have been previously set. This bit is cleared by the CPU writing it. The DMAB Command/Address Register and DMAB ID Register will be locked if this bit sets. The setting of this bit will generate a local interrupt.

<07>  MBZ
Must be zero.

<06>  DMAA TRANSACTION BUFFER SBIA DETECTED A/D PARITY ERROR
SS30 DMAA A/D ERROR
This bit is set for a data parity error when the read data is being transferred from transaction buffer A to the SBI during a DMA read. This bit cannot be set if bits <05> or <04> have been previously set. This bit is cleared by the CPU writing it. The DMAA Command/Address Register and DMAA ID Register will be locked if this bit sets. The setting of this bit will generate a local interrupt.

<05>  DMAA TRANSACTION BUFFER SBIA DETECTED CONTROL PARITY ERROR
SS30 DMAA CNTRL ERR
This bit is set for a control parity error when the read data is being transferred from transaction buffer A to the SBI during a DMA read. This bit cannot be set if bits <06> or <04> have been previously set. This bit is cleared by the CPU writing it.
The DMAA Command/Address Register and DMAA ID Register will be locked if this bit sets. The setting of this bit will generate a local interrupt.

<04>  DMAA TRANSACTION BUFFER MBOX DETECTED ERROR
SS30 DMAA MBOX ERR
This bit is set if the MBox detects a parity error or NXM on the transfer of command/address from the DMAA transaction buffer. This bit cannot be set if bits <06> or <05> have been previously set. This bit is cleared by the CPU writing it. The DMAA Command/Address Register and DMAA ID Register will be locked if this bit sets. The setting of this bit will generate a local interrupt.

<03>  DMAI TRANSACTION BUFFER INTERLOCK TIMEOUT
SS30 DMAI TIMEOUT
This bit is set if an interlock write masked does not occur within 512 SBI cycles (102.4 microseconds) after an interlock read masked. This bit cannot be set if bits <02>, <01>, or <00> have been previously set. The setting of this bit will generate a local interrupt. This bit is cleared by the CPU writing it. The DMAI Command/Address and DMAI ID Register will be locked if this bit sets.

<02>  DMAI TRANSACTION BUFFER SBIA DETECTED A/D PARITY ERROR
SS30 DMAI A/D ERROR
This bit is set for a data parity error when the read data is being transferred from transaction buffer I to the SBI during a DMA interlock read. This bit cannot be set if bits <03>, <01>, or <00> have been previously set. This bit is cleared by the CPU writing it. The DMAI Command/Address and DMAI ID Register will be locked if this bit sets. The setting of this bit will generate a local interrupt.

<01>  DMAI TRANSACTION BUFFER SBIA DETECTED CONTROL PARITY ERROR
SS30 DMAI CNTRL ERROR
This bit is set for a control parity error when the read data is being transferred from transaction buffer I to the SBI during a DMA interlock read. This bit cannot be set if bits <03:02> or <00> have been previously set. This bit is cleared by the CPU writing it. The DMAI Command/Address Register and DMAI ID Register will be locked if this bit sets. The setting of this bit will generate a local interrupt.

<00>  DMAI TRANSACTION BUFFER MBOX DETECTED ERROR
SS30 DMAI MBOX ERR
This bit is set if the MBox detects a parity error or NXM on the transfer of command/address from the DMAI transaction buffer. This bit cannot be set if bits <03>, <02>, or <01> have been previously set. This bit is cleared by the CPU writing it. The DMAI Command/Address Register and DMAI ID Register will be locked if this bit sets. The setting of this bit will generate a local interrupt.

SBIA DIAGNOSTIC CONTROL REGISTER (DIAGCS)
DIAGNOSTIC CONTROL REGISTER

NOTE: BITS <31:20> AND <15:09> READ ONLY AS ZEROS, BITS <08:05> WRITE ONLY.

<31:20>  MBZ (SS36)
Must be zero.

<19:16>  FORCE DMA TRANSACTION BUFFER BUSY
SS29 FORCE DMAC (DMAB, DMAA, DMAI) BUSY
These bits are used to direct DMA traffic into specific DMA transaction buffers by forcing other buffers to be busy.
The state of these bits has no effect on a DMA transaction already in progress.

<15:09> MBZ
(SS32) Must be zero.

<08> CLEAR SILO ADDRESS
SS19 CLR SILO ADR
This bit will clear the silo address upon setting. When this register is read, this bit will always be zero (hardwired).

<07> DISABLE SILO INCREMENT
SS29 DISABLE SILO INC
When set, this bit will prevent the silo address from incrementing. This bit will be reset during normal operations to allow the silo address to increment. This bit is also read as zero.

<06> DIAGNOSTIC DEAD
SS29 DIAG DEAD
When this bit sets, it will simulate ABUS DEAD, interrupting the console and causing a reboot. ABUS DEAD is normally asserted by SBI FAIL.

<05> MBZ
(SS30) Must be zero.

<04> DISABLE SBI TIMEOUT
SS29 DISABLE SBI TMO
This bit is also read as zero. When set for diagnostics, this bit will prevent a timeout condition while waiting for the SBIA to gain control of the SBI, while waiting for an acknowledgment from a NEXUS, or while waiting for CPU read data.

[Figure: DIAGCS Diagnostic Control Register bit layout, addresses 2008 000C and 2208 000C]

<03> FORCE QUADWORD DATA
SS29 FORCE QUAD DATA
This bit is used by microdiagnostics with bit <02> (Loop Back Mode) to provide a way to use a quadclear to loop data back on the SBI. If FORCE QUAD DATA is set, then the CPU will execute a quadclear, but for microdiagnostics, the address will be a memory address instead of an SBI address (bit 27 is clear).

The ABus command/address is the same; it specifies a CPU write to the quadclear register. The ABus write data is the same except for the address.
When the command/address is transmitted on the SBI it will be received by the SBIA, as it always is, but in this case, the address will be for a memory (cache) address, which will be less than the configuration register. To the SBIA it looks like a DMA extended write mask to memory and it will be handled as such.

For a normal quadclear, the write data is forced to all zeros. In this case, FORCE QUAD DATA will enable the A-Data Assembly Multiplexers to transfer the contents of the Write Data Latch (the ABus Write Data, 1011 and the Quadword Boundary Address) to the SBI. When the second write data longword is transferred to the SBI transceivers, bits 30 and 27 will be toggled (set). This will allow the setting of all data bits on the SBI.

<02> LOOP BACK MODE
SS29 LOOP BACK MODE
This bit is used by microdiagnostics to allow a CPU read or write to be looped back in the SBIA. The PAMM has to be configured such that a memory (cache) address is mapped to an I/O adapter, and the same PAMM address, but with bits 27 and 28 inverted, is mapped to a memory address. In the SBIA, LOOP BACK MODE will invert address bits 25 and 26, if bit 27 is reset, which will be the case if the CPU write is to a memory address.

When the CPU writes a memory location that is mapped to an I/O adapter, the MBox will write the command/address and write data into the register file. The SBIA will carry out the command like a normal CPU write. When the command/address is transferred from the command/address latch to the SBI, the longword address bits 25 and 26 will be inverted because LOOP BACK MODE is set and address bit 27 is reset.

The command/address, followed by the write data, will be transmitted on the SBI.
When the SBIA clocks the SBI receivers and looks at the received data, the address is less than the memory separator (in the configuration register), so it will transfer the command/address and write data to the register file and request MBox service. The MBox will write the data into memory because the address, with bits 27 and 28 inverted (bits 25 and 26 in the SBIA), addresses a different PAMM location. This location is mapped to memory.

This diagnostic bit can also be used with a CPU read in a similar manner to further check out the SBIA logic.

<01> FORCE STATE PARITY ERROR
SS29 FORCE STATE PTY
If this bit is set, a state machine parity error will be forced during the CPU ARB WAIT state. Read as zero.

<00> ENABLE SHORT TIMEOUT
SS29 ENA SHORT TIMEOUT
When set, this bit will enable an SBI timeout in 8 SBI cycles instead of the normal 512 cycles.

SBIA DMA COMMAND/ADDRESS REGISTER

Register  Buffer  Addresses
DMAICA    DMAI    2008 0010, 2208 0010
DMAACA    DMAA    2008 0018, 2208 0018
DMABCA    DMAB    2008 0020, 2208 0020
DMACCA    DMAC    2008 0028, 2208 0028

NOTE: ALL BITS READ ONLY

These four registers (DMAACA, DMABCA, DMACCA, and DMAICA) are used to save information about transactions in the DMA buffers. Each time a command/address is loaded into a DMA buffer, a copy of the command/address and SBI ID of the commander is saved in the two registers corresponding to the DMA buffer. If an error is detected by the MBox or the SBIA after the transaction is confirmed, the two registers are locked.

<31:00> RECEIVED SBI COMMAND/ADDRESS
BUS REG <31:00> (SS32, SS33)
Each time a command/address is loaded into a DMA transaction buffer in the DC022, that command/address is also loaded into the corresponding DMA Command/Address error register.
These error registers are actually TTL register files that are addressed by the upper two bits of the DC022 write address. SBI B <31:00> are written in these registers, with bits <31:28> being the SBI command codes and bits <27:00> the longword address. These error registers are locked if the SBIA or MBox detects a DMA error.

SBIA DMA ID REGISTER

Register  Buffer  Addresses
DMAIID    DMAI    2008 0014, 2208 0014
DMAAID    DMAA    2008 001C, 2208 001C
DMABID    DMAB    2008 0024, 2208 0024
DMACID    DMAC    2008 002C, 2208 002C

NOTE: ALL BITS READ ONLY, BITS <31:08> READ AS ZEROS

These four registers (DMAAID, DMABID, DMACID, and DMAIID) are used to save information about transactions in the DMA buffers. Each time a command/address is loaded into a DMA buffer, a copy of the command/address and SBI ID of the commander is saved in the two registers corresponding to the DMA buffer. If an error is detected by the MBox or the SBIA after the transaction is confirmed, the two registers are locked.

<31:08> MBZ
(SS36) Must be zero.

<07:00> RECEIVED SBI IDENTIFICATION
BUS REG <07:00> (SS23)
Each time a command/address is loaded into a DMA transaction buffer in the DC022, the SBI ID is also loaded into the corresponding DMA ID error register, an extension of the DMA command/address error registers. These error registers are TTL register files like the command/address error registers, and addressed the same way, by the upper two bits of the DC022 write address. Bits <07:05> have the inputs at ground potential so they will always be read as zero. Bits <04:00> are loaded with REC SBI ID <04:00>. Like the DMA command/address error registers, these registers are locked if the SBIA or MBox detects a DMA error.
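The field boundaries given above for the DMA Command/Address and DMA ID registers can be illustrated with a short Python sketch. This is not part of the original manual: the function names, dictionary names, and returned keys are hypothetical, and only the first address of each listed pair is used. It assumes a 32-bit register value has already been obtained from a stack frame or error log entry.

```python
# Illustrative sketch only; names are hypothetical, not from the manual.
# Field positions follow the bit descriptions above:
#   DMA Command/Address register: <31:28> SBI command code, <27:00> longword address
#   DMA ID register:              <31:08> MBZ, <07:05> read as zero, <04:00> SBI ID

# First address of each pair listed for the four DMA transaction buffers.
DMA_CA_ADDR = {"DMAI": 0x20080010, "DMAA": 0x20080018,
               "DMAB": 0x20080020, "DMAC": 0x20080028}
DMA_ID_ADDR = {"DMAI": 0x20080014, "DMAA": 0x2008001C,
               "DMAB": 0x20080024, "DMAC": 0x2008002C}


def decode_dma_ca(value):
    """Split a DMA Command/Address register value into its two fields."""
    value &= 0xFFFFFFFF
    return {
        "command": (value >> 28) & 0xF,          # bits <31:28>
        "longword_address": value & 0x0FFFFFFF,  # bits <27:00>
    }


def decode_dma_id(value):
    """Extract the SBI ID and check that bits <31:05> read as zero."""
    value &= 0xFFFFFFFF
    return {
        "sbi_id": value & 0x1F,                      # bits <04:00>
        "upper_bits_zero": (value & 0xFFFFFFE0) == 0,  # bits <31:05>
    }


if __name__ == "__main__":
    ca = decode_dma_ca(0x3ABCDEF0)   # made-up example value
    ident = decode_dma_id(0x12)      # made-up example value
    print(ca["command"], hex(ca["longword_address"]))
    print(ident["sbi_id"], ident["upper_bits_zero"])
```

A nonzero value in bits <31:05> of an ID register read would not match the description above, so the sketch reports it as a flag rather than masking it silently.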