MISC-683DDC45
July 1992
10 pages
Document:
DECchip 21064-AA Microprocessor Product Brief
Order Number:
MISC-683DDC45
Revision:
Pages:
10
DECchip 21064-AA Microprocessor
Product Brief
July, 1992

Features

* Full 64-bit Alpha architecture:
  - Advanced RISC architecture
  - Optimized for high performance implementations
  - Multiprocessor support
  - IEEE single and double precision, VAX F_floating and G_floating, longword and quadword data types
* Ultra-high performance Alpha implementation:
  - Dual-pipelined architecture
  - 150 MHz cycle time
  - Peak instruction execution of 300 million operations per second
* On-chip pipelined floating point unit
* 8K byte data cache
* 8K byte instruction cache
* Cycle counter for code optimization
* External cache memory support:
  - Programmable cache size and speed
  - On-chip external secondary cache control
* On-chip demand paged memory management unit:
  - 12 entry I-stream TB with 8 entries for 8K byte pages and 4 entries for 4M byte pages
  - 32 entry D-stream TB with each entry able to map 8K, 64K, 512K, or 4M byte pages
* Privileged Architecture Library Code (PALcode) supports:
  - Optimization for multiple operating systems
  - Flexible memory management implementations
  - Multi-instruction atomic sequences
* On-chip parity and ECC generators and checkers
* Internal clock generator provides:
  - High-speed chip clock
  - Pair of programmable system clocks (CPU/2 to CPU/8)
* Programmable on-chip performance counters measure CPU and system performance
* On-chip write buffer with four 32-byte entries
* Chip and module level test support
* Selectable data bus width and speed:
  - 64 or 128 bit data width
  - 75 MHz to 18.75 MHz bus speed
* 3.3-volt supply voltage:
  - Lower power
  - Higher reliability
  - Interface to 5-volt logic

Description

Digital's DECchip 21064-AA microprocessor is the first in a family of chips to implement Digital's Alpha architecture. The DECchip 21064-AA microprocessor is a .75 micron CMOS based super-scalar super-pipelined processor using dual instruction issue and a 150 MHz cycle time.
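As a rough check on the clock figures quoted above, the following Python sketch recomputes the peak issue rate, the cycle time, and the system bus range from the 150 MHz chip clock. The function names are illustrative only, and the divisor range of 2 to 8 is an assumption inferred from the 75 MHz and 18.75 MHz bus speeds in the feature list.

```python
# Figures taken from the brief.
CPU_CLOCK_MHZ = 150.0  # chip clock
ISSUE_WIDTH = 2        # dual instruction issue


def cycle_time_ns(clock_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / clock_mhz


def peak_mops(clock_mhz, issue_width):
    """Peak millions of operations per second, assuming every issue slot is filled."""
    return clock_mhz * issue_width


def bus_clock_mhz(clock_mhz, divisor):
    """System bus clock when the chip clock is divided by `divisor` (assumed 2..8)."""
    return clock_mhz / divisor


if __name__ == "__main__":
    print(peak_mops(CPU_CLOCK_MHZ, ISSUE_WIDTH))    # peak rate in millions of operations per second
    print(round(cycle_time_ns(CPU_CLOCK_MHZ), 2))   # cycle time in ns
    print(bus_clock_mhz(CPU_CLOCK_MHZ, 2), bus_clock_mhz(CPU_CLOCK_MHZ, 8))
```

The peak figure is an upper bound: it is reached only when both pipelines issue an instruction on every cycle.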
The Alpha architecture is a 64-bit RISC architecture designed with particular emphasis on speed, multiple instruction issue, multiple processors, and software migration from VAX/VMS and MIPS/ULTRIX operating environments.

IEEE single precision and double precision floating point data types are supported. VAX F_floating and G_floating data types are fully supported, with limited support for the D_floating data type.

DECchip 21064-AA MicroArchitecture

The DECchip 21064-AA microprocessor consists of four independent functional units: the integer execution unit (Ebox), floating point unit (Fbox), the load/store or address unit (Abox) and the branch unit. Other sections include the central control unit (Ibox) and the I and D cache.

Ebox - Contains a 64-bit fully pipelined integer execution data path including: adder, logic box, barrel shifter, byte extract and mask, and independent integer multiplier. The Ebox also contains a 32-entry 64-bit integer register file.

Fbox - Contains a fully pipelined floating point unit and independent divider, supporting both IEEE and VAX floating point data types.

Abox - Contains five major sections: address translation data path, load silo, write buffer, data cache (Dcache) interface, and the external bus interface unit (BIU). The Abox supports all integer and floating point load and store instructions, including address calculation and translation, and cache control logic.

Ibox - Performs instruction fetch, resource checks, and dual instruction issue to the Ebox, Abox, Fbox, or branch unit. In addition, the Ibox controls pipeline stalls, aborts and restarts.
[Figure: block diagram showing the ICACHE (tag, data, branch history table), IBOX (prefetch, conflict, pipeline control), EBOX (multiplier, adder, logic box, IRF), FBOX (multiplier/divider, adder, FRF), ABOX (address generator, ITB, DTB, write buffer, load silo), DCACHE (tag, data), and BIU with the address bus, the data bus (128 bits), and external cache control.]

Pipeline Organization

The DECchip 21064-AA microprocessor uses a seven stage pipeline for integer operate and memory reference instructions, and a ten stage pipeline for floating point operate instructions. The Ibox maintains state for all pipeline stages to track outstanding register writes.

Cache Organization

The DECchip 21064-AA microprocessor contains two on-chip caches, data cache (Dcache) and instruction cache (Icache). The chip also supports an external cache.

Dcache - Contains 8K bytes and is a write through, direct mapped, read-allocate physical cache with 32-byte blocks.

Icache - Contains 8K bytes and is a physical direct-mapped cache with 32-byte blocks.

External Cache - The DECchip 21064-AA supports external cache built from off-the-shelf static RAMs. The DECchip 21064-AA directly controls the RAMs using its programmable external cache interface, allowing each implementation to make its own external cache speed and configuration trade-offs. The external cache interface supports cache sizes from 0 to 8M bytes and a range of operating speeds which are sub-multiples of the chip clock.

Virtual Address Space

The virtual address is a 64-bit unsigned integer that specifies a byte location within the virtual address space. The DECchip 21064-AA microprocessor checks all 64 bits of a virtual address and implements a 43-bit subset of the address space. The DECchip 21064-AA supports a physical address space of 16G bytes.
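To make the cache geometry concrete, here is a minimal Python sketch of how a physical address could be split into tag, index, and block offset for an 8K byte direct-mapped cache with 32-byte blocks. The brief gives only the sizes; the helper name, the constant names, and the plain power-of-two split are assumptions made for illustration, not a description of the chip's internal logic.

```python
CACHE_BYTES = 8 * 1024  # 8K byte on-chip cache (Dcache or Icache)
BLOCK_BYTES = 32        # 32-byte blocks
PHYS_ADDR_BITS = 34     # 2**34 bytes = 16G bytes of physical address space

NUM_BLOCKS = CACHE_BYTES // BLOCK_BYTES    # number of blocks in the cache
OFFSET_BITS = BLOCK_BYTES.bit_length() - 1 # bits selecting a byte within a block
INDEX_BITS = NUM_BLOCKS.bit_length() - 1   # bits selecting a block


def split_physical_address(addr):
    """Return (tag, index, offset) for a direct-mapped lookup."""
    offset = addr & (BLOCK_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_BLOCKS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


if __name__ == "__main__":
    print(NUM_BLOCKS, OFFSET_BITS, INDEX_BITS)
    print(split_physical_address(0x12345))
    # Two addresses exactly one cache size apart share an index and differ in tag.
    print(split_physical_address(0x12345 + CACHE_BYTES))
```

Under these assumptions the index occupies address bits 12:5, which is consistent with the 8-bit Dcache invalidate address listed in the signal table.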
Characteristics

Power Supply                                Vss 0.0 V, Vdd 3.3 V ±5%
Operating Temperature                       0°C to 70°C
  (with proper heatsink and airflow)
Storage Temperature Range                   -55°C to 125°C
Power Dissipation @ Vdd = 3.45 V,           23 W typical, 27.5 W maximum
  Speed = 6.6 ns

Alpha Architecture Summary

The DECchip 21064-AA microprocessor implements the Alpha architecture. The Alpha architecture supports:

* A fixed 32-bit instruction size
* Separate integer and floating point registers
  - 32 64-bit integer registers
  - 32 64-bit floating point registers
* 32-bit (longword) and 64-bit (quadword) integer along with 32-bit and 64-bit IEEE and VAX floating-point data types
* Memory access using a 64-bit virtual byte address
* Privileged Architecture Library Code (PALcode)

Instruction Set

Instructions are all 32 bits in length using four different instruction formats specifying 0, 1, 2, or 3 5-bit register fields. Each format uses a 6-bit opcode.

Branch Instructions - Conditional branch instructions test a register for positive/negative, zero/nonzero, or even/odd, and perform a PC relative branch. Unconditional branch instructions perform either a PC relative or absolute jump using an arbitrary 64-bit register value. They can update a destination register with a return address.

Load/Store Instructions - can move either 32-bit or 64-bit quantities. 8-bit and 16-bit load/store operations are supported through an extensive set of in-register byte manipulations.

Integer Operate Instructions - manipulate full 64-bit values, and include a full complement of arithmetic, compare, logical, and shift instructions. In addition there are three 32-bit integer operates: add, subtract, and multiply.
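The four instruction formats described under Instruction Set can be illustrated with a small field extractor. This is a sketch in Python, not the decoder used by any Digital tool. The brief states only the field widths (a 6-bit opcode and 5-bit register fields); the bit positions used here follow the published Alpha instruction formats as generally documented and should be treated as assumptions, as should the function and key names.

```python
def bits(word, high, low):
    """Extract bits high..low (inclusive) from a 32-bit instruction word."""
    return (word >> low) & ((1 << (high - low + 1)) - 1)


def sign_extend(value, width):
    """Interpret a `width`-bit field as a two's complement signed integer."""
    sign = 1 << (width - 1)
    return (value ^ sign) - sign


def decode_fields(word):
    """Return every raw field; which ones are meaningful depends on the format."""
    return {
        "opcode": bits(word, 31, 26),                       # all formats
        "ra": bits(word, 25, 21),                           # Branch, Memory, Operate
        "rb": bits(word, 20, 16),                           # Memory, Operate
        "rc": bits(word, 4, 0),                             # Operate
        "mem_disp": sign_extend(bits(word, 15, 0), 16),     # Memory
        "branch_disp": sign_extend(bits(word, 20, 0), 21),  # Branch
        "pal_number": bits(word, 25, 0),                    # CALL_PAL
    }


if __name__ == "__main__":
    # Hypothetical memory-format word: opcode 0x29, RA = 1, RB = 2, displacement = -8.
    word = (0x29 << 26) | (1 << 21) | (2 << 16) | 0xFFF8
    print(decode_fields(word))
```

Returning all fields at once avoids needing an opcode table; a fuller decoder would first map the opcode to one of the four formats and then read only the fields that format defines.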
In addition to the operations of conventional RISC architectures, the Alpha architecture provides scaled add/subtract for quick subscript calculation, 128-bit multiply for division by a constant and multi-precision arithmetic, conditional moves for avoiding branches, and an extensive set of in-register byte manipulation instructions.

Instruction formats:

  OP | Number                         CALL_PAL
  OP | RA | Displacement              Branch
  OP | RA | RB | Displacement         Memory
  OP | RA | RB | Function | RC        Operate

Floating-Point Operate Instructions - include four complete sets of instructions for IEEE single, IEEE double, VAX F_floating and VAX G_floating arithmetic. In addition to arithmetic instructions there are also instructions for conversions between floating and integer values including the VAX D_floating data type.

CALL_PAL Instructions - vector to a privileged library of software that atomically performs both privileged and unprivileged functions.

Memory Management

The Alpha memory management architecture is designed to provide:

* A large address space for instructions and data
* Convenient and efficient sharing of instructions and data
* Independent read and write access protection
* Flexibility through programmable PALcode support

Privileged Architecture Library Code

PALcode is a privileged library of software that atomically performs such functions as the dispatching and servicing of interrupts, exceptions, task switching, and additional privileged and unprivileged user instructions as specified by operating systems using the CALL_PAL instruction. PALcode is the only method of performing some operations on the hardware. In addition to the entire instruction set, a set of implementation specific instructions is provided. PALcode runs in an environment with privileges enabled, instruction stream mapping disabled, and interrupts disabled. Disabling memory mapping allows PALcode to support functions such as TB miss routines.
Disabling interrupts allows the instruction stream to provide multi-instruction sequences as atomic operations.

Alpha Architecture Compared to Conventional RISC Architecture

The Alpha architecture is different from conventional RISC architectures in a number of ways:

Feature - Difference

64-Bit Architecture - True 64-bit architecture with 64-bit data and address. Not a 32-bit architecture that was later expanded to 64 bits.

High Speed - The Alpha architecture was designed to allow very high-speed implementations. Simple instructions make it particularly easy to build implementations that issue multiple instructions every CPU cycle. There are no implementation specific pipeline timing hazards, no load delay slots, and no branch delay slots.

Multiprocessor Support - The Alpha architecture does not enforce strict read/write ordering between multiple processors. This allows multiprocessor implementations to easily use features such as: multibank caches, bypassed write buffers, write merging, and pipelined writes with retry on error. To maintain strict ordering between accesses as seen by a second processor, memory barrier instructions can be explicitly inserted in the program. The basic multiprocessor interlocking primitive is a RISC style load_locked, modify, store_conditional sequence. If the sequence runs without interrupt, exception, or an interfering write from another processor, the store succeeds. Otherwise, the store fails and the program eventually must branch back and retry the sequence.

Multiple Operating Systems - The Alpha architecture provides flexibility by allowing the user to implement a privileged library of software for operating system specific operations. This allows Alpha to run full VMS using one version of this software library that mirrors many of the VAX operating system features, and to run OSF/1 using a different version that mirrors many of the MIPS operating system features.
Additional operating system implementations can be efficiently supported.

Byte Manipulation - The Alpha architecture is unconventional in the approach to byte manipulation. Byte loads, stores, and operations are done with normal 64-bit instructions, crafted to keep the sequences short. Single-byte stores found in conventional RISC architectures force cache and memory implementations to include hardware byte operations and implement read-modify-write cycles, which can complicate system design and reduce performance.

Arithmetic Traps - In contrast to conventional RISC architectures, the reporting of Alpha architecture arithmetic traps (overflow, underflow, and others) is imprecise. This removes architectural bottlenecks that affect performance. If precise arithmetic exceptions are desired, trap barrier instructions can be explicitly inserted in the program to force traps to be delivered at specific points.

Alpha architecture includes a number of implementation-specific HINTS aimed at allowing higher performance. Software is able to provide HINTS to the hardware that enable the hardware to optimize its operation. HINTS can help improve the utilization of the pipeline, cache memory, and translation lookaside buffers.
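The byte manipulation approach described above can be sketched in Python: a single-byte store is built from a 64-bit load, an in-register mask and insert, and a 64-bit store. This is an illustration of the idea only. The helper names are hypothetical and are not Alpha instructions, memory is modelled as a list of 64-bit quadwords, and little-endian byte numbering within a quadword is an assumption.

```python
MASK64 = (1 << 64) - 1


def extract_byte(quad, byte_index):
    """Return byte `byte_index` (0 = least significant) of a 64-bit value."""
    return (quad >> (8 * byte_index)) & 0xFF


def mask_byte(quad, byte_index):
    """Clear byte `byte_index` of a 64-bit value, leaving the other bytes intact."""
    return quad & ~(0xFF << (8 * byte_index)) & MASK64


def insert_byte(value, byte_index):
    """Place the low byte of `value` at position `byte_index` in a zeroed quadword."""
    return ((value & 0xFF) << (8 * byte_index)) & MASK64


def store_byte(memory, byte_addr, value):
    """Byte store built from a quadword load, in-register edit, and quadword store."""
    quad_index, byte_index = divmod(byte_addr, 8)
    quad = memory[quad_index]                                           # 64-bit load
    quad = mask_byte(quad, byte_index) | insert_byte(value, byte_index) # edit in register
    memory[quad_index] = quad                                           # 64-bit store


if __name__ == "__main__":
    memory = [0x1122334455667788]
    store_byte(memory, 2, 0xAB)
    print(hex(memory[0]))
```

Because memory only ever sees whole-quadword accesses in this model, the cache and memory system need no byte-enable logic, which is the trade-off the brief describes.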
Signals

Name                 Type           Function
adr_h 33:5           Input/Output   Address bus
data_h 127:0         Input/Output   Data bus
check_h 27:0         Input/Output   Check bit bus
dOE_l                Input          Data bus output enable
dWSel_h 1:0          Input          Data bus write data select
dRAck_h 2:0          Input          Data bus data acknowledge
tagCEOE_h            Output         External cache RAM tagCtl, tagAdr CE/OE
tagCtlWE_h           Output         External cache RAM tagCtl WE
tagCtlV_h            Input/Output   Tag valid
tagCtlS_h            Input/Output   Tag shared
tagCtlD_h            Input/Output   Tag dirty
tagCtlP_h            Input/Output   Tag V/S/D parity
tagAdr_h 33:17       Input          Tag address
tagAdrP_h            Input          Tag address parity
tagOK_h,_l           Input          Tag access from CPU is ok
tagEq_l              Output         Tag compare output
dataCEOE_h 3:0       Output         External cache RAM data CE/OE, longword
dataWE_h 3:0         Output         External cache RAM data WE, longword
dataA_h 4:3          Output         External cache RAM data A 4:3
holdReq_h            Input          Hold request
holdAck_h            Output         Hold acknowledge
cReq_h 2:0           Output         Cycle request
cWMask_h 7:0         Output         Cycle write mask
cAck_h 2:0           Input          Cycle acknowledge
iAdr_h 12:5          Input          Invalidate address, Dcache
dInvReq_h            Input          Invalidate request, Dcache
dMapWE_h             Output         External Dcache duplicate tag RAM WE
irq_h 5:0            Input          Interrupt request
sRomOE_l             Output         Serial ROM output enable
sRomD_h              Input          Serial ROM data/Rx data
sRomClk_h            Output         Serial ROM clock/Tx data
vRef                 Input          Input reference
eclOut_h             Input          Output mode selection
perf_cnt_h 1:0       Input          Performance counter inputs
threestate_l         Input          Three state for testing
icMode_h 1:0         Input          Icache Test Mode Selection
cont_l               Input          Continuity for testing
clkIn_h,_l           Input          Clock input
testClkIn_h,_l       Input          Clock input for testing
cpuClkOut_h          Output         CPU clock output
sysClkOut1_h,_l      Output         System clock output, normal
sysClkOut2_h,_l      Output         System clock output, delayed
dcOk_h               Input          Power and clocks ok
reset_l              Input          Reset

Packaging

431 Pin Grid Array
[Figure: 21064-AA Top View (Pin Down), pin columns numbered 24 through 01]

[Figure: Package Dimensions, with labeled dimensions 2.400, 2.400, and .461]

Information

For more information on Digital's DECchip 21064-AA Microprocessor call:

1-800-DEC-2717
1-800-DEC-2515 TTY

Orders may be placed through Digital's Technical OEM (TOEM) Sales Representatives. Call your local Digital Sales Office for details.

The information in this document is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation. Digital Equipment Corporation assumes no responsibility for any errors that may appear in this document.

Copyright © Digital Equipment Corporation 1992. All Rights Reserved. Printed in U.S.A.

The following are trademarks of Digital Equipment Corporation: 21064, Digital, ULTRIX, VAX, VMS, and the Digital logo. OSF and OSF/1 are trademarks of the Open Software Foundation, Inc.