AA-M400A-TK
May 1983
292 pages
Document:
Common Math Library Reference Manual Sep83
Order Number:
AA-M400A-TK
Revision:
0
Pages:
292
Original Filename:
AA-M400A-TK_Common_Math_Library_Reference_Manual_Sep83.pdf
TOPS-10/TOPS-20
Common Math Library Reference Manual

Order No. AA-M400A-TK

September 1983

Abstract

This manual describes the mathematical routines that constitute the TOPS-10/TOPS-20 Math Library.

OPERATING SYSTEM:  TOPS-20 Version 5.0 and 5.1
                   TOPS-10 Version 7.01A
SOFTWARE:          FORTRAN-10/20 Version 7
                   Pascal-10/20 Version 1

Software and manuals should be ordered by title and order number. In the United States, send orders to the nearest distribution center. Outside the United States, orders should be directed to the nearest DIGITAL Field Sales Office or representative.

Northeast/Mid-Atlantic Region
Digital Equipment Corporation
PO Box CS2008
Nashua, New Hampshire 03061
Telephone: (603)884-6660

Central Region
Digital Equipment Corporation
Accessories and Supplies Center
1050 East Remington Road
Schaumburg, Illinois 60195
Telephone: (312)640-5612

Western Region
Digital Equipment Corporation
Accessories and Supplies Center
632 Caribbean Drive
Sunnyvale, California 94086
Telephone: (408)734-4915

digital equipment corporation, marlboro, massachusetts

First Printing, September 1983

The information in this document is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation. Digital Equipment Corporation assumes no responsibility for any errors that may appear in this document.

The software described in this document is furnished under a license and may only be used or copied in accordance with the terms of such license.

No responsibility is assumed for use or reliability of software on equipment that is not supplied by DIGITAL or its affiliated companies.

Copyright © 1983 by Digital Equipment Corporation
All Rights Reserved

The postage-prepaid READER'S COMMENTS form on the last page of this document requests the user's critical evaluation to assist us in preparing future documentation.
The following are trademarks of Digital Equipment Corporation:

DEC, DECUS, Digital Logo, PDP, UNIBUS, VAX, DECnet, DECSYSTEM-10, DECSYSTEM-20, DECwriter, DIBOL, EduSystem, IAS, MASSBUS, PDT, RSTS, RSX, VMS, VT

Contents

Chapter 1 Introduction
  1.1 The Math Library
  1.2 Math Symbols and Names Used in Equations
  1.3 Data Types and Their Precision
    1.3.1 Integer
    1.3.2 Single-Precision, Floating-Point
    1.3.3 Double-Precision, D-Floating-Point
    1.3.4 Double-Precision, G-Floating-Point
    1.3.5 Complex
    1.3.6 Complex, Double-Precision
  1.4 Information About the Routines
    1.4.1 Calling Sequence
    1.4.2 Entry Points
    1.4.3 Return Location
    1.4.4 Register Usage
  1.5 Accuracy Tests

Chapter 2 Square Root Routines
  SQRT, DSQRT, GSQRT, CSQRT, CDSQRT, CGSQRT

Chapter 3 Logarithm Routines
  ALOG, ALOG10, DLOG, DLOG10, GLOG, GLOG10, CLOG, CDLOG, CGLOG

Chapter 4 Exponential and Exponentiation Routines
  EXP, DEXP, GEXP, CEXP, CDEXP, CGEXP, EXP1., EXP2., DEXP2., GEXP2., CEXP2., EXP3., DEXP3., GEXP3., CEXP3.

Chapter 5 Trigonometric Routines
  SIN, SIND, COS, COSD, DSIN, DCOS, GSIN, GCOS, CSIN, CCOS, CDSIN, CDCOS, CGSIN, CGCOS, TAN, COTAN, DTAN, DCOTAN, GTAN, GCOTAN

Chapter 6 Inverse Trigonometric Routines
  ASIN, ACOS, DASIN, DACOS, GASIN, GACOS, ATAN, ATAN2, DATAN, DATAN2, GATAN, GATAN2

Chapter 7 Hyperbolic Routines
  SINH, COSH, DSINH, DCOSH, GSINH, GCOSH, TANH, DTANH, GTANH

Chapter 8 Random Number Generating Routines
  RAN, RANS, SETRAN, SAVRAN

Chapter 9 Absolute Value Routines
  IABS, ABS, DABS, GABS, CABS, CDABS, CGABS

Chapter 10 Data Type Conversion Routines
  IFIX, INT, IDINT, GFX.n, REAL, FLOAT, SNGL, GSN.n, DFLOAT, DBLE, GTOD, GTODA, GFL.n, GDB.n, DTOG, DTOGA, CMPL.I, CMPLX, CMPL.D, CMPL.G, CMPL.C

Chapter 11 Rounding and Truncation Routines
  NINT, IDNINT, IGNIN., ANINT, DNINT, GNINT., AINT, DINT, GINT.

Chapter 12 Product, Remainder, and Positive Difference Routines
  DPROD, GPROD., MOD, AMOD, DMOD, GMOD, IDIM, DIM, DDIM, GDIM

Chapter 13 Transfer of Sign Routines
  ISIGN, SIGN, DSIGN, GSIGN

Chapter 14 Maximum/Minimum Routines
  MAX0, MAX1, AMAX0, AMAX1, DMAX1, GMAX1, MIN0, MIN1, AMIN0, AMIN1, DMIN1, GMIN1

Chapter 15 Miscellaneous Complex Routines
  REAL.C, AIMAG, CONJ, CFM, CFDV

Appendix A ELEFUNT Test Results

Appendix B Using the Common Math Library with MACRO Programs

Tables
  1-1 Math Library Routines
  1-2 Comparison of Single-Precision, D-Floating-Point, and G-Floating-Point

Preface

This manual describes the TOPS-10/TOPS-20 Common Math Library. At present, the library is included as part of each object-time system of each language that uses it. In the future, the library will be a separate entity as described in this manual.

Chapter 1 introduces the library routines and gives information on how they are described. A table of the routines, arranged in alphabetical order, is included for easy reference.
Chapters 2 through 15 contain the descriptions of the routines, grouped logically such that all like routines are together (e.g., all the square root routines are in Chapter 2).

Appendix A gives the results of the ELEFUNT tests and Appendix B describes error handling for MACRO programs.

Chapter 1
Introduction

1.1 The Math Library

The TOPS-10/TOPS-20 Common Math Library contains a set of routines that perform the following mathematical functions for several types of data.

• square root
• natural and base-10 logarithm
• exponential and exponentiation
• trigonometric
• inverse trigonometric
• hyperbolic
• random number generation
• absolute value
• data type conversion
• rounding and truncation
• product
• remainder
• positive difference
• transfer of sign
• maximum or minimum of a series
• complex conjugate
• complex multiplication or division

Most of the routines are functions; but some, notably the complex double-precision routines, are subroutines. The difference between the types of routines is the way in which they are called from a program. Consult the applicable language manual for more information.

The routines are listed alphabetically in Table 1-1 with a short description of each and a page reference.
Table 1-1: Math Library Routines

Routine   Page   Purpose
ABS       9-4    absolute value
ACOS      6-4    arc cosine
AIMAG     15-4   imaginary part of complex number
AINT      11-9   truncation to integer
ALOG      3-3    natural logarithm
ALOG10    3-5    base-10 logarithm
AMAX0     14-5   largest of a series
AMAX1     14-6   largest of a series
AMIN0     14-11  smallest of a series
AMIN1     14-12  smallest of a series
AMOD      12-6   remainder
ANINT     11-6   nearest whole number
ASIN      6-3    arc sine
ATAN      6-13   arc tangent
ATAN2     6-15   polar angle of a point in the x-y plane
CABS      9-7    complex absolute value
CCOS      5-21   complex cosine
CDABS     9-8    complex, double-precision, D-floating-point absolute value
CDCOS     5-25   complex, double-precision, D-floating-point cosine
CDEXP     4-11   complex, double-precision, D-floating-point exponential
CDLOG     3-17   complex, double-precision, D-floating-point natural logarithm
CDSIN     5-23   complex, double-precision, D-floating-point sine
CDSQRT    2-11   complex, double-precision, D-floating-point square root
CEXP      4-9    complex exponential
CEXP2.    4-22   exponentiation of a complex number to the power of an integer
CEXP3.    4-34   exponentiation of a complex number to the power of another complex number
CFDV      15-7   complex division
CFM       15-6   complex multiplication
CGABS     9-9    complex, double-precision, G-floating-point absolute value
CGCOS     5-29   complex, double-precision, G-floating-point cosine
CGEXP     4-13   complex, double-precision, G-floating-point exponential
CGLOG     3-19   complex, double-precision, G-floating-point natural logarithm
CGSIN     5-27   complex, double-precision, G-floating-point sine
CGSQRT    2-13   complex, double-precision, G-floating-point square root
CLOG      3-15   complex natural logarithm
CMPL.C    10-23  conversion of two complex numbers to one complex number
CMPL.D    10-21  conversion of two double-precision, D-floating-point numbers to complex format
CMPL.G    10-22  conversion of two double-precision, G-floating-point numbers to complex format
CMPL.I    10-19  conversion of two integers to complex format
CMPLX     10-20  conversion of two single-precision numbers to complex format
CONJ      15-5   complex conjugate
COS       5-7    cosine (angle in radians)
COSD      5-9    cosine (angle in degrees)
COSH      7-4    hyperbolic cosine
COTAN     5-33   cotangent
CSIN      5-19   complex sine
CSQRT     2-9    complex square root
DABS      9-5    double-precision, D-floating-point absolute value
DACOS     6-7    double-precision, D-floating-point arc cosine
DASIN     6-5    double-precision, D-floating-point arc sine
DATAN     6-17   double-precision, D-floating-point arc tangent
DATAN2    6-19   double-precision, D-floating-point polar angle of a point in the x-y plane
DBLE      10-12  conversion from single-precision to double-precision, D-floating-point format
DCOS      5-13   double-precision, D-floating-point cosine
DCOSH     7-7    double-precision, D-floating-point hyperbolic cosine
DCOTAN    5-37   double-precision, D-floating-point cotangent
DDIM      12-11  double-precision, D-floating-point positive difference
DEXP      4-5    double-precision, D-floating-point exponential
DEXP2.    4-18   exponentiation of a double-precision, D-floating-point number to the power of an integer
DEXP3.    4-28   exponentiation of a double-precision, D-floating-point number to the power of another double-precision, D-floating-point number
DFLOAT    10-11  conversion of an integer to double-precision, D-floating-point format
DIM       12-10  positive difference
DINT      11-10  double-precision, D-floating-point truncation
DLOG      3-7    double-precision, D-floating-point natural logarithm
DLOG10    3-9    double-precision, D-floating-point base-10 logarithm
DMAX1     14-7   double-precision, D-floating-point largest in a series
DMIN1     14-13  double-precision, D-floating-point smallest in a series
DMOD      12-7   double-precision, D-floating-point remainder
DNINT     11-7   double-precision, D-floating-point nearest whole number
DPROD     12-3   double-precision, D-floating-point product
DSIGN     13-5   double-precision, D-floating-point transfer of sign
DSIN      5-11   double-precision, D-floating-point sine
DSINH     7-5    double-precision, D-floating-point hyperbolic sine
DSQRT     2-5    double-precision, D-floating-point square root
DTAN      5-35   double-precision, D-floating-point tangent
DTANH     7-12   double-precision, D-floating-point hyperbolic tangent
DTOG      10-17  conversion of a double-precision, D-floating-point number to double-precision, G-floating-point format
DTOGA     10-18  conversion of an array of double-precision, D-floating-point numbers to double-precision, G-floating-point format
EXP       4-3    exponential
EXP1.     4-15   exponentiation of an integer to the power of another integer
EXP2.     4-16   exponentiation of a single-precision number to the power of an integer
EXP3.     4-25   exponentiation of a single-precision number to the power of another single-precision number
FLOAT     10-8   conversion of an integer to single-precision format
GABS      9-6    double-precision, G-floating-point absolute value
GACOS     6-11   double-precision, G-floating-point arc cosine
GASIN     6-9    double-precision, G-floating-point arc sine
GATAN     6-21   double-precision, G-floating-point arc tangent
GATAN2    6-23   double-precision, G-floating-point polar angle of a point in the x-y plane
GCOS      5-17   double-precision, G-floating-point cosine
GCOSH     7-10   double-precision, G-floating-point hyperbolic cosine
GCOTAN    5-41   double-precision, G-floating-point cotangent
GDB.n     10-16  conversion of a single-precision number to double-precision, G-floating-point format
GDIM      12-12  double-precision, G-floating-point positive difference
GEXP      4-7    double-precision, G-floating-point exponential
GEXP2.    4-20   exponentiation of a double-precision, G-floating-point number to the power of an integer
GEXP3.    4-31   exponentiation of a double-precision, G-floating-point number to the power of another double-precision, G-floating-point number
GFL.n     10-15  conversion of an integer to double-precision, G-floating-point format
GFX.n     10-6   conversion of a double-precision, G-floating-point number to integer format
GINT.     11-11  double-precision, G-floating-point truncation
GLOG      3-11   double-precision, G-floating-point natural logarithm
GLOG10    3-13   double-precision, G-floating-point base-10 logarithm
GMAX1     14-8   double-precision, G-floating-point largest of a series
GMIN1     14-14  double-precision, G-floating-point smallest of a series
GMOD      12-8   double-precision, G-floating-point remainder
GNINT.    11-8   double-precision, G-floating-point nearest whole number
GPROD.    12-4   double-precision, G-floating-point product
GSIGN     13-6   double-precision, G-floating-point transfer of sign
GSIN      5-15   double-precision, G-floating-point sine
GSINH     7-8    double-precision, G-floating-point hyperbolic sine
GSN.n     10-10  conversion of a double-precision, G-floating-point number to single-precision format
GSQRT     2-7    double-precision, G-floating-point square root
GTAN      5-39   double-precision, G-floating-point tangent
GTANH     7-13   double-precision, G-floating-point hyperbolic tangent
GTOD      10-13  conversion of a double-precision, G-floating-point number to double-precision, D-floating-point format
GTODA     10-14  conversion of an array of double-precision, G-floating-point numbers to double-precision, D-floating-point format
IABS      9-3    integer absolute value
IDIM      12-9   integer positive difference
IDINT     10-5   conversion of a double-precision, D-floating-point number to integer format
IDNINT    11-4   integer nearest whole number for a double-precision, D-floating-point number
IFIX      10-3   conversion of a single-precision number to integer format
IGNIN.    11-5   integer nearest whole number for a double-precision, G-floating-point number
INT       10-4   conversion of a single-precision number to integer format
ISIGN     13-3   integer transfer of sign
MAX0      14-3   largest of a series
MAX1      14-4   largest of a series
MIN0      14-9   smallest of a series
MIN1      14-10  smallest of a series
MOD       12-5   integer remainder
NINT      11-3   integer nearest whole number for a single-precision number
RAN       8-3    random number generator
RANS      8-5    random number generator with shuffling
REAL      10-7   conversion of an integer to single-precision format
REAL.C    15-3   real part of a complex number
SAVRAN    8-7    save the seed for the last random number generated
SETRAN    8-6    set the seed value for the random number generator
SIGN      13-4   transfer of sign
SIN       5-3    sine (angle in radians)
SIND      5-5    sine (angle in degrees)
SINH      7-3    hyperbolic sine
SNGL      10-9   conversion of a double-precision, D-floating-point number to single-precision format
SQRT      2-3    square root
TAN       5-31   tangent
TANH      7-11   hyperbolic tangent

The routines in this library are available to most of the languages available with TOPS-10 and TOPS-20. Consult the applicable language manual for specific information on how to use the Math Library.

Although all of the routines listed in Table 1-1 exist in the library, not all of them can be called from all languages. That is, some languages or compilers have restrictions that disallow calling of a particular routine from a user program. For example, the complex data type does not exist in PASCAL, so the routines that perform complex mathematics are never called by a PASCAL program.

However, a compiler may itself call a routine because a user program has a statement that necessitates use of a Math Library routine. For example, a FORTRAN program cannot call any of the routines whose names contain a period (.).
However, the compiler recognizes when a statement within a program requires use of one of those routines, and the compiler calls the appropriate routine. Similarly, a statement in an APL program may require a mathematical function, so the APL interpreter translates that statement into a call to the appropriate Math Library routine.

1.2 Math Symbols and Names Used in Equations

Throughout this manual, certain mathematical symbols and names are used to indicate values, quantities, actions, or states. These symbols and their meanings are listed below.

$=$              equal to
$+$              plus
$-$              minus
$\cdot$          multiplied by (used in equations)
$\times$         multiplied by (used in numbers)
$/$              divided by
$>$              greater than
$\ge$            greater than or equal to
$<$              less than
$\le$            less than or equal to
$\ne$            not equal to
$\sqrt{\ }$      square root
$\pi$            Pi (3.14159265358979323846264338327950)
$\pm$            plus or minus
$[\ ]$           greatest integer in
$|\ |$           absolute value
$\approx$        equals approximately
$x_k$            subscript
$x^k$            superscript or raised to the power
$\ln$            natural logarithm
$\log_{10}$      base-10 logarithm
$i$              imaginary number ($\sqrt{-1}$)
$e^x$            exponential
$\sin$           sine of an angle
$\cos$           cosine of an angle
$\tan$           tangent of an angle
$\cot$           cotangent of an angle
$\sin^{-1}$      arc sine
$\cos^{-1}$      arc cosine
$\tan^{-1}$      arc tangent
$\sinh$          hyperbolic sine
$\cosh$          hyperbolic cosine
$\tanh$          hyperbolic tangent
sgn              sign of
conj             complex conjugate

In addition, some equations use the names of routines to indicate a state or action. These routines and their meanings are as follows.

FLOAT   convert and round from an integer to a single-precision, floating-point number
INT     convert and truncate from a single-precision, floating-point number to an integer
MAX     largest of a series
MIN     smallest of a series
MOD     remainder

Each of these routines is described in detail in this manual.

Also, machine infinity (or infinity) is a term used to indicate the largest or smallest number representable in the machine.
+machine infinity = $377777777777_8$ for single-precision
                  = $377777777777,\ 377777777777_8$ for double-precision
-machine infinity = $400000000000_8$ for single-precision
                  = $400000000000,\ 000000000001_8$ for double-precision

1.3 Data Types and Their Precision

The Common Math Library routines can handle several data types: integer; single-precision, floating-point (also called real); double-precision, D-floating-point; double-precision, G-floating-point; complex; complex, double-precision, D-floating-point; and complex, double-precision, G-floating-point. Each data type is described in detail in one of the following sections.

1.3.1 Integer

An integer value is a string of one to eleven digits that represents a whole decimal number (a number without a fractional part). Integer values must be within the range of $-2^{35}$ to $+2^{35}-1$ (-34359738368 to +34359738367).

1.3.2 Single-Precision, Floating-Point

Single-precision, floating-point values may be of any size; however, each will be rounded to fit the precision of 27 bits (7 to 9 decimal digits).

Precision for single-precision, floating-point values is maintained to approximately eight significant digits; the absolute precision depends upon the numbers involved.

The range of magnitude permitted a single-precision, floating-point value is from approximately $1.47 \times 10^{-39}$ to $1.70 \times 10^{+38}$.

1.3.3 Double-Precision, D-Floating-Point

Double-precision, D-floating-point values are similar to single-precision, floating-point values; the differences between these two values are:

• Double-precision, D-floating-point values, depending on their magnitude, have precision of 62 bits, rather than the 27-bit precision obtained for single-precision, floating-point values.
• Each double-precision, D-floating-point value occupies two storage locations.

The range of magnitude permitted a double-precision, D-floating-point value is from approximately $1.47 \times 10^{-39}$ to $1.70 \times 10^{+38}$.
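The ranges quoted in Sections 1.3.1 through 1.3.3 can be checked with a few lines of arithmetic. The following Python sketch is illustrative only and is not part of the library. It assumes a 36-bit two's-complement word, an 8-bit excess-128 exponent, and a normalized fraction in [1/2, 1); the text above gives the ranges and bit counts but does not spell out that layout, and the name `float_range` is hypothetical.

```python
# Sketch only: reproduce the ranges quoted in Sections 1.3.1 to 1.3.3.
# Assumptions (not stated in the text): 36-bit two's-complement integers,
# exponent range -128..127, and a normalized fraction in [1/2, 1).
from fractions import Fraction

WORD_BITS = 36
int_min = -(2 ** (WORD_BITS - 1))
int_max = 2 ** (WORD_BITS - 1) - 1


def float_range(fraction_bits, exp_min=-128, exp_max=127):
    """Return (smallest, largest) positive magnitudes under the assumed layout."""
    smallest = Fraction(1, 2) * Fraction(2) ** exp_min
    largest = (1 - Fraction(1, 2 ** fraction_bits)) * Fraction(2) ** exp_max
    return float(smallest), float(largest)


single = float_range(27)    # 27-bit fraction
d_float = float_range(62)   # 62-bit fraction, same exponent range

print(int_min, int_max)
print("single:  %.3g to %.3g" % single)
print("D-float: %.3g to %.3g" % d_float)
```

Under these assumptions the single-precision and D-floating-point ranges agree to three significant digits, since only the fraction width differs.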
1.3.4 Double-Precision, G-Floating-Point [1]

Double-precision, G-floating-point values are similar to double-precision, D-floating-point values. They differ in:

• the number of bits of exponent
• the number of bits of mantissa
• the range of numbers they can represent
• the digits of precision

Table 1-2 summarizes the differences among single-precision and the two forms of double-precision.

Table 1-2: Comparison of Single-Precision, D-Floating-Point, and G-Floating-Point

                    Bits of    Bits of                                                          Digits of
                    Exponent   Mantissa   Range                                                 Precision
single-precision    8          27         $1.47 \times 10^{-39}$ to $1.70 \times 10^{+38}$      8.1
D-floating-point    8          62         $1.47 \times 10^{-39}$ to $1.70 \times 10^{+38}$      18.7
G-floating-point    11         59         $2.78 \times 10^{-309}$ to $8.99 \times 10^{+307}$    17.8

[1] Double-precision, G-floating-point data type is available only with TOPS-20 Version 5 (or later) on the DECSYSTEM-20 KL10 model B.

1.3.5 Complex

A complex value contains two numbers; it is assumed that the first (leftmost) value of the pair represents the real part of the number and that the second value represents the imaginary part of the number. The values that represent the real and imaginary parts of a complex value occupy two consecutive storage locations.

1.3.6 Complex, Double-Precision

You can use two types of complex, double-precision values: D-floating-point and G-floating-point. Both are assumed to be double-precision arrays with two elements. The first element is the real part, and the second element is the imaginary part.

1.4 Information About the Routines

Each routine described in this manual has the following information provided.
• A short description
• The names of other routines called by the routine
• The data type and range of the argument(s)
• The data type and range of the result
• The accuracy of the result
• The algorithm used to calculate the result
• A reference to any text used for information about the algorithm (where applicable)
• Any error conditions and the messages that result

Some additional information about the routines not included in each write-up is:

• Calling sequence
• Entry points
• Return location(s)
• Register usage

This information is described below. It is not included for each routine because it is identical for most routines and is relevant only for MACRO and BLISS users.

1.4.1 Calling Sequence

Most routines are called by an identical calling sequence. This calling sequence is:

        XMOVEI  L,ARG
        PUSHJ   P,routine-name

ARG is the address of the argument block. L is the pointer to the argument list for the routine; it is AC16. P is the stack pointer; it is AC17. Note that the contents of L (AC16) are not preserved.

For example, the SQRT routine is called by:

        XMOVEI  16,ARG
        PUSHJ   17,SQRT

Those routines called by a different calling sequence contain the calling sequence in their descriptions.

1.4.2 Entry Points

In most cases each routine has at least two entry points: its name and its name followed by a period. For example, SQRT and SQRT. are entry points for the SQRT routine. The name with the period is the one used by the FORTRAN compiler.

Some routines have additional entry points because they perform more than one function. Thus, one routine calculates both sine and cosine, so SIN, SIN., COS, and COS. are all entry points into that routine.

If you are calling a routine from a MACRO or BLISS program, you can use the name of the routine as the entry point; it will always work.

1.4.3 Return Location

The result of the calculation of most routines is returned to one or two registers.
For integer and single-precision results, the return location is register 0. For double-precision and complex (single-precision) results, the return locations are registers 0 and 1. For complex, double-precision results, the return location must be specified as the second argument included in the call to the routine. The requirements for the arguments included in the call are included with each write-up of the complex, double-precision routines.

1.4.4 Register Usage

All the routines have similar register usage. Some may use more registers than others, however. As stated above, registers 0 and 1 are used for the return locations; therefore the original contents of one or both are lost on return from a routine. These registers are also occasionally used to store the argument initially. Registers 2 through 15 are saved, used, and restored. The number of such registers used depends on the routine.

1.5 Accuracy Tests

Each routine contains a section headed "Accuracy of Result." The accuracy figures were obtained from the tests described below. These tests were run with typical values for arguments. There may be unusual arguments that could cause larger errors; for example, if you get too close to a threshold that could cause overflow or underflow, larger errors can occur.

The format of the accuracy section is as follows. Note that the elements are explained with the descriptions of the tests.

Accuracy of Result

test interval: 0.00000 through 1.0000
MRE: $1.55 \times 10^{-8}$ (25.9 bits)
RMS: $3.76 \times 10^{-9}$ (28.0 bits)
LSB error distribution:
    -2    -1     0    +1    +2
    0%    8%   83%    9%    0%

To test a routine, several representative intervals for each routine were chosen. Sample values were then chosen randomly from each interval, approximately 200,000 for single-precision and 20,000 for double-precision. Each routine was then called using these values. The relative error of each result was then obtained by the following equation.
$$\text{relative error} = \left| \frac{\text{actual exact result} - \text{result of routine}}{\text{actual exact result}} \right|$$

For example:

$$\left| \frac{\sin(x) - \mathrm{SIN}(x)}{\sin(x)} \right|$$

The test computed the maximum relative error (MRE) and the average relative error, called the root mean square (RMS).

To interpret the MRE and RMS, consider an "exact" routine, one that always returns an exact result rounded to machine precision. Such a routine would show a maximum relative error of $2^{-27}$ for single-precision; $2^{-62}$ for double-precision, D-floating-point; and $2^{-59}$ for double-precision, G-floating-point.

To make the MRE and RMS more understandable in terms of bits of accuracy, the tests also give the number of bits of accuracy by finding the negative base-2 logarithm of the MRE and RMS. For the "exact" routine, the negative base-2 logarithm of the MRE would be 27 for single-precision; 62 for double-precision, D-floating-point; and 59 for double-precision, G-floating-point. The negative base-2 logarithm of the RMS error from an "exact" routine would be about 28.3, 63.3, and 60.3, respectively. These numbers are slightly larger than those for the MRE because they reflect the RMS average of the "worst case" of exactness (only 27 or 62 or 59 bits correct) and the "best case" (infinite bits correct). Therefore, the closer the number of bits of accuracy of a routine approaches that of an "exact" routine, the more accurate the routine.

The accuracy figures for "exact" routines for the three levels of precision are as follows.
Single-Precision

test interval: 0.00000 through 8192.0
MRE: $7.44 \times 10^{-9}$ (27.0 bits)
RMS: $3.11 \times 10^{-9}$ (28.3 bits)
LSB error distribution:
    -2    -1     0    +1    +2
    0%    0%  100%    0%    0%

Double-Precision, D-Floating-Point

test interval: -infinity to +infinity
MRE: $2.17 \times 10^{-19}$ (62.0 bits)
RMS: $8.81 \times 10^{-20}$ (63.3 bits)
LSB error distribution:
    -2    -1     0    +1    +2
    0%    0%  100%    0%    0%

Double-Precision, G-Floating-Point

test interval: -infinity to +infinity
MRE: $1.73 \times 10^{-18}$ (59.0 bits)
RMS: $7.05 \times 10^{-19}$ (60.3 bits)
LSB error distribution:
    -2    -1     0    +1    +2
    0%    0%  100%    0%    0%

A second test compared the result of the routines with the exact result rounded to single- or double-precision. It counted the number of times the routine's result agreed exactly with the rounded exact result, the number of times they differed by ±1 bit, ±2 bits, and so on. The result of these comparisons is expressed as a percent of error distribution for the least significant bit (LSB).

Appendix A shows accuracy results derived from the ELEFUNT tests of W. J. Cody, Argonne National Laboratory. These tests show accuracy derived by testing carefully-chosen identities for each function. This appendix is provided for your information, not for comparison with the test results described above. Such a comparison would not be meaningful.

Chapter 2
Square Root Routines

SQRT

Description

The SQRT routine calculates the single-precision, floating-point square root of its single-precision, floating-point argument. That is:

$$\mathrm{SQRT}(x) = \sqrt{x} = x^{1/2}$$

Routines Called

SQRT calls the MTHERR routine.

Type of Argument

The argument must be a single-precision, floating-point value greater than or equal to 0.0.

Type of Result

The result returned is a single-precision, floating-point value greater than or equal to 0.0.

Accuracy of Result

test interval: 0.00000 through 8192.0
MRE: $8.09 \times 10^{-9}$ (26.9 bits)
RMS: $3.21 \times 10^{-9}$ (28.3 bits)
LSB error distribution:
    -2    -1     0    +1    +2
    0%    0%   98%    2%    0%

Algorithm Used

SQRT(x) is calculated as follows.
First the routine does a linear, single-precision approximation on the argument to provide an initial guess for $\sqrt{x}$. The routine then does two iterations of the Newton-Raphson method, which results in an answer that is correct to, but not always including, the last bit.

If $x < 0.0$
    SQRT(x) = SQRT(|x|)
If $x = 0.0$
    SQRT(x) = 0.0
If $x > 0.0$
    Let $x = 2^{2b} \cdot f$ where $.25 \le f < 1.0$
    then $\sqrt{x} = 2^{b} \cdot \sqrt{f}$
    and $Z_0 = 2^{b} \cdot (a \cdot f + b)$
    where the coefficients a and b inside the parentheses (this b is distinct from the exponent b) are:
    a = .82812500 if $.25 \le f < .5$
      = .58593750 if $.5 \le f < 1.0$
    b = .29722518 if $.25 \le f < .5$
      = .42060167 if $.5 \le f < 1.0$

The Newton-Raphson method, as applied to the SQRT function, yields the following iterative approximation.

$$Z_{k+1} = \tfrac{1}{2} \cdot \left( Z_k + x / Z_k \right)$$

$Z_{k+1}$ = the next iteration
$Z_k$ = the current iteration
$x$ = the number whose square root is being calculated
$Z_0$ = the initial approximation calculated by the linear approximation

Error Conditions

If the argument is negative, the following message is issued and the absolute value of the argument is used.

SQRT: Negative arg; result = SQRT(ABS(arg))

DSQRT

Description

The DSQRT routine calculates the double-precision, D-floating-point square root of its double-precision, D-floating-point argument. That is:

$$\mathrm{DSQRT}(x) = \sqrt{x} = x^{1/2}$$

Routines Called

DSQRT calls the MTHERR routine.

Type of Argument

The argument must be a double-precision, D-floating-point value greater than or equal to 0.0.

Type of Result

The result returned is a double-precision, D-floating-point value greater than or equal to 0.0.

Accuracy of Result

test interval: 0.00000 through 8192.0
MRE: $3.25 \times 10^{-19}$ (61.4 bits)
RMS: $1.23 \times 10^{-19}$ (62.8 bits)
LSB error distribution:
    -2    -1     0    +1    +2
    0%    0%   75%   25%    0%

Algorithm Used

DSQRT(x) is calculated as follows.

First the routine does a linear, single-precision approximation on the high-order word.
Then the routine does two single-precision iterations of the Newton-Raphson method, followed by two double-precision iterations of the Newton-Raphson method using a value derived from the linear approximation. The linear approximation is as follows.

If $x < 0.0$
    DSQRT(x) = DSQRT(|x|)
If $x = 0.0$
    DSQRT(x) = 0.0
If $x > 0.0$
    Let $x = 2^{2b} \cdot f$ where $.25 \le f < 1.0$
    then $\sqrt{x} = 2^{b} \cdot \sqrt{f}$
    and $Z_0 = 2^{b} \cdot (a \cdot f + b)$
    where the coefficients a and b inside the parentheses (this b is distinct from the exponent b) are:
    a = .82812500 if $.25 \le f < .5$
      = .58593750 if $.5 \le f < 1.0$
    b = .29722518 if $.25 \le f < .5$
      = .42060167 if $.5 \le f < 1.0$

The Newton-Raphson method yields the following iterative approximation.

$$Z_{k+1} = \tfrac{1}{2} \cdot \left( Z_k + x / Z_k \right)$$

$Z_{k+1}$ = the next iteration
$Z_k$ = the current iteration
$x$ = the number whose square root is being calculated
$Z_0$ = the initial approximation calculated by the linear approximation

For the single-precision approximations, x is truncated to single-precision and all calculations are done in single-precision. For the double-precision iterations, the full double-precision value of x is used, the current value of $Z_2$ is zero-extended to double-precision, and all remaining calculations are done in double-precision.

Error Conditions

If the argument is negative, the following message is issued and the absolute value of the argument is used.

DSQRT: Negative arg; result = DSQRT(ABS(arg))

GSQRT

Description

The GSQRT routine calculates the double-precision, G-floating-point square root of its double-precision, G-floating-point argument. That is:

$$\mathrm{GSQRT}(x) = \sqrt{x} = x^{1/2}$$

Routines Called

GSQRT calls the MTHERR routine.

Type of Argument

The argument must be a double-precision, G-floating-point value greater than or equal to 0.0.

Type of Result

The result returned is a double-precision, G-floating-point value greater than or equal to 0.0.
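The bit counts quoted in each "Accuracy of Result" section follow from the rule given in Section 1.5: take the negative base-2 logarithm of the error. The Python sketch below is illustrative only; it checks that rule against the DSQRT figures above and shows one way the MRE and RMS might be computed from paired samples. The function names `bits_of_accuracy` and `summarize` are hypothetical, and the root-mean-square formula is an assumption about what the manual's "RMS" denotes.

```python
# Sketch only: relate relative errors to "bits of accuracy" (Section 1.5).
import math


def bits_of_accuracy(relative_error):
    """Negative base-2 logarithm of a relative error."""
    return -math.log2(relative_error)


def summarize(exact, computed):
    """Return (MRE, RMS) of the relative errors of computed against exact."""
    rel = [abs((e - c) / e) for e, c in zip(exact, computed)]
    mre = max(rel)
    rms = math.sqrt(sum(r * r for r in rel) / len(rel))
    return mre, rms


# DSQRT figures quoted above: MRE 3.25e-19 and RMS 1.23e-19.
print(round(bits_of_accuracy(3.25e-19), 1))
print(round(bits_of_accuracy(1.23e-19), 1))

# An "exact" single-precision routine has MRE 2**-27.
print(bits_of_accuracy(2.0 ** -27))
```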
Accuracy of Result
    test interval: 0.00000 through 8192.0
    MRE: 2.60x10^-18 (58.4 bits)
    RMS: 9.87x10^-19 (59.8 bits)
    LSB error distribution:
        -2     -1      0     +1     +2
        0%     0%    75%    25%     0%

Algorithm Used
GSQRT(x) is calculated as follows. First the routine does a linear, single-precision approximation on the high-order word. Then the routine does two single-precision iterations of the Newton-Raphson method, followed by two double-precision iterations of the Newton-Raphson method using a value derived from the linear approximation. The linear approximation is as follows.

    If x < 0.0
        GSQRT(x) = GSQRT(|x|)
    If x = 0.0
        GSQRT(x) = 0.0
    If x > 0.0
        Let x = 2^(2b)·f where .25 ≤ f < 1.0
        then √x = 2^b·√f
        and Z_0 = 2^b·(a·f + b)
        a = .82812500 if .25 ≤ f < .5
        a = .58593750 if .5 ≤ f < 1.0
        b = .29722518 if .25 ≤ f < .5
        b = .42060167 if .5 ≤ f < 1.0

The Newton-Raphson method yields the following iterative approximation.

    Z_(k+1) = 1/2·(Z_k + x/Z_k)

    Z_(k+1) = the next iteration
    Z_k = the current iteration
    x = the number whose square root is being calculated
    Z_0 = the initial approximation calculated by the linear approximation

For the single-precision approximations, x is truncated to single-precision and all calculations are done in single-precision. For the double-precision iterations, the full double-precision value of x is used, the current value of Z_2 is zero-extended to double-precision, and all remaining calculations are done in double-precision.

Error Conditions
If the argument is negative, the following message is issued and the absolute value of the argument is used.

    GSQRT: Negative arg; result = GSQRT(ABS(arg))

CSQRT

Description
The CSQRT routine calculates the complex, single-precision square root of its complex, single-precision argument. That is:

    CSQRT(z) = √z = z^(1/2)

Routines Called
CSQRT calls the SQRT and MTHERR routines.
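The linear initial approximation and Newton-Raphson refinement described above for SQRT, DSQRT, and GSQRT can be sketched in Python as follows. This is an illustration under stated assumptions, not the library's code: it runs in IEEE double precision rather than the PDP-10 formats, it uses math.frexp to obtain the scaling, it reads the initial approximation as 2^b·(a·f + b), and the names linear_guess and sqrt_sketch are hypothetical.

```python
import math

def linear_guess(x):
    """Initial approximation Z_0 for sqrt(x), x > 0, using the manual's linear fit."""
    m, e = math.frexp(x)            # x = m * 2**e with 0.5 <= m < 1
    if e % 2:                       # force an even exponent so that x = 2**(2b) * f
        m, e = m / 2.0, e + 1       # now 0.25 <= m < 0.5
    f, b = m, e // 2
    if f < 0.5:
        slope, offset = 0.82812500, 0.29722518   # the manual's a and b for .25 <= f < .5
    else:
        slope, offset = 0.58593750, 0.42060167   # the manual's a and b for .5 <= f < 1.0
    return math.ldexp(slope * f + offset, b)     # 2**b * (a*f + b)

def sqrt_sketch(x, iterations=2):
    """Square root by a linear guess followed by Newton-Raphson steps."""
    if x == 0.0:
        return 0.0
    x = abs(x)                      # the manual uses |x| for negative arguments
    z = linear_guess(x)
    for _ in range(iterations):
        z = 0.5 * (z + x / z)       # Z_(k+1) = 1/2 * (Z_k + x/Z_k)
    return z
```

The linear guess is within about one percent of the true root, and each Newton-Raphson step roughly squares the relative error, which is why two single-precision steps suffice for SQRT and two further double-precision steps for DSQRT and GSQRT.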
Type of Argument
The argument must be a complex, single-precision, floating-point value; it can be any such value.

Type of Result
The result returned is a complex, single-precision, floating-point value, the real part of which is greater than or equal to 0.0.

Accuracy of Result
    test interval: -1000.0 through 1000.0 real
                   -1000.0 through 1000.0 imaginary
    MRE: 3.07x10^-8 (25.0 bits) real
         3.05x10^-8 (25.0 bits) imaginary
    RMS: 7.05x10^-9 (27.1 bits) real
         7.33x10^-9 (27.0 bits) imaginary
    LSB error distribution:
                     -2     -1      0     +1     +2
        real         2%    16%    59%    20%     2%
        imaginary    2%    19%    55%    20%     3%

Algorithm Used
CSQRT(z) is calculated as follows.

    Let z = x+i·y
    then CSQRT(z) = u+i·v, which is defined as follows.
    If x ≥ 0.0
        u = √((|x|+|z|)/2.0)
        v = y/(2.0·u)
    If x < 0.0 and y ≥ 0.0
        u = y/(2.0·v)
        v = √((|x|+|z|)/2.0)
    If x and y are both < 0.0
        u = y/(2.0·v)
        v = -√((|x|+|z|)/2.0)

The result is in the right half plane; that is, the polar angle of the result lies in the closed interval [-π/2, +π/2]. That is, the real part of the result is greater than or equal to 0.0.

Error Conditions
If the imaginary part of the input value is too small, underflow can occur on y/(2.0·u) or y/(2.0·v). If such underflow occurs, one of the following messages is issued and the relevant part of the result is set to 0.0.

    CSQRT: Real part underflow
    CSQRT: Imaginary part underflow

CDSQRT

Description
The CDSQRT subroutine calculates the complex, double-precision, D-floating-point square root of its complex, double-precision, D-floating-point argument. That is:

    CDSQRT(z,r) = √z = z^(1/2)

    z = location of input value
    r = location of result

Routines Called
CDSQRT calls the DSQRT and MTHERR routines.

Type of Arguments
CDSQRT is a subroutine that is called with two arguments. Both arguments must be two-element, double-precision vectors. The first vector (z) contains the input value; the second vector (r) will contain the result.
The real part of the input value must be stored in the first element of z; the imaginary part must be stored in the second element of z. The input value must be a complex, double-precision, D-floating-point value; it can be any such value.

Type of Result
The result returned is a complex, double-precision, D-floating-point value, the real part of which is greater than or equal to 0.0. It is returned in the second vector (r) supplied in the call. The real part of the result is returned in the first element of r; the imaginary part is returned in the second element of r.

Accuracy of Result
    test interval: -1000.0 through 1000.0 real
                   -1000.0 through 1000.0 imaginary
    MRE: 1.10x10^-18 (59.7 bits) real
         1.04x10^-18 (59.7 bits) imaginary
    RMS: 2.69x10^-19 (61.7 bits) real
         2.75x10^-19 (61.7 bits) imaginary
    LSB error distribution:
                     -2     -1      0     +1     +2
        real         4%    17%    43%    32%     5%
        imaginary    5%    24%    41%    25%     5%

Algorithm Used
CDSQRT is calculated as follows.

    Let z = x+i·y
    then CDSQRT(z) = u+i·v, which is defined as follows.
    If x ≥ 0.0
        u = √((|x|+|z|)/2.0)
        v = y/(2.0·u)
    If x < 0.0 and y ≥ 0.0
        u = y/(2.0·v)
        v = √((|x|+|z|)/2.0)
    If x and y are both < 0.0
        u = y/(2.0·v)
        v = -√((|x|+|z|)/2.0)

The result is in the right half plane; that is, the polar angle of the result lies in the closed interval [-π/2, +π/2]. That is, the real part of the result is greater than or equal to 0.0.

Error Conditions
If the imaginary part of the input value is too small, underflow can occur on y/(2.0·u) or y/(2.0·v). If such underflow occurs, one of the following messages is issued and the relevant part of the result is set to 0.0.

    CDSQRT: Real part underflow
    CDSQRT: Imaginary part underflow

CGSQRT

Description
The CGSQRT subroutine calculates the complex, double-precision, G-floating-point square root of its complex, double-precision, G-floating-point argument.
That is:

    CGSQRT(z,r) = √z = z^(1/2)

    z = location of input value
    r = location of result

Routines Called
CGSQRT calls the GSQRT and MTHERR routines.

Type of Argument
CGSQRT is a subroutine that is called with two arguments. Both arguments must be two-element, double-precision vectors. The first vector (z) contains the input value; the second vector (r) will contain the result. The real part of the input value must be stored in the first element of z; the imaginary part must be stored in the second element of z. The input value must be a complex, double-precision, G-floating-point value; it can be any such value.

Type of Result
The result returned is a complex, double-precision, G-floating-point value; it may be any such value. It is returned in the second vector (r) supplied in the call. The real part of the result is returned in the first element of r; the imaginary part is returned in the second element of r.

Accuracy of Result
    test interval: -1000.0 through 1000.0 real
                   -1000.0 through 1000.0 imaginary
    MRE: 8.61x10^-18 (56.7 bits) real
         8.78x10^-18 (56.7 bits) imaginary
    RMS: 2.16x10^-18 (58.7 bits) real
         2.21x10^-18 (58.7 bits) imaginary
    LSB error distribution:
                     -2     -1      0     +1     +2
        real         5%    16%    41%    32%     5%
        imaginary    5%    25%    40%    25%     5%

Algorithm Used
CGSQRT(z) is calculated as follows.

    Let z = x+i·y
    then CGSQRT(z) = u+i·v is defined as follows.
    If x ≥ 0.0
        u = √((|x|+|z|)/2.0)
        v = y/(2.0·u)
    If x < 0.0 and y ≥ 0.0
        u = y/(2.0·v)
        v = √((|x|+|z|)/2.0)
    If x and y are both < 0.0
        u = y/(2.0·v)
        v = -√((|x|+|z|)/2.0)

The result is in the right half plane; that is, the polar angle of the result lies in the closed interval [-π/2, +π/2].

Error Conditions
If the imaginary part of the argument is too small, underflow can occur on y/(2.0·u) or y/(2.0·v). If this occurs, one of the following messages is issued and the relevant part of the result is set to 0.0.
    CGSQRT: Real part underflow
    CGSQRT: Imaginary part underflow

Chapter 3
Logarithm Routines

ALOG

Description
The ALOG routine calculates the single-precision, floating-point natural logarithm of its argument. That is:

    ALOG(x) = loge(x)

Routines Called
ALOG calls the MTHERR routine.

Type of Argument
The argument must be a single-precision, floating-point value greater than 0.0.

Type of Result
The result returned is a single-precision, floating-point value in the range -89.415 to 88.029.

Accuracy of Result
    test interval: 1.46937x10^-39 through 256.00
    MRE: 1.84x10^-8 (25.7 bits)
    RMS: 5.21x10^-9 (27.5 bits)
    LSB error distribution:
        -2     -1      0     +1     +2
        0%     1%    81%    18%     0%

Algorithm Used
ALOG(x) is calculated as follows.

    If x = 0.0
        ALOG(x) = -machine infinity
    If x < 0.0
        ALOG(x) = ALOG(|x|)
    If x is close to 1.0
        ALOG(x) = L3·z^7 + L4·z^5 + L5·z^3 + L6·z
        z = (x-1)/(x+1)
        L3 = .301003281
        L4 = .39965794919
        L5 = .666669484507
        L6 = 2.0
    If x is not close to 1.0
        ALOG(x) = (k-.5)·loge(2) + loge(f·√2)
        x = 2^k·f
        loge(f·√2) = L3·z^7 + L4·z^5 + L5·z^3 + L6·z
        z = (f-√.5)/(f+√.5)

Reference
Hart et al., Computer Approximations (New York, N.Y.: John Wiley and Sons, 1968). The algorithm used is #2662, the coefficients are listed on page 193, and the range of validity is on page 111.

Error Conditions
1. If the argument is equal to 0.0, the following message is issued and the result is set to -machine infinity.

    ALOG: Arg is zero; result = -infinity.

2. If the argument is less than 0.0, the following message is issued and the absolute value of the argument is used.

    ALOG: Negative arg, result = ALOG(ABS(arg))

ALOG10

Description
The ALOG10 routine calculates the single-precision, floating-point base-10 logarithm of its single-precision, floating-point argument. That is:

    ALOG10(x) = log10(x)

Routines Called
ALOG10 calls the MTHERR routine.
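The ALOG algorithm described above can be sketched in Python as follows. This is a hedged illustration rather than the library's implementation: it runs in IEEE double precision, the manual does not say what "close to 1.0" means so the sketch assumes the interval from 1/√2 up to √2, math.inf stands in for machine infinity, and the names alog_sketch and _poly are hypothetical. The coefficients L3 through L6 are the ones listed in the manual.

```python
import math

# Coefficients as listed in the manual's description of ALOG.
L3, L4, L5, L6 = 0.301003281, 0.39965794919, 0.666669484507, 2.0
SQRT_HALF = math.sqrt(0.5)

def _poly(z):
    """L3*z**7 + L4*z**5 + L5*z**3 + L6*z, evaluated in Horner form on z*z."""
    w = z * z
    return z * (L6 + w * (L5 + w * (L4 + w * L3)))

def alog_sketch(x):
    """Natural logarithm following the case analysis given for ALOG."""
    if x == 0.0:
        return -math.inf            # stands in for "-machine infinity"
    x = abs(x)                      # negative arguments are replaced by |x|
    # Assumption: "close to 1.0" is taken to mean 1/sqrt(2) <= x < sqrt(2).
    if SQRT_HALF <= x < math.sqrt(2.0):
        return _poly((x - 1.0) / (x + 1.0))
    f, k = math.frexp(x)            # x = 2**k * f with 0.5 <= f < 1
    z = (f - SQRT_HALF) / (f + SQRT_HALF)
    return (k - 0.5) * math.log(2.0) + _poly(z)
```

In both branches z stays within roughly ±0.172, which is what lets a four-term odd polynomial reach single-precision accuracy. Multiplying the result by log10(e) gives the ALOG10 value.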
Type of Argument
The argument must be a single-precision, floating-point value greater than 0.0.

Type of Result
The result returned is a single-precision, floating-point value in the range -38.832 to 38.230.

Accuracy of Result
    test interval: 1.46937x10^-39 through 256.00
    MRE: 2.52x10^-8 (25.2 bits)
    RMS: 5.99x10^-9 (27.3 bits)
    LSB error distribution:
        -2     -1      0     +1     +2
        1%    19%    64%    15%     0%

Algorithm Used
ALOG10(x) is calculated as follows.

    If x = 0.0
        ALOG10(x) = -machine infinity
    If x < 0.0
        ALOG10(x) = ALOG10(|x|)
    If x is close to 1.0
        ALOG10(x) = loge(x)·log10(e)
        loge(x) = L3·z^7 + L4·z^5 + L5·z^3 + L6·z
        z = (x-1)/(x+1)
        L3 = .301003281
        L4 = .39965794919
        L5 = .666669484507
        L6 = 2.0
    If x is not close to 1.0
        ALOG10(x) = loge(x)·log10(e)
        x = 2^k·f
        loge(x) = (k-.5)·loge(2) + loge(f·√2)
        loge(f·√2) = L3·z^7 + L4·z^5 + L5·z^3 + L6·z
        z = (f-√.5)/(f+√.5)

Reference
Hart et al., Computer Approximations (New York, N.Y.: John Wiley and Sons, 1968). The algorithm used is #2662, the coefficients are listed on page 193, and the range of validity is on page 111.

Error Conditions
1. If the argument is 0.0, the following message is issued and the result is set to -machine infinity.

    ALOG10: Arg is zero; result = -infinity

2. If the argument is less than 0.0, the following message is issued and the absolute value of the argument is used.

    ALOG10: Negative arg; result = ALOG10(ABS(arg))

DLOG

Description
The DLOG routine calculates the double-precision, D-floating-point natural logarithm of its double-precision, D-floating-point argument. That is:

    DLOG(x) = loge(x)

Routines Called
DLOG calls the MTHERR routine.

Type of Argument
The argument must be a double-precision, D-floating-point value greater than 0.0.

Type of Result
The result returned is a double-precision, D-floating-point value in the range -89.415 to 88.029.
Accuracy of Result
    test interval: 1.46937x10^-39 through 256.00
    MRE: 9.78x10^-19 (59.8 bits)
    RMS: 3.03x10^-19 (61.5 bits)
    LSB error distribution:
        -2     -1      0     +1     +2
        1%    12%    51%    23%    13%

Algorithm Used
DLOG(x) is calculated as follows.

    If x = 0.0
        DLOG(x) = -machine infinity
    If x < 0.0
        DLOG(x) = DLOG(|x|)
    If x > 0.0
        x = 2^k·f where .5 < f < 1.0
        and g and n are defined so that f = 2^(-n)·g where 1/√2 ≤ g < √2
        Then DLOG(x) = (k-n)·loge(2) + loge(g)

loge(g) is evaluated by defining s = (g-1)/(g+1) and z = 2·s and then calculating loge(g) = loge((1+z/2)/(1-z/2)) using a minimax rational approximation.

Error Conditions
1. If the argument is equal to 0.0, the following message is issued and the result is set to -machine infinity.

    DLOG: Arg is zero; result = -infinity

2. If the argument is less than 0.0, the following message is issued and the absolute value of the argument is used.

    DLOG: Negative arg; result = DLOG(ABS(arg))

DLOG10

Description
The DLOG10 routine calculates the double-precision, D-floating-point base-10 logarithm of its double-precision, D-floating-point argument. That is:

    DLOG10(x) = log10(x)

Routines Called
DLOG10 calls the MTHERR routine.

Type of Argument
The argument must be a double-precision, D-floating-point value greater than 0.0.

Type of Result
The result returned is a double-precision, D-floating-point value in the range -38.832 to 38.230.

Accuracy of Result
    test interval: 1.46937x10^-39 through 256.00
    MRE: 1.20x10^-18 (59.5 bits)
    RMS: 3.65x10^-19 (61.2 bits)
    LSB error distribution:
        -2     -1      0     +1     +2     +3
        3%    17%    38%    26%    14%     2%

Algorithm Used
DLOG10(x) is calculated as follows.
    If x = 0.0
        DLOG10(x) = -machine infinity
    If x < 0.0
        DLOG10(x) = DLOG10(|x|)
    If x > 0.0
        x = 2^k·f where .5 < f < 1.0
        and g and n are defined so that f = 2^(-n)·g where 1/√2 ≤ g < √2
        Then DLOG10(x) = log10(e)·loge(x) = loge(x)/loge(10)

loge(g) is evaluated by defining s = (g-1)/(g+1) and z = 2·s and then calculating loge(g) = loge((1+z/2)/(1-z/2)) using a minimax rational approximation.

Error Conditions
1. If the argument is equal to 0.0, the following message is issued and the result is set to -machine infinity.

    DLOG10: Arg is zero; result = -infinity

2. If the argument is less than 0.0, the following message is issued and the absolute value of the argument is used.

    DLOG10: Negative arg; result = DLOG10(ABS(arg))

GLOG

Description
The GLOG routine calculates the double-precision, G-floating-point natural logarithm of its double-precision, G-floating-point argument. That is:

    GLOG(x) = loge(x)

Routines Called
GLOG calls the MTHERR routine.

Type of Argument
The argument must be a double-precision, G-floating-point value greater than 0.0.

Type of Result
The result returned is a double-precision, G-floating-point value in the range -710.475 to 709.089.

Accuracy of Result
    test interval: 0.00000 through 256.00
    MRE: 5.13x10^-18 (57.4 bits)
    RMS: 1.26x10^-18 (59.5 bits)
    LSB error distribution:
        -2     -1      0     +1     +2
        0%    10%    74%    16%     0%

Algorithm Used
GLOG(x) is calculated as follows.

    If x = 0.0
        GLOG(x) = -machine infinity
    If x < 0.0
        GLOG(x) = GLOG(|x|)
    If x > 0.0
        x = 2^k·f where .5 < f < 1.0
        and g and n are defined so that f = 2^(-n)·g where 1/√2 ≤ g < √2
        Then GLOG(x) = (k-n)·loge(2) + loge(g)

loge(g) is evaluated by defining s = (g-1)/(g+1) and z = 2·s and then calculating loge(g) = loge((1+z/2)/(1-z/2)) using a minimax rational approximation.

Error Conditions
1. If the argument is equal to 0.0, the following message is issued and the result is set to -machine infinity.
    GLOG: Arg is zero; result = -infinity

2. If the argument is negative, the following message is issued and the absolute value of the argument is used.

    GLOG: Negative arg; result = GLOG(ABS(arg))

GLOG10

Description
The GLOG10 routine calculates the double-precision, G-floating-point base-10 logarithm of its double-precision, G-floating-point argument. That is:

    GLOG10(x) = log10(x)

Routines Called
GLOG10 calls the MTHERR routine.

Type of Argument
The argument must be a double-precision, G-floating-point value greater than 0.0.

Type of Result
The result returned is a double-precision, G-floating-point value in the range -308.555 to 307.953.

Accuracy of Result
    test interval: 2.78134x10^-309 through 256.00
    MRE: 6.05x10^-18 (57.2 bits)
    RMS: 1.42x10^-18 (59.3 bits)
    LSB error distribution:
         0     +1     +2
       62%    18%     0%

Algorithm Used
GLOG10(x) is calculated as follows.

    If x = 0.0
        GLOG10(x) = -machine infinity
    If x < 0.0
        GLOG10(x) = GLOG10(|x|)
    If x > 0.0
        x = 2^k·f where .5 < f < 1.0
        and g and n are defined so that f = 2^(-n)·g where 1/√2 ≤ g < √2
        Then GLOG10(x) = log10(e)·loge(x) = loge(x)/loge(10)

loge(g) is evaluated by defining s = (g-1)/(g+1) and z = 2·s and then calculating loge(g) = loge((1+z/2)/(1-z/2)) using a minimax rational approximation.

Error Conditions
1. If the argument is equal to 0.0, the following message is issued and the result is set to -machine infinity.

    GLOG10: Arg is zero; result = -infinity

2. If the argument is negative, the following message is issued and the absolute value of the argument is used.

    GLOG10: Negative arg; result = GLOG10(ABS(arg))

CLOG

Description
The CLOG routine calculates the complex, single-precision, floating-point natural logarithm of its complex, single-precision, floating-point argument. That is:

    CLOG(z) = loge(z)

Routines Called
CLOG calls the ALOG, ATAN, ATAN2, and MTHERR routines.
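The argument reduction described above for DLOG, DLOG10, GLOG, and GLOG10 can be sketched in Python as follows. This is a rough illustration under stated assumptions: the manual evaluates loge(g) with a minimax rational approximation whose coefficients it does not list, so the sketch swaps in a truncated series for loge((1+s)/(1-s)); it runs in IEEE double precision; and the names log_g, dlog_sketch, and dlog10_sketch are hypothetical.

```python
import math

def log_g(g, terms=12):
    """ln(g) for 1/sqrt(2) <= g < sqrt(2), from ln((1+s)/(1-s)) = 2*(s + s**3/3 + ...).

    Stand-in only: the manual uses a minimax rational approximation here.
    """
    s = (g - 1.0) / (g + 1.0)
    s2 = s * s
    total, power = 0.0, s
    for i in range(terms):
        total += power / (2 * i + 1)
        power *= s2
    return 2.0 * total

def dlog_sketch(x):
    """Natural logarithm via x = 2**k * f, f = 2**-n * g, as in the manual."""
    if x == 0.0:
        return -math.inf            # stands in for "-machine infinity"
    x = abs(x)                      # negative arguments are replaced by |x|
    f, k = math.frexp(x)            # x = 2**k * f with 0.5 <= f < 1
    if f < math.sqrt(0.5):
        g, n = 2.0 * f, 1           # f = 2**-1 * g
    else:
        g, n = f, 0                 # f = 2**0 * g
    return (k - n) * math.log(2.0) + log_g(g)

def dlog10_sketch(x):
    """Base-10 logarithm as log10(e) * loge(x)."""
    return dlog_sketch(x) * math.log10(math.e)
```

Choosing n so that g lies between 1/√2 and √2 keeps |s| below about 0.172, so the approximation for loge(g) only has to be accurate on a narrow interval centred on g = 1.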
Type of Argument
The argument must be a complex, single-precision, floating-point value. The real and imaginary parts cannot both be equal to 0.0, although either one alone can be.

Type of Result
The result returned is a complex, single-precision, floating-point value. The real part of the result is in the range -89.415 to 88.029; the imaginary part is in the range -π to π.

Accuracy of Result
    test interval: -1000.0 through 1000.0 real
                   -100.00 through 100.00 imaginary
    MRE: 5.30x10^-5 (14.2 bits) real
         1.49x10^-8 (26.0 bits) imaginary
    RMS: 1.06x10^-7 (23.2 bits) real
         3.44x10^-9 (28.1 bits) imaginary
    LSB error distribution:
                    -4+     -3     -2     -1      0     +1     +2
        real         1%     1%     1%     6%    82%     7%     1%
        imaginary    0%     0%     0%     3%    94%     3%     0%

Algorithm Used
CLOG(z) is calculated as follows.

    Let z = x+i·y
    If x = 0.0 and y = 0.0
        CLOG(z) = (+infinity, 0.0)
    If x = 0.0 and y ≠ 0.0
        CLOG(z) = loge(|y|) + i·sgn(y)·π/2
    If x ≠ 0.0 and y = 0.0
        If x > 0.0
            CLOG(z) = loge(x) + i·0.0
        If x < 0.0
            CLOG(z) = loge(|x|) + i·π
    If x ≠ 0.0 and y ≠ 0.0
        CLOG(z) = u+i·v
        u = .5·loge(x^2+y^2)
        v = tan^-1(y/x)

Scaled values are calculated on occurrences of overflow/underflow for (x^2, y^2) or (x^2+y^2) and propagated to give a valid in-range result for u.

Error Conditions
1. If both parts of the argument equal 0.0, the following message is issued and the result is set to (+infinity, 0.0).

    CLOG: Arg is zero; result = (+infinity, zero)

2. If either part of the result underflows, one or both of the following messages are issued and the relevant part of the result is set to 0.0.

    CLOG: Real part underflow
    CLOG: Imaginary part underflow

CDLOG

Description
The CDLOG subroutine calculates the complex, double-precision, D-floating-point natural logarithm of its complex, double-precision, D-floating-point argument. That is:

    CDLOG(z,r) = loge(z)

    z = location of input value
    r = location of result

Routines Called
CDLOG calls the DLOG, DATAN, DATAN2, and MTHERR routines.
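The case analysis given above for CLOG, which CDLOG and CGLOG share, can be sketched in Python as follows. This is an illustration under stated assumptions, not the library's code: math.hypot stands in for the manual's scaling of x^2+y^2 against overflow and underflow, math.atan2 stands in for tan^-1(y/x) with quadrant handling (an assumption suggested by the ATAN2 entry under Routines Called), math.inf stands in for machine infinity, and the name clog_sketch is hypothetical.

```python
import math

def clog_sketch(x, y):
    """Return (u, v), the real and imaginary parts of the logarithm of x + i*y."""
    if x == 0.0 and y == 0.0:
        return (math.inf, 0.0)      # the result the manual documents for a zero argument
    if x == 0.0:
        # loge(|y|) + i*sgn(y)*pi/2
        return (math.log(abs(y)), math.copysign(math.pi / 2.0, y))
    if y == 0.0:
        # loge(|x|) + i*0.0 for x > 0, loge(|x|) + i*pi for x < 0
        return (math.log(abs(x)), 0.0 if x > 0.0 else math.pi)
    # General case: u = .5*loge(x**2 + y**2), computed without forming the squares.
    u = math.log(math.hypot(x, y))
    v = math.atan2(y, x)
    return (u, v)
```

For example, clog_sketch(1e200, 1e200) returns a finite real part even though x^2+y^2 would overflow if formed directly, which is the situation the manual's scaled values are meant to handle.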
Type of Argument
CDLOG is a subroutine that is called with two arguments. Both arguments must be two-element, double-precision vectors. The first vector (z) contains the input value; the second vector (r) will contain the result. The real part of the input value must be stored in the first element of z; the imaginary part must be stored in the second element of z. The input value must be a complex, double-precision, D-floating-point value. The real and imaginary parts cannot both be equal to 0.0, although either one alone can be.

Type of Result
The result returned is a complex, double-precision, D-floating-point value. The real part of the result is in the range -89.415 to 88.376; the imaginary part is in the range -π to π. The result is returned in the second vector (r) supplied in the call. The real part of the result is returned in the first element of r; the imaginary part is returned in the second element of r.

Accuracy of Result
    test interval: -1000.0 through 1000.0 real
                   -100.00 through 100.00 imaginary
    MRE: 9.07x10^-16 (50.0 bits) real
         5.09x10^-19 (60.8 bits) imaginary
    RMS: 1.59x10^-18 (59.1 bits) real
         1.04x10^-19 (63.1 bits) imaginary
    LSB error distribution:
                    -4+     -3     -2     -1      0     +1     +2
        real         1%     1%     1%     5%    84%     6%     1%
        imaginary    0%     0%     0%     4%    92%     4%     0%

Algorithm Used
CDLOG is calculated as follows.

    Let z = x+i·y
    If x = 0.0 and y = 0.0
        CDLOG(z) = (+infinity, 0.0)
    If x = 0.0 and y ≠ 0.0
        CDLOG(z) = loge(|y|) + i·sgn(y)·π/2
    If x ≠ 0.0 and y = 0.0
        If x > 0.0
            CDLOG(z) = loge(x) + i·0.0
        If x < 0.0
            CDLOG(z) = loge(|x|) + i·π
    If x ≠ 0.0 and y ≠ 0.0
        CDLOG(z) = u+i·v
        u = .5·loge(x^2+y^2)
        v = tan^-1(y/x)

Scaled values are calculated on occurrences of overflow/underflow for (x^2, y^2) or (x^2+y^2) and propagated to give a valid in-range result for u.

Error Conditions
1. If both parts of the argument equal 0.0, the following message is issued and the result is set to (+infinity, 0.0).

    CDLOG: Arg is zero; result = (+infinity, zero)

2.
If either part of the result underflows, one or both of the following messages are issued and the relevant part of the result is set to 0.0.

    CDLOG: Imaginary part underflow
    CDLOG: Real part underflow

CGLOG

Description
The CGLOG subroutine calculates the complex, double-precision, G-floating-point natural logarithm of its complex, double-precision, G-floating-point argument. That is:

    CGLOG(z,r) = loge(z)

    z = location of input value
    r = location of result

Routines Called
CGLOG calls the GLOG, GATAN, GATAN2, and MTHERR routines.

Type of Argument
CGLOG is a subroutine that is called with two arguments. Both arguments must be two-element, double-precision vectors. The first vector (z) contains the input value; the second vector (r) will contain the result. The real part of the input value must be stored in the first element of z; the imaginary part must be stored in the second element of z. The input value must be a complex, double-precision, G-floating-point value. The real and imaginary parts cannot both be equal to 0.0, although either one alone can be.

Type of Result
The result returned is a complex, double-precision, G-floating-point value. The real part of the result is in the range -710.475 to 709.436; the imaginary part is in the range -π to π. The result is returned in the second vector (r) supplied in the call. The real part of the result is returned in the first element of r; the imaginary part is returned in the second element of r.

Accuracy of Result
    test interval: -1000.0 through 1000.0 real
                   -100.00 through 100.00 imaginary
    MRE: 7.15x10^-15 (47.0 bits) real
         3.54x10^-18 (58.0 bits) imaginary
    RMS: 1.77x10^-17 (55.7 bits) real
         8.19x10^-19 (60.1 bits) imaginary
    LSB error distribution:
                    -4+     -3     -2     -1      0     +1     +2
        real         1%     0%     1%     5%    86%     6%     1%
        imaginary    0%     0%     0%     4%    92%     4%     0%

Algorithm Used
CGLOG(z) is calculated as follows.
    Let z = x+i·y
    If x = 0.0 and y = 0.0
        CGLOG(z) = +machine infinity
    If x = 0.0 and y ≠ 0.0
        CGLOG(z) = loge(|y|) + i·sgn(y)·π/2
    If x ≠ 0.0 and y = 0.0
        If x > 0.0
            CGLOG(z) = loge(x) + i·0.0
        If x < 0.0
            CGLOG(z) = loge(|x|) + i·π
    If x ≠ 0.0 and y ≠ 0.0
        CGLOG(z) = u+i·v
        u = .5·loge(x^2+y^2)
        v = tan^-1(y/x)

Scaled values are calculated on occurrence of overflow/underflow for (x^2, y^2) or (x^2+y^2) and propagated to give a valid in-range result for u.

Error Conditions
1. If both parts of the argument equal 0.0, the following message is issued and the result is set to (+machine infinity, 0.0).

    CGLOG: Arg is zero; result = (+infinity, zero)

2. If either part of the result underflows, one or both of the following messages are issued and the relevant part of the result is set to 0.0.

    CGLOG: Real part underflow
    CGLOG: Imaginary part underflow

Chapter 4
Exponential and Exponentiation Routines

EXP

Description
The EXP routine calculates the single-precision, floating-point exponential function of its single-precision, floating-point argument. That is:

    EXP(x) = e^x

Routines Called
EXP calls the MTHERR routine.

Type of Argument
The argument must be a single-precision, floating-point value in the range -89.4159863 to 88.0296919.

Type of Result
The result returned is a single-precision, floating-point value greater than zero.

Accuracy of Result
    test interval: -89.000 through 88.000
    MRE: 1.74x10^-8 (25.8 bits)
    RMS: 3.98x10^-9 (27.9 bits)
    LSB error distribution:
        -2     -1      0     +1     +2
        0%     2%    86%    12%     0%

Algorithm Used
EXP(x) is calculated as follows.
    If x < -89.4159863
        EXP(x) = 0.0
    If x > 88.0296919
        EXP(x) = +machine infinity
    Otherwise, the argument is reduced as follows:
        Let n = the nearest integer to x/loge(2)
        The reduced argument is:
            g = x - n·loge(2)
        The calculation is:
            EXP(x) = R(g)·2^(n+1)
            R(g) = .5 + g·p/(q - g·p)
            p = p1·g^2 + .25
            q = q1·g^2 + .5
            p1 = .00416028863
            q1 = .0499871789

Error Conditions
1. If the argument is less than -89.4159863, the following message is issued and the result is set to 0.0.

    EXP: Result underflow

2. If the argument is greater than 88.0296919, the following message is issued and the result is set to +machine infinity.

    EXP: Result overflow

DEXP

Description
The DEXP routine calculates the double-precision, D-floating-point exponential function of its double-precision, D-floating-point argument. That is:

    DEXP(x) = e^x

Routines Called
DEXP calls the MTHERR routine.

Type of Argument
The argument must be a double-precision, D-floating-point value in the range -89.415986292232944914 to 88.029691931113054295.

Type of Result
The result returned is a double-precision, D-floating-point value greater than zero.

Accuracy of Result
    test interval: -89.000 through 88.000
    MRE: 4.89x10^-19 (60.8 bits)
    RMS: 1.17x10^-19 (62.9 bits)
    LSB error distribution:
        -2     -1      0     +1     +2
        0%     2%    86%    12%     0%

Algorithm Used
DEXP(x) is calculated as follows.
    If x < -89.415986292232944914
        DEXP(x) = 0.0
    If x > 88.029691931113054295
        DEXP(x) = +machine infinity
    Otherwise, the argument is reduced as follows:
        Let x1 = [x], the greatest integer in x
            x2 = x - x1
            n = the nearest integer to x/loge(2)
        The reduced argument is:
            g = x1 - n·c1 + x2 + n·c2
            c1 = .543 (base 8)
            c2 = loge(2) - .543 (base 8)
        The calculation is:
            DEXP(x) = R(g)·2^(n+1)
            R(g) = .5 + g·p/(q - g·p)
            p = (((p2·g^2 + p1)·g^2) + p0)·g^2
            q = ((((q3·g^2 + q2)·g^2) + q1)·g^2) + q0
            p0 = .250
            p1 = .757531801594227767x10^-2
            p2 = .315551927656846464x10^-4
            q0 = .5
            q1 = .568173026985512218x10^-1
            q2 = .631218943743985036x10^-3
            q3 = .751040283998700461x10^-6

Error Conditions
1. If the argument is less than -89.415986292232944914, the following message is issued and the result is set to 0.0.

    DEXP: Result underflow

2. If the argument is greater than 88.029691931113054295, the following message is issued and the result is set to +machine infinity.

    DEXP: Result overflow

GEXP

Description
The GEXP routine calculates the double-precision, G-floating-point exponential function of its double-precision, G-floating-point argument. That is:

    GEXP(x) = e^x

Routines Called
GEXP calls the MTHERR routine.

Type of Argument
The argument must be a double-precision, G-floating-point value in the range -710.475860073943942 to 709.089565712824051.

Type of Result
The result returned is a double-precision, G-floating-point value greater than or equal to zero.

Accuracy of Result
    test interval: -89.000 through 88.000
    MRE: 3.99x10^-18 (57.8 bits)
    RMS: 9.40x10^-19 (59.9 bits)
    LSB error distribution:
        -2     -1      0     +1     +2
        0%     2%    85%    13%     0%

Algorithm Used
GEXP(x) is calculated as follows.
    If x ≤ -710.475860073943942
        GEXP(x) = 0.0
    If x > 709.089565712824051
        GEXP(x) = +machine infinity
    Otherwise, the argument is reduced as follows:
        Let x1 = [x], the greatest integer in x
            x2 = x - x1
            n = the nearest integer to x/loge(2)
        The reduced argument is:
            g = x1 - n·c1 + x2 + n·c2
            c1 = .543 (base 8)
            c2 = loge(2) - .543 (base 8)
        The calculation is:
            GEXP(x) = R(g)·2^(n+1)
            R(g) = .5 + g·p/(q - g·p)
            p = (((p2·g^2 + p1)·g^2) + p0)·g^2
            q = ((((q3·g^2 + q2)·g^2) + q1)·g^2) + q0
            p0 = .250
            p1 = .757531801594227767x10^-2
            p2 = .315551927656846464x10^-4
            q0 = .5
            q1 = .568173026985512218x10^-1
            q2 = .631218943743985036x10^-3
            q3 = .751040283998700461x10^-6

Error Conditions
1. If the argument is less than or equal to -710.475860073943942, the following message is issued and the result is set to 0.0.

    GEXP: Result underflow

2. If the argument is greater than 709.089565712824051, the following message is issued and the result is set to +machine infinity.

    GEXP: Result overflow

CEXP

Description
The CEXP routine calculates the complex, single-precision, floating-point exponential function of its complex, single-precision, floating-point argument. That is:

    CEXP(z) = e^z

Routines Called
CEXP calls the EXP, COS, SIN, and MTHERR routines.

Type of Argument
The argument must be a complex, single-precision, floating-point value in the range -89.4159863 to 176.0593838 for the real part and less than 823549.66 for the imaginary part.

Type of Result
The result returned is a complex, single-precision, floating-point value; it may be any such value.

Accuracy of Result
    test interval: -40.000 through 12.000 real
                   -10.000 through 157.08 imaginary
    MRE: 2.77x10^-8 (25.1 bits) real
         2.88x10^-8 (25.0 bits) imaginary
    RMS: 6.51x10^-9 (27.2 bits) real
         6.38x10^-9 (27.2 bits) imaginary
    LSB error distribution:
                     -2     -1      0     +1     +2
        real         1%    19%    58%    21%     1%
        imaginary    1%    17%    59%    23%     1%

Algorithm Used
CEXP(z) is calculated as follows.
    Let z = x+i·y
    If |y| > 823549.66
        CEXP(z) = (0.0,0.0)
    If x < -89.4159863
        CEXP(z) = (0.0,0.0)
    If x > 88.0296919 and y = 0.0
        CEXP(z) = (+infinity, 0.0)
    If 88.0296919 < x < 176.0593838 and a component of the result is out of range, that component is set to +infinity.
    If x > 176.0593838 and y ≠ 0.0
        CEXP(z) = (±infinity, ±infinity)
    Otherwise
        CEXP(z) = e^x·(cos(y)+i·sin(y))

Error Conditions
The following table gives the possible error conditions and the resulting error messages.

Error Conditions for CEXP

    Real Part of Argument   Imaginary Part of Argument   Result                           Error Message(s)
    Any value               > 823549.66                  (0.0,0.0)                        #1
    < -89.4159863           0.0                          (0.0,0.0)                        #2
    < -89.4159863           Not 0.0 and ≤ 823549.66      (0.0,0.0)                        #2 and #3
    Between -89.4159863     Not 0.0 and ≤ 823549.66      Underflow may occur on           None or #2 or #3
    and 88.0296919                                       neither, either, or both parts   or #2 and #3
    > 88.0296919            0.0                          (+infinity, 0.0)                 #4
    > 176.0593838           Not 0.0 and ≤ 823549.66      (±infinity, ±infinity)           #4 and #5
    Between 88.0296919      Not 0.0 and ≤ 823549.66      Overflow may occur on            None or #4 or #5
    and 176.0593838                                      neither, either, or both parts   or #4 and #5

Error Messages:
    1. CEXP: ABS(IMAG(arg)) too large; result = zero
    2. CEXP: Real part underflow
    3. CEXP: Imaginary part underflow
    4. CEXP: Real part overflow
    5. CEXP: Imaginary part overflow

CDEXP

Description
The CDEXP subroutine calculates the complex, double-precision, D-floating-point exponential function of its complex, double-precision, D-floating-point argument. That is:

    CDEXP(z,r) = e^z

    z = location of input value
    r = location of result

Routines Called
CDEXP calls the DEXP, DSIN, DCOS, and MTHERR routines.

Type of Argument
CDEXP is a subroutine that is called with two arguments. Both arguments must be two-element, double-precision vectors. The first vector (z) contains the input value; the second vector (r) will contain the result.
The real part of the input value must be stored in the first element of z; the imaginary part must be stored in the second element of z. The input value must be a complex, double-precision, D-floating-point value in the range -89.415986292232944914 to 176.059383862226109 for the real part and less than 6746518850.429 for the imaginary part.

Type of Result
The result returned is a complex, double-precision, D-floating-point value. It is returned in the second vector (r) supplied in the call. The real part of the result is returned in the first element of r; the imaginary part is returned in the second element of r.

Accuracy of Result
    test interval: -40.000 through 12.000 real
                   -10.000 through 157.08 imaginary
    MRE: 8.78x10^-19 (60.0 bits) real
         9.49x10^-19 (59.9 bits) imaginary
    RMS: 1.90x10^-19 (62.2 bits) real
         1.87x10^-19 (62.2 bits) imaginary
    LSB error distribution:
                     -2     -1      0     +1     +2
        real         1%    23%    57%    18%     1%
        imaginary    1%    20%    59%    19%     1%

Algorithm Used
CDEXP is calculated as follows.

    Let z = x+i·y
    If |y| > 6746518850.429
        CDEXP(z) = (0.0,0.0)
    If x < -89.415986292232944914
        CDEXP(z) = (0.0,0.0)
    If x > 88.029691931113054295 and y = 0.0
        CDEXP(z) = (+infinity, 0.0)
    If 88.029691931113054295 < x < 176.059383862226109 and a component of the result is out of range, that component is set to +infinity.
    If x > 176.059383862226109 and y ≠ 0.0
        CDEXP(z) = (±infinity, ±infinity)
    Otherwise
        CDEXP(z) = e^x·(cos(y)+i·sin(y))

Error Conditions
The following table gives the possible error conditions and the resulting error messages.
Error Conditions for CDEXP

Real Part of Argument | Imaginary Part of Argument | Result | Error Message(s)
Any value | > 6746518850.429 | (0.0, 0.0) | #1
< -89.415986292232944914 | 0.0 | (0.0, 0.0) | #2
< -89.415986292232944914 | Not 0.0 and ≤ 6746518850.429 | (0.0, 0.0) | #2 and #3
Between -89.415986292232944914 and 88.029691931113054295 | Not 0.0 and ≤ 6746518850.429 | Underflow may occur on neither, either, or both parts | None or #2 or #3 or #2 and #3
> 88.029691931113054295 | 0.0 | (+infinity, 0.0) | #4
> 176.059383862226109 | Not 0.0 and ≤ 6746518850.429 | (±infinity, ±infinity) | #4 and #5
Between 88.029691931113054295 and 176.059383862226109 | Not 0.0 and ≤ 6746518850.429 | Overflow may occur on neither, either, or both parts | None or #4 or #5 or #4 and #5

Error Messages:

1. CDEXP: ABS(IMAG(arg)) too large; result = zero
2. CDEXP: Real part underflow
3. CDEXP: Imaginary part underflow
4. CDEXP: REAL(arg) too large; REAL(result) = +infinity
5. CDEXP: REAL(arg) too large; IMAG(result) = +infinity

CGEXP

Description

The CGEXP subroutine calculates the complex, double-precision, G-floating-point exponential function of its complex, double-precision, G-floating-point argument. That is:

CGEXP(z,r)
   z = location of input value
   r = location of result = e^z

Routines Called

CGEXP calls the GEXP, GSIN, GCOS, and MTHERR routines.

Type of Argument

CGEXP is a subroutine that is called with two arguments. Both arguments must be two-element, double-precision vectors. The first vector (z) contains the input value; the second vector (r) will contain the result. The real part of the input value must be stored in the first element of z; the imaginary part must be stored in the second element of z. The input value must be a complex, double-precision, G-floating-point value in the range -710.475860073943942 to 1418.179131425648102 for the real part and less than 1686629713.065 for the imaginary part.

Type of Result

The result returned is a complex, double-precision, G-floating-point value.
It is returned in the second vector (r) supplied in the call. The real part of the result is returned in the first element of r; the imaginary part is returned in the second element of r.

Accuracy of Result

test interval: -40.000 through 12.000 real
               -10.000 through 157.08 imaginary
MRE: 6.50x10^-18 (57.1 bits) real
     6.67x10^-18 (57.1 bits) imaginary
RMS: 1.53x10^-18 (59.2 bits) real
     1.44x10^-18 (59.3 bits) imaginary

LSB error distribution:

Part | -2 | -1 | 0 | +1 | +2
real | 1% | 19% | 57% | 22% | 1%
imaginary | 0% | 16% | 60% | 22% | 1%

Algorithm Used

CGEXP(z) is calculated as follows.

Let z = x + i·y

If |y| > 1686629713.065, CGEXP(z) = (0.0, 0.0)
If x < -710.475860073943942, CGEXP(z) = (0.0, 0.0)
If x > 709.089565 and y = 0.0, CGEXP(z) = (+infinity, 0.0)
If 709.089565 < x < 1418.179131425648102 and a component of the result is out of range, that component is set to +infinity.
If x > 1418.179131425648102 and y ≠ 0.0, CGEXP(z) = (±infinity, ±infinity)
Otherwise, CGEXP(z) = e^x · (cos(y) + i·sin(y))

Error Conditions

The table below shows the possible values of the argument that could cause error conditions.

Error Conditions for CGEXP

Real Part of Argument | Imaginary Part of Argument | Result | Error Message(s)
Any value | > 1686629713.065 | (0.0, 0.0) | #1
< -710.475860073943942 | 0.0 | (0.0, 0.0) | #2
< -710.475860073943942 | Not 0.0 and ≤ 1686629713.065 | (0.0, 0.0) | #2 and #3
Between -710.475860073943942 and 709.089565 | Not 0.0 and ≤ 1686629713.065 | Underflow may occur on neither, either, or both parts | None or #2 or #3 or #2 and #3
> 709.089565 | 0.0 | (+infinity, 0.0) | #4
> 1418.179131425648102 | Not 0.0 and ≤ 1686629713.065 | (±infinity, ±infinity) | #4 and #5
Between 709.089565 and 1418.179131425648102 | Not 0.0 and ≤ 1686629713.065 | Overflow may occur on neither, either, or both parts | None or #4 or #5 or #4 and #5

Error Messages:

1. CGEXP: ABS(IMAG(arg)) too large; result = zero
2. CGEXP: Real part underflow
3. CGEXP: Imaginary part underflow
4. CGEXP: REAL(arg) too large; REAL(result) = +infinity
5.
CGEXP: REAL(arg) too large; IMAG(result) = +infinity

EXP1.

Description

The EXP1. routine raises one integer to the power of another integer. That is:

EXP1.(m,n) = m^n

Routines Called

EXP1. calls the MTHERR routine.

Type of Arguments

The two arguments must be integer values; they can be any such values.

Type of Result

The result returned is an integer value; it may be any such value.

Accuracy of Result

The result is exact.

Algorithm Used

EXP1.(m,n) is calculated as shown in the following table.

Calculations for EXP1.

Value of m | Value of n | Result
≠ 0 | 0 | 1
0 | 0 | 0
0 | > 0 | 0
0 | < 0 | +infinity
+1 | any value | 1
-1 | even | 1
-1 | odd | -1
≠ ±1 | < 0 | 0
≠ ±1 | > 0 | m^n

Error Conditions

1. If the exponent is too large a number, the following message is issued and the result is set to ±infinity.

   EXP1.: Result overflow

2. If both the base and the exponent are 0, the following message is issued and the result is set to 0.

   EXP1.: Zero**zero is indeterminate, result = zero

EXP2.

Description

The EXP2. routine raises a single-precision, floating-point number to the power of an integer. That is:

EXP2.(x,n) = x^n

Routines Called

EXP2. calls the MTHERR routine.

Type of Arguments

There are two arguments. The base must be a single-precision, floating-point value, and the exponent must be an integer value. They can be any such values.

Type of Result

The result returned is a single-precision, floating-point value; it may be any such value.
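The case table given for EXP1. earlier in this chapter can be read as a short decision procedure. The following Python sketch is only an illustration of that table, not the library's implementation: the name exp1 is hypothetical, math.inf stands in for the machine's "+infinity" integer result, and the overflow condition is not modeled because Python integers are unbounded.

```python
import math

def exp1(m: int, n: int):
    """Illustrative reading of the EXP1. case table (hypothetical helper)."""
    if m == 0:
        if n == 0:
            return 0                  # zero**zero: the manual returns zero with a message
        return 0 if n > 0 else math.inf   # 0**positive = 0, 0**negative = +infinity
    if n == 0:
        return 1                      # any nonzero base to the power 0
    if m == 1:
        return 1                      # +1 to any power
    if m == -1:
        return 1 if n % 2 == 0 else -1    # sign depends on the parity of n
    if n < 0:
        return 0                      # |m| > 1 with a negative exponent gives an integer 0
    return m ** n                     # general case (overflow not modeled here)
```

For instance, exp1(2, -3) takes the "≠ ±1, < 0" row of the table and returns 0, because the true value 1/8 has no integer representation.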
Accuracy of Result

test interval x | n | MRE | RMS
.50000 through 1.0000 | 2 | 7.45x10^-9 (27.0 bits) | 3.48x10^-9 (28.1 bits)
.50000 through 1.0000 | -5 | 3.07x10^-8 (25.0 bits) | 8.88x10^-9 (26.7 bits)
.50000 through 1.0000 | 9 | 5.53x10^-8 (24.1 bits) | 1.61x10^-8 (25.9 bits)
.50000 through 1.0000 | -12 | 7.91x10^-8 (23.6 bits) | 2.37x10^-8 (25.3 bits)
.50000 through 1.0000 | 15 | 9.08x10^-8 (23.4 bits) | 2.70x10^-8 (25.1 bits)
.50000 through 1.0000 | -20 | 1.27x10^-7 (22.9 bits) | 3.95x10^-8 (24.6 bits)
.50000 through 1.0000 | 40 | 2.65x10^-7 (21.8 bits) | 7.87x10^-8 (23.6 bits)
total | | 2.65x10^-7 (21.8 bits) | 3.67x10^-8 (24.7 bits)

LSB error distribution according to the value of n

n | -4+ | -3 | -2 | -1 | 0 | +1 | +2 | +3 | +4+
2 | 0% | 0% | 0% | 0% | 100% | 0% | 0% | 0% | 0%
-5 | 0% | 0% | 5% | 24% | 41% | 25% | 5% | 0% | 0%
9 | 1% | 4% | 13% | 21% | 23% | 21% | 13% | 4% | 1%
-12 | 7% | 8% | 13% | 15% | 15% | 15% | 12% | 8% | 7%
15 | 9% | 9% | 12% | 13% | 13% | 13% | 12% | 9% | 9%
-20 | 20% | 8% | 9% | 9% | 9% | 9% | 9% | 8% | 20%
40 | 34% | 4% | 5% | 5% | 5% | 5% | 5% | 5% | 34%
total | 10% | 5% | 8% | 12% | 29% | 12% | 8% | 5% | 10%

Algorithm Used

EXP2.(x,n) is calculated as shown in the following table.

Calculations for EXP2.

Value of x | Value of n | Result
≠ 0.0 | 0 | 1.0
0.0 | 0 | 0.0
0.0 | > 0 | 0.0
0.0 | < 0 | +infinity
> 0.0 | > 0 | x^n

Error Conditions

1. If the exponent has sufficiently large magnitude, overflow occurs in one of the following ways:

   Base | Exponent | Result
   > 1.0 | positive | +infinity
   < -1.0 | positive, even | +infinity
   < -1.0 | positive, odd | -infinity
   0.0 to 1.0 | negative | +infinity
   -1.0 to 0.0 | negative, even | +infinity
   -1.0 to 0.0 | negative, odd | -infinity

   and the following message is issued.

   EXP2.: Result overflow

2. If the exponent has sufficiently large magnitude, underflow occurs in one of the following ways:

   Magnitude of Base | Exponent | Result
   > 1.0 | negative | 0.0
   < 1.0 | positive | 0.0

   and the following message is issued.

   EXP2.: Result underflow

3. If both the exponent and the base are zero, the following message is issued and a result of zero is returned.
   EXP2.: Zero**zero is indeterminate, result = zero

DEXP2.

Description

The DEXP2. routine raises a double-precision, D-floating-point number to the power of an integer. That is:

DEXP2.(x,n) = x^n

Routines Called

DEXP2. calls the MTHERR routine.

Type of Arguments

There are two arguments. The base must be a double-precision, D-floating-point value, and the exponent must be an integer value. They can be any such values.

Type of Result

The result returned is a double-precision, D-floating-point value; it may be any such value.

Accuracy of Result

test interval x | n | MRE | RMS
.50000 through 1.0000 | 2 | 2.16x10^-19 (62.0 bits) | 1.01x10^-19 (63.1 bits)
.50000 through 1.0000 | -9 | 1.62x10^-18 (59.1 bits) | 4.72x10^-19 (60.9 bits)
.50000 through 1.0000 | 12 | 2.27x10^-18 (58.6 bits) | 6.79x10^-19 (60.4 bits)
.50000 through 1.0000 | 15 | 2.73x10^-18 (58.3 bits) | 7.89x10^-19 (60.1 bits)
.50000 through 1.0000 | -40 | 7.50x10^-18 (56.9 bits) | 2.31x10^-18 (58.6 bits)
total | | 7.50x10^-18 (56.9 bits) | 1.15x10^-18 (59.6 bits)

LSB error distribution according to the value of n

n | -4+ | -3 | -2 | -1 | 0 | +1 | +2 | +3 | +4+
2 | 0% | 0% | 0% | 0% | 100% | 0% | 0% | 0% | 0%
-9 | 1% | 4% | 12% | 20% | 23% | 20% | 12% | 5% | 2%
12 | 6% | 8% | 12% | 15% | 16% | 15% | 13% | 9% | 6%
15 | 9% | 9% | 12% | 13% | 13% | 13% | 12% | 9% | 9%
-40 | 34% | 4% | 5% | 4% | 5% | 5% | 4% | 4% | 34%
total | 10% | 5% | 8% | 11% | 31% | 11% | 8% | 5% | 10%

Algorithm Used

DEXP2.(x,n) is calculated as shown in the following table.

Calculations for DEXP2.

Value of x | Value of n | Result
≠ 0.0 | 0 | 1.0
0.0 | 0 | 0.0
0.0 | > 0 | 0.0
0.0 | < 0 | +infinity
> 0.0 | > 0 | x^n

Error Conditions

1. If the exponent has sufficiently large magnitude, overflow occurs in one of the following ways:

   Base | Exponent | Result
   > 1.0 | positive | +infinity
   < -1.0 | positive, even | +infinity
   < -1.0 | positive, odd | -infinity
   0.0 to 1.0 | negative | +infinity
   -1.0 to 0.0 | negative, even | +infinity
   -1.0 to 0.0 | negative, odd | -infinity

   and the following error message is issued.

   DEXP2.: Result overflow

2.
If the exponent has sufficiently large magnitude, underflow occurs in one of the following ways:

   Magnitude of Base | Exponent | Result
   > 1.0 | negative | 0.0
   < 1.0 | positive | 0.0

   and the following message is issued.

   DEXP2.: Result underflow

3. If both the exponent and the base are zero, the following message is issued and the result is set to zero.

   DEXP2.: Zero**zero is indeterminate, result = zero

GEXP2.

Description

The GEXP2. routine raises a double-precision, G-floating-point number to the power of an integer. That is:

GEXP2.(x,n) = x^n

Routines Called

GEXP2. calls the MTHERR routine.

Type of Arguments

There are two arguments. The base must be a double-precision, G-floating-point value; it can be any such value. The exponent must be an integer value; it can be any such value.

Type of Result

The result returned is a double-precision, G-floating-point value; it may be any such value.

Accuracy of Result

test interval x | n | MRE | RMS
.50000 through 1.0000 | 2 | 1.72x10^-18 (59.0 bits) | 8.11x10^-19 (60.1 bits)
.50000 through 1.0000 | -9 | 1.26x10^-17 (56.1 bits) | 3.79x10^-18 (57.9 bits)
.50000 through 1.0000 | 12 | 1.69x10^-17 (55.7 bits) | 5.45x10^-18 (57.3 bits)
.50000 through 1.0000 | 15 | 2.13x10^-17 (55.4 bits) | 6.27x10^-18 (57.1 bits)
.50000 through 1.0000 | -40 | 5.64x10^-17 (54.0 bits) | 1.85x10^-17 (55.6 bits)
total | | 5.64x10^-17 (54.0 bits) | 9.25x10^-18 (56.6 bits)

LSB error distribution according to the value of n

n | -4+ | -3 | -2 | -1 | 0 | +1 | +2 | +3 | +4+
2 | 0% | 0% | 0% | 0% | 100% | 0% | 0% | 0% | 0%
-9 | 2% | 5% | 12% | 21% | 23% | 20% | 12% | 4% | 1%
12 | 6% | 8% | 13% | 16% | 15% | 15% | 13% | 8% | 6%
15 | 9% | 9% | 12% | 13% | 14% | 13% | 12% | 9% | 9%
-40 | 34% | 4% | 4% | 5% | 4% | 5% | 5% | 4% | 34%
total | 10% | 5% | 8% | 11% | 31% | 10% | 8% | 5% | 10%

Algorithm Used

GEXP2.(x,n) is calculated as shown in the following table.

Calculations for GEXP2.

Value of x | Value of n | Result
≠ 0.0 | 0 | 1.0
0.0 | 0 | 0.0
0.0 | > 0 | 0.0
0.0 | < 0 | +infinity
> 0.0 | > 0 | x^n

Error Conditions

1.
If the exponent has sufficiently large magnitude, overflow occurs in one of the following ways:

   Base | Exponent | Result
   > 1.0 | positive | +infinity
   < -1.0 | positive, even | +infinity
   < -1.0 | positive, odd | -infinity
   0.0 to 1.0 | negative | +infinity
   -1.0 to 0.0 | negative, even | +infinity
   -1.0 to 0.0 | negative, odd | -infinity

   and the following error message is issued:

   GEXP2.: Result overflow

2. If the exponent has sufficiently large magnitude, underflow occurs in one of the following ways:

   Magnitude of Base | Exponent | Result
   > 1.0 | negative | 0.0
   < 1.0 | positive | 0.0

   and the following message is issued:

   GEXP2.: Result underflow

3. If both the exponent and the base are zero, the following message is issued and the result is set to zero.

   GEXP2.: Zero**zero is indeterminate, result = zero

CEXP2.

Description

The CEXP2. routine raises a complex, single-precision, floating-point number to the power of an integer. That is:

CEXP2.(z,n) = z^n

Routines Called

CEXP2. calls the CDLOG, DLOG, DSIN, DCOS, DEXP, and MTHERR routines.

Type of Arguments

There are two arguments. The base must be a complex, single-precision, floating-point value, and the exponent must be an integer. They can be any such values.

Type of Result

The result returned is a complex, single-precision, floating-point value; it may be any such value.

Accuracy of Result

test interval: .50000 through 1.0000 for z (real)
               .50000 through 1.0000 for z (imaginary)
               -10 through 20 for n
MRE: 7.45x10^-9 (27.0 bits) real
     7.45x10^-9 (27.0 bits) imaginary
RMS: 3.17x10^-9 (28.2 bits) real
     3.16x10^-9 (28.2 bits) imaginary

LSB error distribution:

Part | -2 | -1 | 0 | +1 | +2
real | 0% | 0% | 100% | 0% | 0%
imaginary | 0% | 0% | 100% | 0% | 0%

When the ratio of the imaginary part of the base to the real part is less than -10^10, one part of the result is less accurate. Which part is less accurate depends on the exponent.
For example:

test interval: -1.00000x10^-10 through -1.00000x10^-15 for z (real)
               -2.0000 through -1.0000 for z (imaginary)
               -1 for n

LSB error distribution:

Part | -2 | -1 | 0 | +1 | +2
real | 0% | 6% | 65% | 28% | 2%
imaginary | 0% | 0% | 100% | 0% | 0%

test interval: -1.00000x10^-10 through -1.00000x10^-15 for z (real)
               -2.0000 through -1.0000 for z (imaginary)
               2 for n

LSB error distribution:

Part | -2 | -1 | 0 | +1 | +2
real | 0% | 0% | 100% | 0% | 0%
imaginary | 6% | 27% | 60% | 8% | 0%

Algorithm Used

CEXP2.(z,n) is calculated as follows.

Let z = x + i·y

First the routine checks for the special cases shown in the following table.

Special Cases for CEXP2.

Value of x | Value of y | Value of n | Result
any value | any value | 1 | x + i·y
0.0 | 0.0 | < 0 | (+infinity, +infinity)
0.0 | 0.0 | 0 | (0.0, 0.0)
0.0 | 0.0 | > 0 | (0.0, 0.0)
not both 0.0 | | 0 | (1.0, 0.0)

If none of the special cases applies, the routine continues calculations as follows.

The CEXP2. function is evaluated as the complex exponential of n·(LNRHO + i·THETA).

LNRHO is the real part of: log_e(x + i·y)
THETA is the imaginary part of: log_e(x + i·y)

The real part of n·(LNRHO + i·THETA) is:

ALPHA = n·LNRHO

and the imaginary part is:

PHI = n·THETA

Since it is ultimately e^(i·PHI) that is needed, it would appear that sin(PHI) and cos(PHI) are needed. However, these functions will be multiplied by e^ALPHA, and the handling of exception boundaries on the product will be expedited by use of log_e(sin(PHI)) and log_e(cos(PHI)), which will be added to ALPHA before the call to the DEXP function. The absolute values of sin(PHI) and cos(PHI) are used as arguments of the CDLOG function; the signs of sin(PHI) and cos(PHI) are stored for use in determining the signs for the real and imaginary parts of the complex exponential, CEXP.
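The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the library's code: the function name cexp2_sketch is hypothetical, Python's cmath.log, math.log, and math.exp stand in for CDLOG, DLOG, and DEXP, the special cases (zero base, n = 0) are assumed to have been handled already, and no overflow, underflow, or argument-range checks are made.

```python
import cmath
import math

def cexp2_sketch(z: complex, n: int) -> complex:
    """Illustrative z**n via the log-domain product described in the text."""
    log_z = cmath.log(z)              # LNRHO + i*THETA (assumes z != 0)
    alpha = n * log_z.real            # ALPHA = n * LNRHO
    phi = n * log_z.imag              # PHI   = n * THETA
    cos_phi, sin_phi = math.cos(phi), math.sin(phi)

    def part(t: float) -> float:
        # sgn(t) * e^(ALPHA + log|t|): the magnitude is formed by adding logs,
        # so a single exponential call decides overflow or underflow.
        if t == 0.0:
            return 0.0                # log(0) is undefined; the product is zero
        return math.copysign(math.exp(alpha + math.log(abs(t))), t)

    return complex(part(cos_phi), part(sin_phi))
```

Adding log|cos(PHI)| or log|sin(PHI)| to ALPHA before exponentiating means that a result component which is representable is not lost merely because e^ALPHA alone would be out of range.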
The real part of the final result is:

sgn(cos(PHI)) · e^(ALPHA + log_e(|cos(PHI)|))

The imaginary part of the final result is:

sgn(sin(PHI)) · e^(ALPHA + log_e(|sin(PHI)|))

Error Conditions

The following error messages are returned for error conditions detected during the check for the special cases shown above. Other errors detected will result in error messages relating to the CEXP3. routine because CEXP2. is part of the CEXP3. routine.

1. If both the real and imaginary parts of the argument are zero and the exponent is also zero, the following message is issued and the result is set to (0.0, 0.0).

   CEXP2.: Zero**zero is indeterminate, result = zero

2. If both the real and imaginary parts of the argument are zero and the exponent is negative, the following message is issued and the result is set to (infinity, infinity).

   CEXP2.: Zero**negative exponent, result = infinity

3. If PHI ≥ 6746518852, argument reduction for sin/cos is impossible, so the following message is issued and the result is set to (+infinity, +infinity).

   CEXP2.: Both parts indeterminate

4. If the base and/or the exponent are such that one or both parts of the result overflow, one of the following messages is issued and the corresponding result is set to ±infinity.

   CEXP2.: Real part overflow
   CEXP2.: Imaginary part overflow
   CEXP2.: Both parts overflow

5. If the base and/or the exponent are such that one or both parts of the result underflow, one of the following messages is issued and the corresponding result is set to 0.0.

   CEXP2.: Real part underflow
   CEXP2.: Imaginary part underflow
   CEXP2.: Both parts underflow

EXP3.

Description

The EXP3. routine raises a single-precision, floating-point number to the power of another single-precision, floating-point number. That is:

EXP3.(x,y) = x^y

Routines Called

EXP3. calls the MTHERR routine.
Type of Arguments

There are two arguments; both must be single-precision, floating-point values. The base must not be less than zero unless the exponent is an integer. The base must not be equal to zero unless the exponent is greater than zero.

Type of Result

The result returned is a single-precision, floating-point value in the range 2^-129 to 2^127.

Accuracy of Result

test interval x | y | MRE | RMS
.50000 through 1.0000 | 5.1 | 1.52x10^-8 (26.0 bits) | 4.70x10^-9 (27.7 bits)
.50000 through 1.0000 | -10.1 | 1.86x10^-8 (25.7 bits) | 4.92x10^-9 (27.6 bits)
.50000 through 1.0000 | 15.1 | 2.27x10^-8 (25.4 bits) | 5.42x10^-9 (27.5 bits)
.50000 through 1.0000 | -20.1 | 3.14x10^-8 (24.9 bits) | 6.05x10^-9 (27.3 bits)
.50000 through 1.0000 | 30.1 | 3.90x10^-8 (24.6 bits) | 7.32x10^-9 (27.0 bits)
.50000 through 1.0000 | -50.1 | 6.18x10^-8 (23.9 bits) | 1.07x10^-8 (26.5 bits)
.50000 through 1.0000 | 80.1 | 9.04x10^-8 (23.4 bits) | 1.60x10^-8 (25.9 bits)
total | | 9.04x10^-8 (23.4 bits) | 8.74x10^-9 (26.8 bits)

LSB error distribution according to the value of y

y | -4+ | -3 | -2 | -1 | 0 | +1 | +2 | +3 | +4+
5.1 | 0% | 0% | 0% | 12% | 74% | 14% | 0% | 0% | 0%
-10.1 | 0% | 0% | 0% | 11% | 70% | 19% | 0% | 0% | 0%
15.1 | 0% | 0% | 0% | 18% | 66% | 16% | 0% | 0% | 0%
-20.1 | 0% | 0% | 0% | 14% | 61% | 24% | 1% | 0% | 0%
30.1 | 0% | 0% | 3% | 21% | 56% | 18% | 1% | 0% | 0%
-50.1 | 0% | 0% | 3% | 17% | 46% | 23% | 7% | 2% | 1%
80.1 | 4% | 4% | 9% | 19% | 36% | 19% | 6% | 2% | 1%
total | 1% | 1% | 2% | 16% | 58% | 19% | 2% | 1% | 0%

Algorithm Used

EXP3.(x,y) is calculated as follows.

First the routine checks for the special cases shown in the following table.

Special Cases for EXP3.
Value of x | Value of y | Result
0.0 | > 0.0 | 0.0
0.0 | 0.0 | 0.0
0.0 | < 0.0 | infinity
≠ 0.0 | 0.0 | 1.0
< 0.0 | odd integer | < 0.0
< 0.0 | even integer | > 0.0
< 0.0 | not integer | (-x)^y

Otherwise

x^y = 2^w, where w = y·log2(x)

log2(x) is calculated as follows:

x = 2^m · f, where .5 ≤ f < 1.0

Let p be an odd integer < 16 and let a = 2^(-p/16). Then select p to minimize |a - f|.

Now x = 2^m · a · (f/a)

Then log2(x) = m + log2(a) + log2(f/a)
or   log2(x) = m - p/16 + log2(f/a)

Let u1 = m - p/16
and u2 = log2(f/a) = log2((1+s)/(1-s))

Then log2(x) = u1 + u2
and s = (f-a)/(f+a)

A rational approximation is used to evaluate u2; u1 and u2 are then used to determine w1 and w2.

w = y·log2(x) = w1 + w2
and w1 = FLOAT(INT(w·16.0))/16.0 = m1 + p1/16

m1 and p1 are integers with 0 ≤ p1 ≤ 15

Finally, if -129 ≤ w < 127, EXP3.(x,y) = x^y = 2^w is reconstructed as:

EXP3.(x,y) = 2^w1 · 2^w2

2^w1 is evaluated by table lookup and 2^w2 is evaluated from another rational approximation.

Error Conditions

1. If the base is a negative value and the exponent is not an integer, the following message is issued and the calculation proceeds using the absolute value of the base.

   EXP3.: Negative base**non-integer; ABS(base) used

2. If the base is 0.0 and the exponent is negative, the following message is issued and the result is set to infinity.

   EXP3.: Zero**negative exponent; result = infinity

3. If both the base and the exponent are 0.0, the following message is issued and the result is set to 0.0.

   EXP3.: Zero**zero is indeterminate; result = zero

4. If y·log2(x) ≥ 127, the result overflows. Then the following message is issued and the result is set to -infinity if x is less than 0.0 and y is an odd integer. Otherwise, the result is set to +infinity.

   EXP3.: Result overflow

5. If y·log2(x) < -129, the result underflows. Then the following message is issued and the result is set to 0.0.

   EXP3.: Result underflow

DEXP3.

Description

The DEXP3.
routine raises a double-precision, D-floating-point number to the power of another double-precision, D-floating-point number. That is:

DEXP3.(x,y) = x^y

Routines Called

DEXP3. calls the MTHERR routine.

Type of Argument

There are two arguments; both must be double-precision, D-floating-point values. The base must not be less than zero unless the exponent is an integer. The base must not be equal to zero unless the exponent is greater than zero.

Type of Result

The result returned is a double-precision, D-floating-point value greater than or equal to 2^-129 and less than or equal to 2^127.

Accuracy of Result

test interval x | y | MRE | RMS
.50000 through 1.0000 | 5.1 | 5.23x10^-19 (60.7 bits) | 1.45x10^-19 (62.6 bits)
.50000 through 1.0000 | -10.1 | 5.50x10^-19 (60.7 bits) | 1.46x10^-19 (62.6 bits)
.50000 through 1.0000 | 20.1 | 9.07x10^-19 (59.9 bits) | 1.84x10^-19 (62.2 bits)
.50000 through 1.0000 | -50.1 | 1.97x10^-18 (58.8 bits) | 3.27x10^-19 (61.4 bits)
.50000 through 1.0000 | 80.1 | 3.02x10^-18 (58.2 bits) | 5.10x10^-19 (60.8 bits)
total | | 3.02x10^-18 (58.2 bits) | 2.98x10^-19 (61.5 bits)

LSB error distribution according to the value of y

y | -4+ | -3 | -2 | -1 | 0 | +1 | +2 | +3 | +4+
5.1 | 0% | 0% | 0% | 7% | 73% | 20% | 0% | 0% | 0%
-10.1 | 0% | 0% | 0% | 13% | 70% | 17% | 0% | 0% | 0%
20.1 | 0% | 0% | 0% | 11% | 63% | 25% | 1% | 0% | 0%
-50.1 | 1% | 2% | 6% | 19% | 46% | 21% | 4% | 1% | 0%
-80.1 | 1% | 2% | 5% | 16% | 35% | 22% | 10% | 5% | 5%
total | 0% | 1% | 2% | 13% | 57% | 21% | 3% | 1% | 1%

Algorithm Used

DEXP3.(x,y) is calculated as follows.

First the routine checks for the special cases shown in the following table.

Special Cases for DEXP3.
Value of x | Value of y | Result
0.0 | > 0.0 | 0.0
0.0 | 0.0 | 0.0
0.0 | < 0.0 | infinity
≠ 0.0 | 0.0 | 1.0
< 0.0 | odd integer | < 0.0
< 0.0 | even integer | > 0.0
< 0.0 | not integer | (-x)^y

Otherwise

x^y = 2^w, where w = y·log2(x)

log2(x) is calculated as follows:

x = 2^m · f, where .5 ≤ f < 1.0

Let p be an odd integer < 16 and let a = 2^(-p/16). Then select p to minimize |a - f|.

Now x = 2^m · a · (f/a)

Then log2(x) = m + log2(a) + log2(f/a)
or   log2(x) = m - p/16 + log2(f/a)

Let u1 = m - p/16
and u2 = log2(f/a) = log2((1+s)/(1-s))

Then log2(x) = u1 + u2
and s = (f-a)/(f+a)

A rational approximation is used to evaluate u2; u1 and u2 are then used to determine w1 and w2.

w = y·log2(x) = w1 + w2
and w1 = FLOAT(INT(w·16.0))/16.0 = m1 + p1/16

m1 and p1 are integers with 0 ≤ p1 ≤ 15

Finally, if -129 ≤ w < 127, DEXP3.(x,y) = x^y = 2^w is reconstructed as:

DEXP3.(x,y) = 2^w1 · 2^w2

2^w1 is evaluated by table lookup and 2^w2 is evaluated from another rational approximation.

Error Conditions

1. If the base is a negative value and the exponent is not an integer, the following message is issued and the calculation proceeds using the absolute value of the base.

   DEXP3.: Negative base**non-integer; ABS(base) used

2. If the base is 0.0 and the exponent is negative, the following message is issued and the result is set to infinity.

   DEXP3.: Zero**negative exponent; result = infinity

3. If both the base and the exponent are 0.0, the following message is issued and the result is set to 0.0.

   DEXP3.: Zero**zero is indeterminate; result = zero

4. If y·log2(x) ≥ 127, the result overflows. Then the following message is issued and the result is set to -infinity if x is less than 0.0 and y is an odd integer. Otherwise, the result is set to +infinity.

   DEXP3.: Result overflow

5. If y·log2(x) < -129, the result underflows. Then the following message is issued and the result is set to 0.0.

   DEXP3.: Result underflow

GEXP3.

Description

The GEXP3.
routine raises a double-precision, G-floating-point number to the power of another double-precision, G-floating-point number. That is:

GEXP3.(x,y) = x^y

Routines Called

GEXP3. calls the MTHERR routine.

Type of Arguments

There are two arguments; both must be double-precision, G-floating-point values. The base must not be less than zero unless the exponent is an integer. The base must not be equal to zero unless the exponent is greater than zero.

Type of Result

The result returned is a double-precision, G-floating-point value in the range 2^-1025 to 2^1023.

Accuracy of Result

test interval x | y | MRE | RMS
.50000 through 1.0000 | 5.10 | 3.69x10^-18 (57.9 bits) | 1.18x10^-18 (59.6 bits)
.50000 through 1.0000 | -10.10 | 4.91x10^-18 (57.5 bits) | 1.22x10^-18 (59.5 bits)
.50000 through 1.0000 | 20.10 | 7.92x10^-18 (56.8 bits) | 1.49x10^-18 (59.2 bits)
.50000 through 1.0000 | -50.10 | 1.46x10^-17 (55.9 bits) | 2.70x10^-18 (58.4 bits)
.50000 through 1.0000 | 80.10 | 2.17x10^-17 (55.4 bits) | 4.13x10^-18 (57.7 bits)
total | | 2.17x10^-17 (55.4 bits) | 2.43x10^-18 (58.5 bits)

LSB error distribution according to the value of y

y | -4+ | -3 | -2 | -1 | 0 | +1 | +2 | +3 | +4+
5.10 | 0% | 0% | 0% | 14% | 70% | 16% | 0% | 0% | 0%
-10.10 | 0% | 0% | 0% | 12% | 68% | 20% | 0% | 0% | 0%
20.10 | 0% | 0% | 1% | 19% | 60% | 19% | 1% | 0% | 0%
-50.10 | 0% | 1% | 4% | 17% | 43% | 24% | 7% | 2% | 1%
80.10 | 4% | 5% | 8% | 18% | 34% | 19% | 7% | 3% | 2%
total | 1% | 1% | 3% | 16% | 55% | 20% | 3% | 1% | 1%

Algorithm Used

GEXP3.(x,y) is calculated as follows.

First the routine checks for the special cases shown in the following table.

Special Cases for GEXP3.
Value of x | Value of y | Result
0.0 | > 0.0 | 0.0
0.0 | 0.0 | 0.0
0.0 | < 0.0 | infinity
≠ 0.0 | 0.0 | 1.0
< 0.0 | odd integer | < 0.0
< 0.0 | even integer | > 0.0
< 0.0 | not integer | (-x)^y

Otherwise

x^y = 2^w, where w = y·log2(x)

log2(x) is calculated as follows:

x = 2^m · f, where .5 ≤ f < 1.0

Let p be an odd integer < 16 and let a = 2^(-p/16). Then select p to minimize |a - f|.

Now x = 2^m · a · (f/a)

Then log2(x) = m + log2(a) + log2(f/a)
or   log2(x) = m - p/16 + log2(f/a)

Let u1 = m - p/16
and u2 = log2(f/a) = log2((1+s)/(1-s))

Then log2(x) = u1 + u2
and s = (f-a)/(f+a)

A rational approximation is used to evaluate u2; u1 and u2 are then used to determine w1 and w2.

w = y·log2(x) = w1 + w2
and w1 = FLOAT(INT(w·16.0))/16.0 = m1 + p1/16

m1 and p1 are integers with 0 ≤ p1 ≤ 15

Finally, if -1025 ≤ w < 1023, GEXP3.(x,y) = x^y = 2^w is reconstructed as:

GEXP3.(x,y) = 2^w1 · 2^w2

2^w1 is evaluated by table lookup and 2^w2 is evaluated from another rational approximation.

Error Conditions

1. If the base is a negative value and the exponent is not an integer, the following message is issued and the calculation proceeds using the absolute value of the base.

   GEXP3.: Negative base**non-integer; ABS(base) used

2. If the base is 0.0 and the exponent is negative, the following message is issued and the result is set to infinity.

   GEXP3.: Zero**negative exponent; result = infinity

3. If both the base and the exponent are 0.0, the following message is issued and the result is set to 0.0.

   GEXP3.: Zero**zero is indeterminate, result = zero

4. If y·log2(x) ≥ 1023, the result overflows. Then the following message is issued and the result is set to -infinity if x is less than 0.0 and y is an odd integer. Otherwise, the result is set to +infinity.

   GEXP3.: Result overflow

5. If y·log2(x) < -1025, the result underflows. Then the following message is issued and the result is set to 0.0.

   GEXP3.: Result underflow

CEXP3.

Description

The CEXP3.
routine raises a complex, single-precision, floating-point number to the power of another complex, single-precision, floating-point number. That is:

CEXP3.(z,g) = z^g

Routines Called

CEXP3. calls the CDLOG, DLOG, DSIN, DCOS, DEXP, and MTHERR routines.

Type of Arguments

There are two arguments; both must be complex, single-precision, floating-point values. They can be any such values.

Type of Result

The result returned is a complex, single-precision, floating-point value. It may be any such value.

Accuracy of Result

test interval: .50000 through 1.0000 for z (real)
               .50000 through 1.0000 for z (imaginary)
               -100.00 through 207.00 for g (real)
               -163.00 through 7.00 for g (imaginary)
MRE: 7.45x10^-9 (27.0 bits) real
     7.45x10^-9 (27.0 bits) imaginary
RMS: 3.17x10^-9 (28.2 bits) real
     3.17x10^-9 (28.2 bits) imaginary

LSB error distribution:

Part | -2 | -1 | 0 | +1 | +2
real | 0% | 0% | 100% | 0% | 0%
imaginary | 0% | 0% | 100% | 0% | 0%

When the ratio of the imaginary part of the base to the real part is less than -10^10, one part of the result is less accurate. Which part is less accurate depends on the exponent. For example:

test interval: -1.00000x10^-10 through -1.00000x10^-15 for z (real)
               -2.0000 through -1.0000 for z (imaginary)
               (-1,0) for g

LSB error distribution:

Part | -2 | -1 | 0 | +1 | +2
real | 0% | 6% | 65% | 28% | 2%
imaginary | 0% | 0% | 100% | 0% | 0%

test interval: -1.00000x10^-10 through -1.00000x10^-15 for z (real)
               -2.0000 through -1.0000 for z (imaginary)
               (2,0) for g

LSB error distribution:

Part | -2 | -1 | 0 | +1 | +2
real | 0% | 0% | 100% | 0% | 0%
imaginary | 6% | 27% | 60% | 8% | 0%

Algorithm Used

CEXP3.(z,g) is calculated as follows.

Let z = x + i·y
    g = a + i·b

First the routine checks for the special cases shown in the following table.

Special Cases for CEXP3.

Value of x | Value of y | Value of a | Result
0.0 | 0.0 | > 0.0 | (0.0, 0.0)
0.0 | 0.0 | < 0.0 | (+infinity, +infinity)
0.0 | 0.0 | 0.0 | (0.0, 0.0)

If none of the special cases applies, the routine continues calculation as follows.
If x and y ≠ 0, x + i·y is rewritten as e^(log_e(x + i·y)).

The CEXP3. function is evaluated as the complex exponential of (a + i·b)·(LNRHO + i·THETA).

LNRHO is the real part of: log_e(x + i·y)
THETA is the imaginary part of: log_e(x + i·y)

The real part of (a + i·b)·(LNRHO + i·THETA) is:

ALPHA = a·LNRHO - b·THETA

and the imaginary part is:

PHI = a·THETA + b·LNRHO

Since it is ultimately e^(i·PHI) that is needed, it would appear that sin(PHI) and cos(PHI) are needed. However, these functions will be multiplied by e^ALPHA, and the handling of exception boundaries on the product will be expedited by use of log_e(sin(PHI)) and log_e(cos(PHI)), which will be added to ALPHA before the call to the DEXP function. The absolute values of sin(PHI) and cos(PHI) are used as arguments of the CDLOG function; the signs of sin(PHI) and cos(PHI) are stored for use in determining the signs for the real and imaginary parts of the complex exponential, CEXP.

The real part of the final result is:

sgn(cos(PHI)) · e^(ALPHA + log_e(|cos(PHI)|))

The imaginary part of the final result is:

sgn(sin(PHI)) · e^(ALPHA + log_e(|sin(PHI)|))

Error Conditions

1. If both the real and imaginary parts of both arguments are 0.0, the following message is issued and the result is set to (0.0, 0.0).

   CEXP3.: Zero**zero is indeterminate; result = zero

2. If both the real and imaginary parts of the base are zero and the real part of the exponent is negative, the following message is issued and the result is set to (+infinity, +infinity).

   CEXP3.: Zero**(negative,non-zero) is indeterminate, result = (infinity,infinity)

3. If PHI ≥ 6746518852, argument reduction for sin/cos is impossible, so the following message is issued and the result is set to (+infinity, +infinity).

   CEXP3.: Both parts indeterminate

4. If the base and/or the exponent are such that one or both parts of the result overflow, one of the following messages is issued and the corresponding result is set to ±infinity.
   CEXP3.: Real part overflow
   CEXP3.: Imaginary part overflow
   CEXP3.: Both parts overflow

5. If the base and/or the exponent are such that one or both parts of the result underflow, one of the following messages is issued and the corresponding result is set to 0.0.

   CEXP3.: Real part underflow
   CEXP3.: Imaginary part underflow
   CEXP3.: Real and imaginary parts underflow

Chapter 5
Trigonometric Routines

SIN

Description

The SIN routine calculates the single-precision, floating-point sine of the single-precision, floating-point angle given in radians as the argument. That is:

SIN(x) = sin(x)

Routines Called

SIN calls the MTHERR routine.

Type of Argument

The argument must be a single-precision, floating-point value less than or equal to 210828714.

Type of Result

The result returned is a single-precision, floating-point value in the range -1.0 to 1.0.

Accuracy of Result

test interval: -10.000 through 201.06
MRE: 1.95x10^-8 (25.6 bits)
RMS: 3.87x10^-9 (27.9 bits)

LSB error distribution:

-2 | -1 | 0 | +1 | +2
0% | 12% | 78% | 10% | 0%

Algorithm Used

SIN(x) is calculated as follows. Note that SIN(x) = -SIN(-x).

Let |x| = π·n + f, |f| < π/2

The argument reduction is as follows.

n = the nearest integer to |x|/π

Then the reduced argument is:

f = |x| - π·n

If |f| < .863167530x10^-4, sin(f) = f
Otherwise, sin(f) = f + f·R(g)

g = f^2
R(g) = ((((r5·g + r4)·g + r3)·g + r2)·g + r1)·g

r1 = -.166666666
r2 = .833333072x10^-2
r3 = -.198408328x10^-3
r4 = .275239711x10^-5
r5 = -.238683464x10^-7

Finally, SIN(x) = sgn(x)·(-1)^n·sin(f)

Error Conditions

If the absolute value of the argument is greater than 210828714, the following message is issued and the result is set to 0.0.

   SIN: ABS(arg) too large; result = zero

SIND

Description

The SIND routine calculates the single-precision, floating-point sine of the single-precision, floating-point angle given in degrees as the argument.
That is:

    SIND(x) = sin(x)

Routines Called
SIND calls the MTHERR routine.

Type of Argument
The argument must be a single-precision, floating-point value less than or equal to 47185919.

Type of Result
The result returned is a single-precision, floating-point value in the range -1.0 to 1.0.

Accuracy of Result
    test interval: -1000.0 through 3600.0
    MRE: 1.95×10^-8 (25.6 bits)
    RMS: 4.11×10^-9 (27.9 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%   13%   73%   14%    0%

Algorithm Used
SIND(x) is calculated as follows. Note that SIND(x) = -SIND(-x).

Let |x| = 180·n + f, where |f| ≤ 90.

The argument reduction is as follows.

    n = the nearest integer to |x|/180

Then the reduced argument, converted to radians, is:

    f = (|x| - 180·n)·(π/180)

If |f| < .863167530×10^-4

    sin(f) = f

Otherwise

    sin(f) = f + f·R(g)
    g = f^2
    R(g) = ((((r5·g+r4)·g+r3)·g+r2)·g+r1)·g

    r1 = -.166666666
    r2 = .833333072×10^-2
    r3 = -.198408328×10^-3
    r4 = .275239711×10^-5
    r5 = -.238683464×10^-7

Finally

    SIND(x) = sgn(x)·(-1)^n·sin(f)

Error Conditions
If the absolute value of the argument is greater than 47185919, the following message is issued and the result is set to 0.0.

    SIND: ABS(arg) too large; result = zero

COS

Description
The COS routine calculates the single-precision, floating-point cosine of the single-precision, floating-point angle given in radians as the argument. That is:

    COS(x) = cos(x)

Routines Called
COS calls the MTHERR routine.

Type of Argument
The argument must be a single-precision, floating-point value less than 210828714.

Type of Result
The result returned is a single-precision, floating-point value in the range -1.0 to 1.0.

Accuracy of Result
    test interval: -10.000 through 201.06
    MRE: 1.86×10^-8 (25.7 bits)
    RMS: 4.26×10^-9 (27.8 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%   12%   70%   17%    0%

Algorithm Used
COS(x) is calculated as follows. Note that COS(x) = COS(-x).

Let |x| = π·n + f, where |f| < π/2.

The argument reduction is as follows.

    n = .5 + the nearest integer to |x|/π

Then the reduced argument is:

    f = |x| - π·n

If |f| < .863167530×10^-4

    sin(f) = f

Otherwise

    sin(f) = f + f·R(g)
    g = f^2
    R(g) = ((((r5·g+r4)·g+r3)·g+r2)·g+r1)·g

    r1 = -.166666666
    r2 = .833333072×10^-2
    r3 = -.198408328×10^-3
    r4 = .275239711×10^-5
    r5 = -.238683464×10^-7

Finally

    COS(x) = (-1)^(n+1)·sin(f)

Error Conditions
If the absolute value of the argument is greater than or equal to 210828714, the following message is issued and the result is set to 0.0.

    COS: ABS(arg) too large; result = zero

COSD

Description
The COSD routine calculates the single-precision, floating-point cosine of the single-precision, floating-point angle given in degrees as the argument. That is:

    COSD(x) = cos(x)

Routines Called
COSD calls the MTHERR routine.

Type of Argument
The argument must be a single-precision, floating-point value less than 47185919.

Type of Result
The result returned is a single-precision, floating-point value in the range -1.0 to 1.0.

Accuracy of Result
    test interval: -1000.0 through 3600.0
    MRE: 1.75×10^-8 (25.8 bits)
    RMS: 4.20×10^-9 (27.8 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%   12%   72%   16%    0%

Algorithm Used
COSD(x) is calculated as follows. Note that COSD(x) = COSD(-x).

Let |x| = 180·n + f, where |f| ≤ 90.

The argument reduction is:

    n = .5 + the nearest integer to |x|/180

Then the reduced argument, converted to radians, is:

    f = (|x| - 180·n)·(π/180)

If |f| < .863167530×10^-4

    sin(f) = f

Otherwise

    sin(f) = f + f·R(g)
    g = f^2
    R(g) = ((((r5·g+r4)·g+r3)·g+r2)·g+r1)·g

    r1 = -.166666666
    r2 = .833333072×10^-2
    r3 = -.198408328×10^-3
    r4 = .275239711×10^-5
    r5 = -.238683464×10^-7

Finally

    COSD(x) = (-1)^(n+1)·sin(f)

Error Conditions
If the absolute value of the argument is greater than or equal to 47185919, the following message is issued and the result is set to 0.0.

    COSD: ABS(arg) too large; result = zero

DSIN

Description
The DSIN routine calculates the double-precision, D-floating-point sine of the double-precision, D-floating-point angle given in radians as the argument. That is:

    DSIN(x) = sin(x)

Routines Called
DSIN calls the MTHERR routine.

Type of Argument
The argument must be a double-precision, D-floating-point value less than or equal to 6746518852 (or 2^31·π).

Type of Result
The result returned is a double-precision, D-floating-point value in the range -1.0 to 1.0.

Accuracy of Result
    test interval: -10.000 through 201.06
    MRE: 6.06×10^-19 (60.5 bits)
    RMS: 1.35×10^-19 (62.7 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%   22%   68%   10%    0%

Algorithm Used
DSIN(x) is calculated as follows. Note that DSIN(x) = -DSIN(-x).

Let |x| = π·n + f, where |f| < π/2.

The argument reduction is as follows.

    f = ((|x| - n·c1) - n·c2) - n·c3

    c1 = high-order 34 bits of π
    c2 = next 31 bits of π
    c3 = next 62 bits of π

If |f| < 2^-31

    sin(f) = f

Otherwise

    sin(f) = f + f·R(g)
    g = f^2
    R(g) = (g·XNUM/XDEN + rp1)·g
    XNUM = ((rp5·g+rp4)·g+rp3)·g+rp2
    XDEN = ((g+q2)·g+q1)·g+q0

    rp1 = -.166666666666666667
    rp2 = .451456904704461990×10^5
    rp3 = -.489487151969463797×10^3
    rp4 = .428183075897778265×10^1
    rp5 = -.121560740596710190×10^-1
    q0 = .541748285645351853×10^7
    q1 = .702492288221842518×10^5
    q2 = .394924723520450141×10^3

Finally

    DSIN(x) = sgn(x)·(-1)^n·sin(f)

Error Conditions
If the absolute value of the argument is greater than 6746518852, the following message is issued and the result is set to 0.0.

    DSIN: ABS(arg) too large; result = zero

DCOS

Description
The DCOS routine calculates the double-precision, D-floating-point cosine of the double-precision, D-floating-point angle given in radians as the argument. That is:

    DCOS(x) = cos(x)

Routines Called
DCOS calls the MTHERR routine.
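
The three-constant subtraction described above for DSIN can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the library's code: the split of π uses two 24-bit pieces plus a remainder so that the products n·c1 and n·c2 are exact in IEEE double precision, rather than the 34/31/62-bit split of the D-floating format, and the names split_pi, reduce_arg, and sin_via_reduction are hypothetical.

```python
import math
from fractions import Fraction

# pi to 50 decimal places, held exactly as a rational number.
PI = Fraction("3.14159265358979323846264338327950288419716939937510")

def split_pi():
    """Split pi into c1 + c2 + c3. c1 and c2 are short (24 bits each),
    so n*c1 and n*c2 are exact in IEEE double for n below 2**29."""
    c1 = Fraction(math.floor(PI * 2**22), 2**22)          # leading 24 bits
    c2 = Fraction(math.floor((PI - c1) * 2**46), 2**46)   # next 24 bits
    c3 = PI - c1 - c2                                     # remainder
    return float(c1), float(c2), float(c3)

C1, C2, C3 = split_pi()

def reduce_arg(x):
    """Return (n, f) with |x| = pi*n + f and |f| close to at most pi/2."""
    ax = abs(x)
    n = round(ax / math.pi)
    # Subtract pi*n piece by piece, largest piece first.
    f = ((ax - n * C1) - n * C2) - n * C3
    return n, f

def sin_via_reduction(x):
    """sgn(x) * (-1)**n * sin(f), with math.sin standing in for the
    library's polynomial on the reduced argument."""
    n, f = reduce_arg(x)
    sign = -1.0 if x < 0 else 1.0
    return sign * (-1.0) ** n * math.sin(f)
```

Subtracting the leading pieces first keeps the large cancellation exact, so the rounding error is confined to the small final term n·c3.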

Type of Argument
The argument must be a double-precision, D-floating-point value less than 6746518852 (or 2^31·π).

Type of Result
The result returned is a double-precision, D-floating-point value in the range -1.0 to 1.0.

Accuracy of Result
    test interval: -10.000 through 201.06
    MRE: 4.96×10^-19 (60.8 bits)
    RMS: 1.41×10^-19 (62.6 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%   16%   66%   18%    0%

Algorithm Used
DCOS(x) is calculated as follows. Note that DCOS(x) = DCOS(-x).

Let |x| = π·n + f, where |f| < π/2.

The argument reduction is as follows.

    f = ((|x| - n·c1) - n·c2) - n·c3

    c1 = high-order 34 bits of π
    c2 = next 31 bits of π
    c3 = next 62 bits of π

If |f| < 2^-31

    sin(f) = f

Otherwise

    sin(f) = f + f·R(g)
    g = f^2
    R(g) = (g·XNUM/XDEN + rp1)·g
    XNUM = ((rp5·g+rp4)·g+rp3)·g+rp2
    XDEN = ((g+q2)·g+q1)·g+q0

    rp1 = -.166666666666666667
    rp2 = .451456904704461990×10^5
    rp3 = -.489487151969463797×10^3
    rp4 = .428183075897778265×10^1
    rp5 = -.121560740596710190×10^-1
    q0 = .541748285645351853×10^7
    q1 = .702492288221842518×10^5
    q2 = .394924723520450141×10^3

Finally

    DCOS(x) = (-1)^(n+1)·sin(f)

Error Conditions
If the absolute value of the argument is greater than or equal to 6746518852, the following message is issued and the result is set to 0.0.

    DCOS: ABS(arg) too large; result = zero

GSIN

Description
The GSIN routine calculates the double-precision, G-floating-point sine of the double-precision, G-floating-point angle given in radians as the argument. That is:

    GSIN(x) = sin(x)

Routines Called
GSIN calls the MTHERR routine.

Type of Argument
The argument must be a double-precision, G-floating-point value less than or equal to 1686629713 (or 2^29·π).

Type of Result
The result returned is a double-precision, G-floating-point value in the range -1.0 to 1.0.
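
As a check on the rational approximation given above for DSIN and DCOS, the following Python sketch evaluates sin(f) = f + f·R(g) with the rp and q coefficients. Two readings are assumptions: the decimal exponents are taken as 10^5, 10^3, 10^1, and 10^-1 for rp2 through rp5 and as 10^7, 10^5, and 10^3 for q0 through q2, and XDEN is read as ((g+q2)·g+q1)·g+q0. These are the readings for which R(g) agrees with the power series of (sin(f) - f)/f. The function name sin_reduced is hypothetical.

```python
import math

# Coefficients as read from the DSIN/DCOS description (exponents assumed).
RP1 = -0.166666666666666667
RP2 = 0.451456904704461990e5
RP3 = -0.489487151969463797e3
RP4 = 0.428183075897778265e1
RP5 = -0.121560740596710190e-1
Q0 = 0.541748285645351853e7
Q1 = 0.702492288221842518e5
Q2 = 0.394924723520450141e3

def sin_reduced(f):
    """sin(f) for a reduced argument |f| <= pi/2, via f + f*R(g)."""
    if abs(f) < 2.0 ** -31:
        return f                      # sin(f) = f to working precision
    g = f * f
    xnum = ((RP5 * g + RP4) * g + RP3) * g + RP2
    xden = ((g + Q2) * g + Q1) * g + Q0   # assumed monic cubic in g
    r = (g * xnum / xden + RP1) * g
    return f + f * r
```

In IEEE double precision this agrees with math.sin to within rounding over the reduced range; the coefficients themselves were chosen for the wider D-floating and G-floating formats.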

Accuracy of Result
    test interval: -10.000 through 201.06
    MRE: 3.30×10^-18 (58.1 bits)
    RMS: 8.85×10^-19 (60.0 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%   13%   78%    9%    0%

Algorithm Used
GSIN(x) is calculated as follows. Note that GSIN(x) = -GSIN(-x).

Let |x| = π·n + f, where |f| < π/2.

The argument reduction is as follows.

    f = ((|x| - n·c1) - n·c2) - n·c3

    c1 = high-order 30 bits of π
    c2 = next 28 bits of π
    c3 = next 62 bits of π

If |f| < 2^-30

    sin(f) = f

Otherwise

    sin(f) = f + f·R(g)
    g = f^2
    R(g) = (g·XNUM/XDEN + rp1)·g
    XNUM = ((rp5·g+rp4)·g+rp3)·g+rp2
    XDEN = ((g+q2)·g+q1)·g+q0

    rp1 = -.166666666666666667
    rp2 = .451456904704461990×10^5
    rp3 = -.489487151969463797×10^3
    rp4 = .428183075897778265×10^1
    rp5 = -.121560740596710190×10^-1
    q0 = .541748285645351853×10^7
    q1 = .702492288221842518×10^5
    q2 = .394924723520450141×10^3

Finally

    GSIN(x) = sgn(x)·(-1)^n·sin(f)

Error Conditions
If the absolute value of the argument is greater than 1686629713, the following message is issued and the result is set to 0.0.

    GSIN: ABS(arg) too large; result = zero

GCOS

Description
The GCOS routine calculates the double-precision, G-floating-point cosine of the double-precision, G-floating-point angle given in radians as the argument. That is:

    GCOS(x) = cos(x)

Routines Called
GCOS calls the MTHERR routine.

Type of Argument
The argument must be a double-precision, G-floating-point value less than 1686629713 (or 2^29·π).

Type of Result
The result returned is a double-precision, G-floating-point value in the range -1.0 to 1.0.

Accuracy of Result
    test interval: -10.000 through 201.06
    MRE: 3.44×10^-18 (58.0 bits)
    RMS: 9.84×10^-19 (59.8 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%   14%   72%   15%    0%

Algorithm Used
GCOS(x) is calculated as follows. Note that GCOS(x) = GCOS(-x).

Let |x| = π·n + f, where |f| < π/2.

The argument reduction is as follows.

    f = ((|x| - n·c1) - n·c2) - n·c3

    c1 = high-order 30 bits of π
    c2 = next 28 bits of π
    c3 = next 62 bits of π

If |f| < 2^-30

    sin(f) = f

Otherwise

    sin(f) = f + f·R(g)
    g = f^2
    R(g) = (g·XNUM/XDEN + rp1)·g
    XNUM = ((rp5·g+rp4)·g+rp3)·g+rp2
    XDEN = ((g+q2)·g+q1)·g+q0

    rp1 = -.166666666666666667
    rp2 = .451456904704461990×10^5
    rp3 = -.489487151969463797×10^3
    rp4 = .428183075897778265×10^1
    rp5 = -.121560740596710190×10^-1
    q0 = .541748285645351853×10^7
    q1 = .702492288221842518×10^5
    q2 = .394924723520450141×10^3

Finally

    GCOS(x) = (-1)^(n+1)·sin(f)

Error Conditions
If the absolute value of the argument is greater than or equal to 1686629713, the following message is issued and the result is set to 0.0.

    GCOS: ABS(arg) too large; result = zero

CSIN

Description
The CSIN routine calculates the complex, single-precision, floating-point sine of the complex, single-precision, floating-point angle given in radians as the argument. That is:

    CSIN(z) = sin(z)

Routines Called
CSIN calls the SIN, COS, EXP, ALOG, and MTHERR routines.

Type of Argument
The argument must be a complex, single-precision, floating-point value, the real part of which must be less than 210828714 (or 2^26·π).

Type of Result
The result returned is a complex, single-precision, floating-point value; it may be any such value.

Accuracy of Result
    test interval: -200.00 through 200.00 real
                   -10.000 through 10.000 imaginary
    MRE: 3.30×10^-8 (24.9 bits) real
         3.44×10^-8 (24.8 bits) imaginary
    RMS: 7.68×10^-9 (27.0 bits) real
         6.75×10^-9 (27.1 bits) imaginary
    LSB error distribution:
        -2    -1     0    +1    +2
        2%   23%   51%   22%    2%   real
        1%   19%   57%   22%    1%   imaginary

Algorithm Used
CSIN(z) is calculated as follows.

Let z = x+i·y.

If |x| > 210828714

    CSIN(z) = (0.0,0.0)

If |y| > 88.029692, calculation proceeds as follows.

For the real part of the result:

    Let t = |sin(x)|
    If t = 0.0, x = 0.0
    If loge(t)+|y| > 88.722839, x = ± machine infinity
    (88.722839 = 88.029692+loge(2))

For the imaginary part of the result:

    Let t = |cos(x)| ≠ 0
    If loge(t)+|y| > 88.722839, y = ± machine infinity

Otherwise

    CSIN(z) = sin(x)·cosh(y) + i·cos(x)·sinh(y)

Error Conditions

1. If the absolute value of the real part of the argument is greater than 210828714, the following message is issued and the result is set to (0.0,0.0).

    CSIN: ABS(REAL(arg)) too large; result = zero

2. If |y|+loge(|sin(x)|) > 88.722839, the real part overflows. If |y|+loge(|cos(x)|) > 88.722839, the imaginary part overflows. If either part overflows, one of the following messages is issued and the relevant part of the result is set to ± machine infinity.

    CSIN: Imaginary part overflow
    CSIN: Real part overflow

3. If the imaginary part of the result is too small a number, the following message is issued and the imaginary part of the result is set to 0.0.

    CSIN: Imaginary part underflow

CCOS

Description
The CCOS routine calculates the complex, single-precision, floating-point cosine of the complex, single-precision, floating-point angle given in radians as the argument. That is:

    CCOS(z) = cos(z)

Routines Called
CCOS calls the SIN, COS, EXP, ALOG, and MTHERR routines.

Type of Argument
The argument must be a complex, single-precision, floating-point value, the real part of which must be less than 210828714 (or 2^26·π).

Type of Result
The result returned is a complex, single-precision, floating-point value; it may be any such value.
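
The two constants used in the CSIN overflow test above differ by loge(2), as the text notes. A quick Python check reproduces both of them. That 88.029692 equals 127·loge(2), the natural logarithm of 2^127, is an observation from the arithmetic and not a statement made in this manual.

```python
import math

LN2 = math.log(2.0)

y_limit = 127 * LN2          # loge(2**127)
log_limit = y_limit + LN2    # loge(2**128)

print(f"{y_limit:.6f}  {log_limit:.6f}")   # prints 88.029692  88.722839
```

The extra loge(2) appears because cosh(y) and |sinh(y)| both approach e^|y|/2 for large |y|, so a product such as |sin(x)|·cosh(y) exceeds a bound M exactly when loge(|sin(x)|)+|y| exceeds loge(M)+loge(2).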

Accuracy of Result
    test interval: -200.00 through 200.00 real
                   -10.000 through 10.000 imaginary
    MRE: 3.35×10^-8 (24.8 bits) real
         3.57×10^-8 (24.7 bits) imaginary
    RMS: 7.76×10^-9 (26.9 bits) real
         6.68×10^-9 (27.2 bits) imaginary
    LSB error distribution:
        -2    -1     0    +1    +2
        2%   20%   50%   25%    3%   real
        1%   20%   57%   20%    1%   imaginary

Algorithm Used
CCOS(z) is calculated as follows.

Let z = x+i·y.

If |x| > 210828714

    CCOS(z) = (0.0,0.0)

If |y| > 88.029692, calculation proceeds as follows.

For the real part of the result:

    Let t = |cos(x)| ≠ 0
    If loge(t)+|y| > 88.722839, x = ± machine infinity
    (88.722839 = 88.029692+loge(2))

For the imaginary part of the result:

    Let t = |sin(x)|
    If t = 0.0, y = 0.0
    If loge(t)+|y| > 88.722839, y = ± machine infinity

Otherwise

    CCOS(z) = cos(x)·cosh(y) - i·sin(x)·sinh(y)

Error Conditions

1. If the absolute value of the real part of the argument is greater than 210828714, the following message is issued and the result is set to (0.0,0.0).

    CCOS: ABS(REAL(arg)) too large; result = zero

2. If |y|+loge(|cos(x)|) > 88.722839, the real part overflows. If |y|+loge(|sin(x)|) > 88.722839, the imaginary part overflows. If either part overflows, one of the following messages is issued and the relevant part of the result is set to ± machine infinity.

    CCOS: Imaginary part overflow
    CCOS: Real part overflow

3. If the imaginary part of the result is too small a number, the following message is issued and the imaginary part of the result is set to 0.0.

    CCOS: Imaginary part underflow

CDSIN

Description
The CDSIN subroutine calculates the complex, double-precision, D-floating-point sine of the complex, double-precision, D-floating-point angle given in radians as the argument. That is:

    CDSIN(z,r) = sin(z)

    z = location of input value
    r = location of result

Routines Called
CDSIN calls the DSIN, DCOS, DEXP, DLOG, and MTHERR routines.
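
The in-range formula given above for CCOS, cos(x)·cosh(y) - i·sin(x)·sinh(y), can be checked against Python's cmath.cos. This is a minimal sketch of the closed form only. It omits the range tests and MTHERR reporting, and the name ccos_closed_form is hypothetical.

```python
import cmath
import math

def ccos_closed_form(x, y):
    """cos(x + i*y) from real-valued sin, cos, sinh and cosh."""
    real = math.cos(x) * math.cosh(y)
    imag = -math.sin(x) * math.sinh(y)
    return complex(real, imag)

# Compare with the standard library for one sample point.
z = complex(0.5, 1.25)
difference = abs(ccos_closed_form(z.real, z.imag) - cmath.cos(z))
```

For arguments well inside the overflow limits, difference is at the level of rounding error.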

Type of Argument
CDSIN is a subroutine that is called with two arguments. Both arguments must be two-element, double-precision vectors. The first vector (z) contains the input value; the second vector (r) will contain the result. The real part of the input value must be stored in the first element of z; the imaginary part must be stored in the second element of z. The input value must be a complex, double-precision, D-floating-point value, the real part of which must be less than 2^31·π - π/2.

Type of Result
The result returned is a complex, double-precision, D-floating-point value; it may be any such value. It is returned in the second vector (r) supplied in the call. The real part of the result is returned in the first element of r; the imaginary part is returned in the second element of r.

Accuracy of Result
    test interval: -200.00 through 200.00 real
                   -10.000 through 10.000 imaginary
    MRE: 1.09×10^-18 (59.7 bits) real
         9.86×10^-19 (59.8 bits) imaginary
    RMS: 2.22×10^-19 (62.0 bits) real
         2.08×10^-19 (62.1 bits) imaginary
    LSB error distribution:
        -2    -1     0    +1    +2
        2%   22%   51%   23%    2%   real
        2%   26%   54%   17%    1%   imaginary

Algorithm Used
CDSIN(z) is calculated as follows.

Let z = x+i·y.

If |x| > 2^31·π - π/2

    CDSIN(z) = (0.0,0.0)

If |y| > 88.029692, calculation proceeds as follows.

For the real part of the result:

    Let t = |sin(x)|
    If t = 0.0, x = 0.0
    If loge(t)+|y| > 88.722839, x = ± infinity
    (88.722839 = 88.029692+loge(2))

For the imaginary part of the result:

    Let t = |cos(x)| ≠ 0
    If loge(t)+|y| > 88.722839, y = ± infinity

Otherwise

    CDSIN(z) = sin(x)·cosh(y) + i·cos(x)·sinh(y)

Error Conditions

1. If the absolute value of the real part of the argument is greater than 2^31·π - π/2, the following message is issued and the result is set to (0.0,0.0).

    CDSIN: ABS(REAL(arg)) too large; result = zero

2. If |y|+loge(|sin(x)|) > 88.722839, the real part overflows. If |y|+loge(|cos(x)|) > 88.722839, the imaginary part overflows.
If either part overflows, one of the following messages is issued and the relevant part of the result is set to ± machine infinity.

    CDSIN: ABS(IMAG(arg)) too large; REAL(result) = infinity
    CDSIN: ABS(IMAG(arg)) too large; IMAG(result) = Infinity

3. If the imaginary part of the result is too small a number, the following message is issued and the imaginary part of the result is set to 0.0.

    CDSIN: Imaginary part underflow

CDCOS

Description
The CDCOS subroutine calculates the complex, double-precision, D-floating-point cosine of the complex, double-precision, D-floating-point angle given in radians as the argument. That is:

    CDCOS(z,r) = cos(z)

    z = location of input value
    r = location of result

Routines Called
CDCOS calls the DSIN, DCOS, DEXP, DLOG, and MTHERR routines.

Type of Argument
CDCOS is a subroutine that is called with two arguments. Both arguments must be two-element, double-precision vectors. The first vector (z) contains the input value; the second vector (r) will contain the result. The real part of the input value must be stored in the first element of z; the imaginary part must be stored in the second element of z. The input value must be a complex, double-precision, D-floating-point value, the real part of which must be less than 2^31·π - π/2.

Type of Result
The result returned is a complex, double-precision, D-floating-point value; it may be any such value. It is returned in the second vector (r) supplied in the call. The real part of the result is returned in the first element of r; the imaginary part is returned in the second element of r.
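
The two-vector calling convention described above for CDSIN and CDCOS can be modelled in Python with two-element lists. This is a hypothetical stand-in that shows only how the input and result are laid out: the function name cdcos is illustrative, cmath.cos replaces the library's own algorithm, and no range checking or error reporting is attempted.

```python
import cmath

def cdcos(z, r):
    """Model of the CDCOS(z,r) convention.
    z[0], z[1]: real and imaginary parts of the input value.
    r[0], r[1]: receive the real and imaginary parts of the result."""
    w = cmath.cos(complex(z[0], z[1]))
    r[0] = w.real
    r[1] = w.imag

z = [0.5, 1.25]      # input value 0.5 + 1.25i
r = [0.0, 0.0]       # result vector, overwritten by the call
cdcos(z, r)
```

After the call, r[0] holds cos(0.5)·cosh(1.25) and r[1] holds -sin(0.5)·sinh(1.25); z is left unchanged.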

Accuracy of Result
    test interval: -200.00 through 200.00 real
                   -10.000 through 10.000 imaginary
    MRE: 9.89×10^-19 (59.8 bits) real
         9.98×10^-19 (59.8 bits) imaginary
    RMS: 2.25×10^-19 (61.9 bits) real
         2.03×10^-19 (62.1 bits) imaginary
    LSB error distribution:
        -2    -1     0    +1    +2
        3%   24%   50%   21%    2%   real
        1%   21%   55%   21%    1%   imaginary

Algorithm Used
CDCOS(z) is calculated as follows.

Let z = x+i·y.

If |x| > 2^31·π - π/2

    CDCOS(z) = (0.0,0.0)

If |y| > 88.029692, calculation proceeds as follows.

For the real part of the result:

    Let t = |cos(x)| ≠ 0
    If loge(t)+|y| > 88.722839, x = ± infinity
    (88.722839 = 88.029692+loge(2))

For the imaginary part of the result:

    Let t = |sin(x)|
    If t = 0.0, y = 0.0
    If loge(t)+|y| > 88.722839, y = ± infinity

Otherwise

    CDCOS(z) = cos(x)·cosh(y) - i·sin(x)·sinh(y)

Error Conditions

1. If the absolute value of the real part of the argument is greater than 2^31·π - π/2, the following message is issued and the result is set to (0.0,0.0).

    CDCOS: ABS(REAL(arg)) too large; result = zero

2. If |y|+loge(|cos(x)|) > 88.722839, the real part overflows. If |y|+loge(|sin(x)|) > 88.722839, the imaginary part overflows. If either part overflows, one of the following messages is issued and the relevant part of the result is set to ± machine infinity.

    CDCOS: ABS(IMAG(arg)) too large; REAL(result) = infinity
    CDCOS: ABS(IMAG(arg)) too large; IMAG(result) = infinity

3. If the imaginary part of the result is too small a number, the following message is issued and the imaginary part of the result is set to 0.0.

    CDCOS: Imaginary part underflow

CGSIN

Description
The CGSIN subroutine calculates the complex, double-precision, G-floating-point sine of the complex, double-precision, G-floating-point angle given in radians as the argument. That is:

    CGSIN(z,r) = sin(z)

    z = location of input value
    r = location of result

Routines Called
CGSIN calls the GSIN, GCOS, GEXP, GLOG, and MTHERR routines.

Type of Argument
CGSIN is a subroutine that is called with two arguments. Both arguments must be two-element, double-precision vectors. The first vector (z) contains the input value; the second vector (r) will contain the result. The real part of the input value must be stored in the first element of z; the imaginary part must be stored in the second element of z. The input value must be a complex, double-precision, G-floating-point value, the real part of which must be less than 2^29·π - π/2.

Type of Result
The result returned is a complex, double-precision, G-floating-point value; it may be any such value. It is returned in the second vector (r) supplied in the call. The real part of the result is returned in the first element of r; the imaginary part is returned in the second element of r.

Accuracy of Result
    test interval: -200.00 through 200.00 real
                   -10.000 through 10.000 imaginary
    MRE: 7.35×10^-18 (56.9 bits) real
         7.01×10^-18 (57.0 bits) imaginary
    RMS: 1.76×10^-18 (59.0 bits) real
         1.61×10^-18 (59.1 bits) imaginary
    LSB error distribution:
        -2    -1     0    +1    +2
        2%   22%   51%   23%    2%   real
        1%   20%   55%   22%    2%   imaginary

Algorithm Used
CGSIN(z) is calculated as follows.

Let z = x+i·y.

If |x| > 2^29·π - π/2

    CGSIN(z) = (0.0,0.0)

If |y| > 709.089565712824, calculation proceeds as follows.

For the real part of the result:

    Let t = |sin(x)|
    If t = 0.0, x = 0.0
    If loge(t)+|y| > 709.782712893384, x = ± machine infinity
    (709.782712893384 = 709.089565712824+loge(2))

For the imaginary part of the result:

    Let t = |cos(x)| ≠ 0.0
    If loge(t)+|y| > 709.782712893384, y = ± machine infinity

Otherwise

    CGSIN(z) = sin(x)·cosh(y) + i·cos(x)·sinh(y)

Error Conditions

1. If the absolute value of the real part of the argument is greater than 2^29·π - π/2, the following message is issued and the result is set to (0.0,0.0).

    CGSIN: ABS(REAL(arg)) too large; result = zero

2. If |y|+loge(|sin(x)|) > 709.782712893384, the real part of the result will overflow.
If |y|+loge(|cos(x)|) > 709.782712893384, the imaginary part of the result will overflow. Any overflowed result is set to ± machine infinity and one of the following messages is issued.

    CGSIN: ABS(IMAG(arg)) too large; REAL(result) = infinity
    CGSIN: ABS(IMAG(arg)) too large; IMAG(result) = infinity

3. If the imaginary part of the result underflows, the following message is issued and the imaginary part of the result is set to 0.0.

    CGSIN: Imaginary part underflow

CGCOS

Description
The CGCOS subroutine calculates the complex, double-precision, G-floating-point cosine of the complex, double-precision, G-floating-point angle given in radians as the argument. That is:

    CGCOS(z,r) = cos(z)

    z = location of input value
    r = location of result

Routines Called
CGCOS calls the GSIN, GCOS, GEXP, GLOG, and MTHERR routines.

Type of Argument
CGCOS is a subroutine that is called with two arguments. Both arguments must be two-element, double-precision vectors. The first vector (z) contains the input value; the second vector (r) will contain the result. The real part of the input value must be stored in the first element of z; the imaginary part must be stored in the second element of z. The input value must be a complex, double-precision, G-floating-point value, the real part of which must be less than 2^29·π - π/2.

Type of Result
The result returned is a complex, double-precision, G-floating-point value; it may be any such value. It is returned in the second vector (r) supplied in the call. The real part of the result is returned in the first element of r; the imaginary part is returned in the second element of r.
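
The overflow test described above for CGSIN can be sketched in Python, whose floats happen to overflow near the same limit (e^709.78). This is a sketch under stated assumptions, not the library's code: the text does not say how a large-|y| result that passes the test is formed, so the sketch assumes it is e^(loge(t)+|y|-loge(2)); error messages and the imaginary-part underflow case are omitted; and the names _scaled and cgsin_sketch are hypothetical.

```python
import math

Y_LIMIT = 709.089565712824      # threshold on |y| from the CGSIN description
LOG_LIMIT = 709.782712893384    # Y_LIMIT + loge(2)

def _scaled(t, abs_y, sign):
    """Return t * e**abs_y / 2 with the sign of `sign`,
    or a signed infinity when the log-domain test predicts overflow."""
    if t == 0.0:
        return 0.0
    e = math.log(t) + abs_y
    if e > LOG_LIMIT:
        return math.copysign(math.inf, sign)
    # Assumed completion: cosh(y) and |sinh(y)| are both ~ e**|y| / 2 here.
    return math.copysign(math.exp(e - math.log(2.0)), sign)

def cgsin_sketch(x, y):
    """Return (real, imag) of sin(x + i*y)."""
    s, c = math.sin(x), math.cos(x)
    if abs(y) <= Y_LIMIT:
        return s * math.cosh(y), c * math.sinh(y)
    real = _scaled(abs(s), abs(y), s)        # sign of sin(x); cosh is even
    imag = _scaled(abs(c), abs(y), c * y)    # sign of cos(x) * sign of y
    return real, imag
```

Working with loge(t)+|y| instead of the product itself means the overflow decision is made before any exponential is evaluated, so one part can be reported as infinite while the other is still returned as a finite number.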

Accuracy of Result
    test interval: -200.00 through 200.00 real
                   -10.000 through 10.000 imaginary
    MRE: 8.31×10^-18 (56.7 bits) real
         7.00×10^-18 (57.0 bits) imaginary
    RMS: 1.83×10^-18 (58.9 bits) real
         1.53×10^-18 (59.2 bits) imaginary
    LSB error distribution:
        -2    -1     0    +1    +2
        2%   20%   50%   25%    3%   real
        2%   20%   58%   20%    1%   imaginary

Algorithm Used
CGCOS(z) is calculated as follows.

Let z = x+i·y.

If |x| > 2^29·π - π/2

    CGCOS(z) = (0.0,0.0)

If |y| > 709.089565712824, calculation proceeds as follows.

For the real part of the result:

    Let t = |cos(x)| ≠ 0.0
    If loge(t)+|y| > 709.782712893384, x = ± machine infinity
    (709.782712893384 = 709.089565712824+loge(2))

For the imaginary part of the result:

    Let t = |sin(x)|
    If t = 0.0, y = 0.0
    If loge(t)+|y| > 709.782712893384, y = ± machine infinity

Otherwise

    CGCOS(z) = cos(x)·cosh(y) - i·sin(x)·sinh(y)

Error Conditions

1. If the absolute value of the real part of the argument is greater than 2^29·π - π/2, the following message is issued and the result is set to (0.0,0.0).

    CGCOS: ABS(REAL(arg)) too large; result = zero

2. If |y|+loge(|cos(x)|) > 709.782712893384, the real part of the result will overflow. If |y|+loge(|sin(x)|) > 709.782712893384, the imaginary part of the result will overflow. Any overflowed result is set to ± machine infinity and one of the following messages is issued.

    CGCOS: ABS(IMAG(arg)) too large; REAL(result) = infinity
    CGCOS: ABS(IMAG(arg)) too large; IMAG(result) = Infinity

3. If the imaginary part of the result underflows, the following message is issued and the imaginary part is set to 0.0.

    CGCOS: Imaginary part underflow

TAN

Description
The TAN routine calculates the single-precision, floating-point tangent of the single-precision, floating-point angle given in radians as the argument. That is:

    TAN(x) = tan(x)

Routines Called
TAN calls the MTHERR routine.

Type of Argument
The argument must be a single-precision, floating-point value less than or equal to 2^26·π/2.

Type of Result
The result returned is a single-precision, floating-point value; it may be any such value.

Accuracy of Result
    test interval: -10.000 through 201.06
    MRE: 2.35×10^-8 (25.3 bits)
    RMS: 5.28×10^-9 (27.5 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%   13%   70%   16%    0%

Algorithm Used
TAN(x) is calculated as follows.

If |x| > 2^26·π/2

    TAN(x) = 0.0

Otherwise, the identities:

    tan(π/2.0 - g) = 1.0/tan(g)
    tan(n·π + h) = tan(h), where -π/2.0 < h ≤ π/2.0
    tan(-x) = -tan(x)

are used to reduce TAN(x) to a problem with -π/2.0 < x ≤ π/2.0.

Then n and f are defined so that:

    x = n·π/4.0 + f, where 0.0 ≤ f ≤ π/4.0

If f < 2^-14

    tan(f) = f

Otherwise

    tan(f) = f·R(f^2)
    R(f^2) = (p0 + f^2·(p1 + f^2·p2))/(q0 + f^2·(q1 + f^2))

    p0 = 62.604
    p1 = -6.9716
    p2 = 6.7309×10^-2
    q0 = p0
    q1 = -27.839

Then TAN(x) can be derived if L is an integer and n has the values shown in the following table.

Deriving TAN(x)

    Value of n    Low-order two bits of n    TAN(x)
    4L            00                         sgn(x)·tan(f)
    4L+1          01                         sgn(x)·(1/tan(f))
    4L+2          10                         sgn(x)·(-1/tan(f))
    4L+3          11                         sgn(x)·(-tan(f))

Reference
Coefficients are derived from those given in Cody and Waite, Software Manual for the Elementary Functions (Englewood Cliffs, N.J.: Prentice Hall, 1980) for machines with 25-32 bit precision.

Error Conditions
If the absolute value of the argument is greater than 2^26·π/2, the following message is issued and the result is set to 0.0.

    TAN: ABS(arg) too large; result = zero

COTAN

Description
The COTAN routine calculates the single-precision, floating-point cotangent of the single-precision, floating-point angle given in radians as the argument. That is:

    COTAN(x) = cot(x)

Routines Called
COTAN calls the MTHERR routine.
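
The TAN reduction and the quadrant table above can be sketched in Python as follows. Several points are assumptions rather than statements from the manual: p2 is taken as 6.7309×10^-2, the value for which f·R(f^2) reproduces tan(f); for odd n, f is measured back from the next multiple of π/4, which is the reading under which the table agrees with the listed identities; and the names tan_reduced and tan_sketch are hypothetical. With five-digit coefficients the result is good to roughly four or five significant digits.

```python
import math

P0, P1, P2 = 62.604, -6.9716, 6.7309e-2   # P2 exponent is an assumption
Q0, Q1 = P0, -27.839

def tan_reduced(f):
    """tan(f) for 0 <= f <= pi/4 via f * R(f*f)."""
    if f < 2.0 ** -14:
        return f
    g = f * f
    return f * (P0 + g * (P1 + g * P2)) / (Q0 + g * (Q1 + g))

def tan_sketch(x):
    """tan(x) using octant reduction and the low-order two bits of n.
    No guard is included for arguments that land exactly on a pole."""
    ax = abs(x)
    quarter = math.pi / 4.0
    n = int(ax // quarter)
    rem = ax - n * quarter
    # Assumption: for odd n, f is the distance to the next multiple of pi/4.
    f = rem if n % 2 == 0 else quarter - rem
    t = tan_reduced(f)
    sign = -1.0 if x < 0 else 1.0
    k = n % 4                      # low-order two bits of n
    if k == 0:
        return sign * t
    if k == 1:
        return sign * (1.0 / t)
    if k == 2:
        return sign * (-1.0 / t)
    return sign * (-t)
```

A plain subtraction of n·π/4 is used here for brevity; it is adequate for small arguments but loses accuracy as |x| grows.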

Type of Argument
The argument must be a single-precision, floating-point value less than or equal to 2^26·π/2 and greater than 2^-126·(1/2+2^-27).

Type of Result
The result returned is a single-precision, floating-point value; it may be any such value.

Accuracy of Result
    test interval: -10.000 through 201.06
    MRE: 2.42×10^-8 (25.3 bits)
    RMS: 5.29×10^-9 (27.5 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%   18%   66%   16%    0%

Algorithm Used
COTAN(x) is calculated as follows.

If |x| > 2^26·π/2

    COTAN(x) = 0.0

If |x| < 2^-126·(1/2+2^-27)

    COTAN(x) = +machine infinity

Otherwise, the identities:

    tan(π/2.0 - g) = 1.0/tan(g)
    tan(n·π + h) = tan(h), where -π/2.0 < h ≤ π/2.0
    tan(-x) = -tan(x)
    cot(x) = 1.0/tan(x)
    cot(-x) = -cot(x)

are used to reduce COTAN(x) to a problem with -π/2.0 < x ≤ π/2.0.

Then n and f are defined so that:

    x = n·π/4.0 + f, where 0.0 ≤ f ≤ π/4.0

If f < 2^-14

    tan(f) = f

Otherwise

    tan(f) = f·R(f^2)
    R(f^2) = (p0 + f^2·(p1 + f^2·p2))/(q0 + f^2·(q1 + f^2))

    p0 = 62.604
    p1 = -6.9716
    p2 = 6.7309×10^-2
    q0 = p0
    q1 = -27.839

Then COTAN(x) can be derived if L is an integer and n has the value shown in the following table.

Deriving COTAN(x)

    Value of n    Low-order two bits of n    COTAN(x)
    4L            00                         sgn(x)·(1/tan(f))
    4L+1          01                         sgn(x)·tan(f)
    4L+2          10                         sgn(x)·(-tan(f))
    4L+3          11                         sgn(x)·(-(1/tan(f)))

Reference
Coefficients are derived from those given in Cody and Waite, Software Manual for the Elementary Functions (Englewood Cliffs, N.J.: Prentice Hall, 1980) for machines with 25-32 bit precision.

Error Conditions

1. If the absolute value of the argument is less than 2^-126·(1/2+2^-27), the following message is issued and the result is set to +machine infinity.

    COTAN: result overflow

2. If the absolute value of the argument is greater than 2^26·π/2, the following message is issued and the result is set to 0.0.

    COTAN: ABS(arg) too large; result = zero

DTAN

Description
The DTAN routine calculates the double-precision, D-floating-point tangent of the double-precision, D-floating-point angle given in radians as the argument. That is:

    DTAN(x) = tan(x)

Routines Called
DTAN calls the MTHERR routine.

Type of Argument
The argument must be a double-precision, D-floating-point value less than or equal to 2^31·π/2.

Type of Result
The result returned is a double-precision, D-floating-point value; it may be any such value.

Accuracy of Result
    test interval: -10.000 through 201.06
    MRE: 9.60×10^-19 (59.9 bits)
    RMS: 2.08×10^-19 (62.1 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        1%   18%   55%   22%    3%

Algorithm Used
DTAN(x) is calculated as follows.

If |x| > 2^31·π/2

    DTAN(x) = 0.0

Otherwise, the identities:

    tan(π/2.0 - g) = 1.0/tan(g)
    tan(n·π + h) = tan(h), where -π/2.0 < h ≤ π/2.0
    tan(-x) = -tan(x)

are used to reduce DTAN(x) to a problem with -π/2.0 < x ≤ π/2.0.

Then n and f are defined so that:

    x = n·π/2.0 + f, where -π/4.0 ≤ f ≤ π/4.0

If f < 2^-31

    tan(f) = f

Otherwise

    tan(f) = R(f)
    R(f) = (((((xp4·g+xp3)·g+xp2)·g+xp1)·g)·f+f) / ((((q4·g+q3)·g+q2)·g+q1)·g+1.0)
    g = f·f

    xp1 = -.1372889460941120802
    xp2 = .3925934686364577602×10^-2
    xp3 = -.2882482747560198194×10^-4
    xp4 = .2927308283322907641×10^-7
    q1 = -.4706222794274454135
    q2 = .2746669449551304872×10^-1
    q3 = -.4030063705745304384×10^-3
    q4 = .1312960309685759549×10^-5

If n is even

    DTAN(x) = tan(f)

If n is odd

    DTAN(x) = -1/tan(f)

Reference
Coefficients are derived from those given in Cody and Waite, Software Manual for the Elementary Functions (Englewood Cliffs, N.J.: Prentice Hall, 1980) for machines with 25-32 bit precision.

Error Conditions
If the absolute value of the argument is greater than 2^31·π/2, the following message is issued and the result is set to 0.0.

    DTAN: ABS(arg) too large; result = zero

DCOTAN

Description
The DCOTAN routine calculates the double-precision, D-floating-point cotangent of the double-precision, D-floating-point angle given in radians as the argument. That is:

    DCOTAN(x) = cot(x)

Routines Called
DCOTAN calls the MTHERR routine.

Type of Argument
The argument must be a double-precision, D-floating-point value less than or equal to 2^31·π/2 and greater than 2^-127·(1+2^-61).

Type of Result
The result returned is a double-precision, D-floating-point value; it may be any such value.

Accuracy of Result
    test interval: -10.000 through 201.06
    MRE: 9.09×10^-19 (59.9 bits)
    RMS: 2.08×10^-19 (62.1 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        2%   23%   55%   19%    1%

Algorithm Used
DCOTAN(x) is calculated as follows.

If |x| > 2^31·π/2

    DCOTAN(x) = 0.0

If |x| < 2^-127·(1+2^-61)

    DCOTAN(x) = +machine infinity

Otherwise, the identities:

    tan(π/2.0 - g) = 1.0/tan(g)
    tan(n·π + h) = tan(h), where -π/2.0 < h ≤ π/2.0
    tan(-x) = -tan(x)
    cot(x) = 1.0/tan(x)
    cot(-x) = -cot(x)

are used to reduce DCOTAN(x) to a problem with -π/2.0 < x ≤ π/2.0.

Then n and f are defined so that:

    x = n·π/2.0 + f, where -π/4.0 ≤ f ≤ π/4.0

If f < 2^-31

    tan(f) = f

Otherwise

    tan(f) = R(f)
    R(f) = (((((xp4·g+xp3)·g+xp2)·g+xp1)·g)·f+f) / ((((q4·g+q3)·g+q2)·g+q1)·g+1.0)
    g = f·f

    xp1 = -.1372889460941120802
    xp2 = .3925934686364577602×10^-2
    xp3 = -.2882482747560198194×10^-4
    xp4 = .2927308283322907641×10^-7
    q1 = -.4706222794274454135
    q2 = .2746669449551304872×10^-1
    q3 = -.4030063705745304384×10^-3
    q4 = .1312960309685759549×10^-5

If n is even

    DCOTAN(x) = 1/tan(f)

If n is odd

    DCOTAN(x) = -tan(f)

Reference
Coefficients are derived from those given in Cody and Waite, Software Manual for the Elementary Functions (Englewood Cliffs, N.J.: Prentice Hall, 1980) for machines with 25-32 bit precision.

Error Conditions

1.
If the absolute value of the argument is greater than 2^31·π/2, the following message is issued and the result is set to 0.0.

    DCOTAN: ABS(arg) too large; result = zero

2. If the absolute value of the argument is less than 2^-127·(1+2^-61), the following message is issued and the result is set to +machine infinity.

    DCOTAN: Result overflow


GTAN

Description

The GTAN routine calculates the double-precision, G-floating-point tangent of the double-precision, G-floating-point angle given in radians as the argument. That is:

    GTAN(x) = tan(x)

Routines Called

GTAN calls the MTHERR routine.

Type of Argument

The argument must be a double-precision, G-floating-point value whose absolute value is less than or equal to 2^29·π/2.

Type of Result

The result returned is a double-precision, G-floating-point value; it may be any such value.

Accuracy of Result

    test interval:  -10.000 through 201.06
    MRE:            5.95×10^-18 (57.2 bits)
    RMS:            1.43×10^-18 (59.3 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        1%   20%   60%   18%    0%

Algorithm Used

GTAN(x) is calculated as follows.
If |x| > 2^29·π/2

    GTAN(x) = 0.0

Otherwise, the identities:

    tan(π/2.0 - g) = 1.0/tan(g)
    tan(n·π + h) = tan(h)    where -π/2.0 < h ≤ π/2.0
    tan(-x) = -tan(x)

are used to reduce GTAN(x) to a problem with

    -π/2.0 < x ≤ π/2.0

Then n and f are defined so that:

    x = n·π/2.0 + f    where -π/4.0 ≤ f ≤ π/4.0

If f < 2^-30

    tan(f) = f

Otherwise

    tan(f) = R(f)

    R(f) = (((((xp4·g + xp3)·g + xp2)·g + xp1)·g)·f + f) /
           ((((q4·g + q3)·g + q2)·g + q1)·g + 1.0)

    g = f·f

    xp1 = -.1372889460941120802
    xp2 =  .3925934686364577602×10^-2
    xp3 = -.2882482747560198194×10^-4
    xp4 =  .2927308283322907641×10^-7
    q1  = -.4706222794274454135
    q2  =  .2746669449551304872×10^-1
    q3  = -.4030063705745304384×10^-3
    q4  =  .1312960309685759549×10^-5

If n is even

    GTAN(x) = tan(f)

If n is odd

    GTAN(x) = -1/tan(f)

Reference

Coefficients are derived from those given in Cody and Waite, Software Manual for the Elementary Functions, (Englewood Cliffs, N.J.: Prentice Hall, 1980) for machines with 25-32 bit precision.

Error Conditions

If the absolute value of the argument is greater than 2^29·π/2, the following message is issued and the result is set to 0.0.

    GTAN: ABS(arg) too large; result = zero


GCOTAN

Description

The GCOTAN routine calculates the double-precision, G-floating-point cotangent of the double-precision, G-floating-point angle given in radians as the argument. That is:

    GCOTAN(x) = cot(x)

Routines Called

GCOTAN calls the MTHERR routine.

Type of Argument

The argument must be a double-precision, G-floating-point value whose absolute value is less than or equal to 2^29·π/2 and greater than 2^-1023·(1+2^-58).

Type of Result

The result returned is a double-precision, G-floating-point value; it may be any such value.

Accuracy of Result

    test interval:  -10.000 through 201.06
    MRE:            6.46×10^-18 (57.1 bits)
    RMS:            1.43×10^-18 (59.3 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        1%   18%   60%   20%    1%

Algorithm Used

GCOTAN(x) is calculated as follows.
If |x| > 2^29·π/2

    GCOTAN(x) = 0.0

If |x| < 2^-1023·(1+2^-58)

    GCOTAN(x) = +machine infinity

Otherwise, the identities:

    tan(π/2.0 - g) = 1.0/tan(g)
    tan(n·π + h) = tan(h)    where -π/2.0 < h ≤ π/2.0
    tan(-x) = -tan(x)
    cot(x) = 1.0/tan(x)
    cot(-x) = -cot(x)

are used to reduce GCOTAN(x) to a problem with

    -π/2.0 < x ≤ π/2.0

Then n and f are defined so that:

    x = n·π/2.0 + f    where -π/4.0 ≤ f ≤ π/4.0

If f < 2^-30

    tan(f) = f

Otherwise

    tan(f) = R(f)

    R(f) = (((((xp4·g + xp3)·g + xp2)·g + xp1)·g)·f + f) /
           ((((q4·g + q3)·g + q2)·g + q1)·g + 1.0)

    g = f·f

    xp1 = -.1372889460941120802
    xp2 =  .3925934686364577602×10^-2
    xp3 = -.2882482747560198194×10^-4
    xp4 =  .2927308283322907641×10^-7
    q1  = -.4706222794274454135
    q2  =  .2746669449551304872×10^-1
    q3  = -.4030063705745304384×10^-3
    q4  =  .1312960309685759549×10^-5

If n is even

    GCOTAN(x) = 1/tan(f)

If n is odd

    GCOTAN(x) = -tan(f)

Reference

Coefficients are derived from those given in Cody and Waite, Software Manual for the Elementary Functions, (Englewood Cliffs, N.J.: Prentice Hall, 1980) for machines with 25-32 bit precision.

Error Conditions

1. If the absolute value of the argument is greater than 2^29·π/2, the following message is issued and the result is set to 0.0.

    GCOTAN: ABS(arg) too large; result = zero

2. If the absolute value of the argument is less than 2^-1023·(1+2^-58), the following message is issued and the result is set to +machine infinity.

    GCOTAN: Result overflow


Chapter 6
Inverse Trigonometric Routines


ASIN

Description

The ASIN routine calculates, in radians, the single-precision, floating-point arc sine of its single-precision, floating-point argument. That is:

    ASIN(x) = sin^-1(x)

Routines Called

ASIN calls the SQRT and MTHERR routines.

Type of Argument

The argument must be a single-precision, floating-point value in the range -1.0 to 1.0.
Type of Result

The result returned is a single-precision, floating-point value in the range -π/2 to π/2.

Accuracy of Result

    test interval:  0.00000 through 1.0000
    MRE:            2.56×10^-8 (25.2 bits)
    RMS:            5.34×10^-9 (27.5 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%   10%   83%    7%    0%

Algorithm Used

ASIN(x) is calculated as follows.

Let

    R(z) = z·(p0 + z·(p1 + z·p2))/(q0 + z·(q1 + z))

    p0 =  .564915737
    p1 = -.409490163
    p2 = 1.93496723×10^-2
    q0 =  3.38949412
    q1 = -3.98220081

Let

    s = y + y·R(z)

Then, the following table gives the value of ASIN(x) depending on the values of x, z, and y.

    range of x      z          y        ASIN(x)
    -1.0 to -.5     (1+x)/2    -2√z     -(π/2+s)
    -.5 to 0.0      x^2        -x       -s
    0.0 to .5       x^2        x        s
    .5 to 1.0       (1-x)/2    -2√z     π/2+s

Error Conditions

If the absolute value of the argument is greater than 1.0, the following message is issued and the result is set to +machine infinity.

    ASIN: ABS(arg) greater than 1.0; result = +infinity


ACOS

Description

The ACOS routine calculates, in radians, the single-precision, floating-point arc cosine of its single-precision, floating-point argument. That is:

    ACOS(x) = cos^-1(x)

Routines Called

ACOS calls the SQRT and MTHERR routines.

Type of Argument

The argument must be a single-precision, floating-point value in the range -1.0 to 1.0.

Type of Result

The result returned is a single-precision, floating-point value in the range 0.0 to π.

Accuracy of Result

    test interval:  0.00000 through 1.0000
    MRE:            1.55×10^-8 (25.9 bits)
    RMS:            3.76×10^-9 (28.0 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%    8%   83%    9%    0%

Algorithm Used

ACOS(x) is calculated as follows.

Let

    R(z) = z·(p0 + z·(p1 + z·p2))/(q0 + z·(q1 + z))

    p0 =  .564915737
    p1 = -.409490163
    p2 = 1.93496723×10^-2
    q0 =  3.38949412
    q1 = -3.98220081

Let

    s = y + y·R(z)

Then, the following table gives the values of ACOS(x) depending on the values of x, z, and y.
    range of x      z          y        ACOS(x)
    -1.0 to -.5     (1+x)/2    -2√z     π+s
    -.5 to 0.0      x^2        -x       π/2+s
    0.0 to .5       x^2        x        π/2-s
    .5 to 1.0       (1-x)/2    -2√z     -s

Error Conditions

If the absolute value of the argument is greater than 1.0, the following message is issued and the result is set to +machine infinity.

    ACOS: ABS(arg) greater than 1.0; result = +infinity


DASIN

Description

The DASIN routine calculates, in radians, the double-precision, D-floating-point arc sine of its double-precision, D-floating-point argument. That is:

    DASIN(x) = sin^-1(x)

Routines Called

DASIN calls the DSQRT and MTHERR routines.

Type of Argument

The argument must be a double-precision, D-floating-point value in the range -1.0 to 1.0.

Type of Result

The result returned is a double-precision, D-floating-point value in the range -π/2 to π/2.

Accuracy of Result

    test interval:  0.00000 through 1.0000
    MRE:            8.96×10^-19 (60.0 bits)
    RMS:            1.88×10^-19 (62.2 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        1%   25%   69%    5%    0%

Algorithm Used

DASIN(x) is calculated as follows.

Let

    R(g) = (g·(rp1 + g·(rp2 + g·(rp3 + g·(rp4 + g·rp5))))) /
           (q0 + g·(q1 + g·(q2 + g·(q3 + g·(q4 + g)))))

    rp1 = -.27368494524164255994×10^2
    rp2 =  .57208227877891731407×10^2
    rp3 = -.39688862997504877339×10^2
    rp4 =  .10152522233806463645×10^2
    rp5 = -.69674573447350646411
    q0  = -.16421096714498560795×10^3
    q1  =  .41714430248260412556×10^3
    q2  = -.38186303361750149284×10^3
    q3  =  .15095270841030604719×10^3
    q4  = -.23823859153670238830×10^2

Let

    s = y + y·R(g)    where g = z

Then, the following table gives the values of DASIN(x) depending on the values of x, z, and y.

    range of x      z          y        DASIN(x)
    -1.0 to -.5     (1+x)/2    -2√z     -(π/2+s)
    -.5 to 0.0      x^2        -x       -s
    0.0 to .5       x^2        x        s
    .5 to 1.0       (1-x)/2    -2√z     π/2+s

Error Conditions

If the absolute value of the argument is greater than 1.0, the following message is issued and the result is set to +machine infinity.
    DASIN: ABS(arg) greater than 1.0; result = +infinity


DACOS

Description

The DACOS routine calculates, in radians, the double-precision, D-floating-point arc cosine of its double-precision, D-floating-point argument. That is:

    DACOS(x) = cos^-1(x)

Routines Called

DACOS calls the DSQRT and MTHERR routines.

Type of Argument

The argument must be a double-precision, D-floating-point value in the range -1.0 to 1.0.

Type of Result

The result returned is a double-precision, D-floating-point value in the range 0.0 to π.

Accuracy of Result

    test interval:  0.00000 through 1.0000
    MRE:            4.48×10^-19 (61.0 bits)
    RMS:            1.25×10^-19 (62.8 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%   19%   75%    6%    0%

Algorithm Used

DACOS(x) is calculated as follows.

Let

    R(g) = (g·(rp1 + g·(rp2 + g·(rp3 + g·(rp4 + g·rp5))))) /
           (q0 + g·(q1 + g·(q2 + g·(q3 + g·(q4 + g)))))

    rp1 = -.27368494524164255994×10^2
    rp2 =  .57208227877891731407×10^2
    rp3 = -.39688862997504877339×10^2
    rp4 =  .10152522233806463645×10^2
    rp5 = -.69674573447350646411
    q0  = -.16421096714498560795×10^3
    q1  =  .41714430248260412556×10^3
    q2  = -.38186303361750149284×10^3
    q3  =  .15095270841030604719×10^3
    q4  = -.23823859153670238830×10^2

Let

    s = y + y·R(g)    where g = z

Then, the following table gives the values of DACOS(x) depending on the values of x, z, and y.

    range of x      z          y        DACOS(x)
    -1.0 to -.5     (1+x)/2    -2√z     π+s
    -.5 to 0.0      x^2        -x       π/2+s
    0.0 to .5       x^2        x        π/2-s
    .5 to 1.0       (1-x)/2    -2√z     -s

Error Conditions

If the absolute value of the argument is greater than 1.0, the following message is issued and the result is set to +machine infinity.

    DACOS: ABS(arg) greater than 1.0; result = +infinity


GASIN

Description

The GASIN routine calculates, in radians, the double-precision, G-floating-point arc sine of its double-precision, G-floating-point argument. That is:

    GASIN(x) = sin^-1(x)

Routines Called

GASIN calls the GSQRT and MTHERR routines.
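The rational function R(g) and the range tables shared by the double-precision arc sine and arc cosine routines (DASIN, DACOS, GASIN, GACOS) can be tried out with a short Python sketch. This is only an illustration under stated assumptions: it runs in IEEE double precision rather than D- or G-floating-point, the names r, asin_sketch, and acos_sketch are hypothetical, and the MTHERR message is replaced by simply returning infinity.

```python
import math

# rp1..rp5 and q0..q4 as listed for DASIN/DACOS/GASIN/GACOS.
RP = (-0.27368494524164255994e2, 0.57208227877891731407e2,
      -0.39688862997504877339e2, 0.10152522233806463645e2,
      -0.69674573447350646411)
Q = (-0.16421096714498560795e3, 0.41714430248260412556e3,
     -0.38186303361750149284e3, 0.15095270841030604719e3,
     -0.23823859153670238830e2)


def r(g):
    """R(g) = g*(rp1+g*(...+g*rp5)) / (q0+g*(q1+...+g*(q4+g)))."""
    num = 0.0
    for c in reversed(RP):
        num = num * g + c
    den = 1.0                      # leading coefficient of the denominator
    for c in reversed(Q):
        den = den * g + c
    return g * num / den


def _s(x):
    """s = y + y*R(z), with z and y taken from the range table."""
    ax = abs(x)
    if ax <= 0.5:
        z, y = x * x, ax           # y = x for x >= 0, y = -x for x < 0
    else:
        z = (1.0 - ax) / 2.0       # (1-x)/2 or (1+x)/2
        y = -2.0 * math.sqrt(z)
    return y + y * r(z)


def asin_sketch(x):
    if abs(x) > 1.0:
        return math.inf            # manual: message issued, +infinity
    s = _s(x)
    if abs(x) > 0.5:
        s = math.pi / 2.0 + s
    return s if x >= 0.0 else -s


def acos_sketch(x):
    if abs(x) > 1.0:
        return math.inf
    s = _s(x)
    if abs(x) <= 0.5:
        return math.pi / 2.0 - s if x >= 0.0 else math.pi / 2.0 + s
    return -s if x > 0.0 else math.pi + s
```

Comparing asin_sketch and acos_sketch against math.asin and math.acos at a few points in each of the four ranges is a quick way to confirm that the table rows and signs have been read correctly.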
Type of Argument

The argument must be a double-precision, G-floating-point value in the range -1.0 to 1.0.

Type of Result

The result returned is a double-precision, G-floating-point value in the range -π/2 to π/2.

Accuracy of Result

    test interval:  0.00000 through 1.0000
    MRE:            6.69×10^-18 (57.1 bits)
    RMS:            1.54×10^-18 (59.2 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        1%   26%   72%    2%    0%

Algorithm Used

GASIN(x) is calculated as follows.

Let

    R(g) = (g·(rp1 + g·(rp2 + g·(rp3 + g·(rp4 + g·rp5))))) /
           (q0 + g·(q1 + g·(q2 + g·(q3 + g·(q4 + g)))))

    rp1 = -.27368494524164255994×10^2
    rp2 =  .57208227877891731407×10^2
    rp3 = -.39688862997504877339×10^2
    rp4 =  .10152522233806463645×10^2
    rp5 = -.69674573447350646411
    q0  = -.16421096714498560795×10^3
    q1  =  .41714430248260412556×10^3
    q2  = -.38186303361750149284×10^3
    q3  =  .15095270841030604719×10^3
    q4  = -.23823859153670238830×10^2

Let

    s = y + y·R(g)    where g = z

Then, the following table gives the value of GASIN(x) depending on the values of x, z, and y.

    range of x      z          y        GASIN(x)
    -1.0 to -.5     (1+x)/2    -2√z     -(π/2+s)
    -.5 to 0.0      x^2        -x       -s
    0.0 to .5       x^2        x        s
    .5 to 1.0       (1-x)/2    -2√z     π/2+s

Error Conditions

If the absolute value of the argument is greater than 1.0, the following message is issued and the result is set to +machine infinity.

    GASIN: ABS(arg) greater than 1.0; result = +infinity


GACOS

Description

The GACOS routine calculates, in radians, the double-precision, G-floating-point arc cosine of its double-precision, G-floating-point argument. That is:

    GACOS(x) = cos^-1(x)

Routines Called

GACOS calls the GSQRT and MTHERR routines.

Type of Argument

The argument must be a double-precision, G-floating-point value in the range -1.0 to 1.0.

Type of Result

The result returned is a double-precision, G-floating-point value in the range 0.0 to π.
Accuracy of Result

    test interval:  0.00000 through 1.0000
    MRE:            4.18×10^-18 (57.7 bits)
    RMS:            1.03×10^-18 (59.8 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%   14%   72%   15%    0%

Algorithm Used

GACOS(x) is calculated as follows.

Let

    R(g) = (g·(rp1 + g·(rp2 + g·(rp3 + g·(rp4 + g·rp5))))) /
           (q0 + g·(q1 + g·(q2 + g·(q3 + g·(q4 + g)))))

    rp1 = -.27368494524164255994×10^2
    rp2 =  .57208227877891731407×10^2
    rp3 = -.39688862997504877339×10^2
    rp4 =  .10152522233806463645×10^2
    rp5 = -.69674573447350646411
    q0  = -.16421096714498560795×10^3
    q1  =  .41714430248260412556×10^3
    q2  = -.38186303361750149284×10^3
    q3  =  .15095270841030604719×10^3
    q4  = -.23823859153670238830×10^2

Let

    s = y + y·R(g)    where g = z

Then the following table gives the value of GACOS(x) depending on the values of x, z, and y.

    range of x      z          y        GACOS(x)
    -1.0 to -.5     (1+x)/2    -2√z     π+s
    -.5 to 0.0      x^2        -x       π/2+s
    0.0 to .5       x^2        x        π/2-s
    .5 to 1.0       (1-x)/2    -2√z     -s

Error Conditions

If the absolute value of the argument is greater than 1.0, the following message is issued and the result is set to +machine infinity.

    GACOS: ABS(arg) greater than 1.0; result = +infinity


ATAN

Description

The ATAN routine calculates, in radians, the single-precision, floating-point arc tangent of its single-precision, floating-point argument. That is:

    ATAN(x) = tan^-1(x)

Routines Called

None

Type of Argument

The argument must be a single-precision, floating-point value; it can be any such value.

Type of Result

The result returned is a single-precision, floating-point value in the range -π/2 to π/2.

Accuracy of Result

    test interval:  -80.000 through 80.000
    MRE:            8.07×10^-9 (26.9 bits)
    RMS:            2.99×10^-9 (28.3 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%    1%   98%    1%    0%

Algorithm Used

ATAN(x) is calculated as follows.

If x < 0.0

    ATAN(x) = -ATAN(|x|)

If x > 0.0

    ATAN(x) = tan^-1(XHI) + tan^-1(z)
    z = (x - XHI)/(1 + x·XHI)

XHI is chosen so that

    |z| ≤ tan(π/32)

tan^-1(XHI) is found by table lookup.
It is stored as ATANHI and ATANLO to provide guard bits for improved accuracy. tan^-1(z) is evaluated by means of a polynomial approximation (see "Reference" below).

If x < tan(π/32)

    z = x
    ATAN(x) = tan^-1(z)

If x > 1/tan(π/32)

    z = 1/x
    ATAN(x) = π/2 - tan^-1(z)

If tan(π/32) < x < 1/tan(π/32), an appropriate XHI is obtained from a table. The table contains values for XHI for various ranges of x.

Reference

The polynomial approximation used in the algorithm is formula #4901 from Hart et al., Computer Approximations, (New York, N.Y.: John Wiley and Sons, 1968).

Error Conditions

None


ATAN2

Description

The ATAN2 routine calculates, in radians, the single-precision, floating-point polar angle for the two single-precision, floating-point coordinates of a point in the x-y plane that are included as the arguments. That is:

    ATAN2(y,x) = tan^-1(y/x)

Routines Called

ATAN2 calls the ATAN and MTHERR routines.

Type of Arguments

The arguments must be single-precision, floating-point values; they can be any such values provided both arguments are not zero.

Type of Result

The result returned is a single-precision, floating-point value in the range -π to π.

Accuracy of Result

    test interval:  -80.000 through 1.0000 for x
                    -80.000 through 1.0000 for y
    MRE:            1.46×10^-8 (26.0 bits)
    RMS:            3.08×10^-9 (28.3 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%    1%   98%    1%    0%

Algorithm Used

ATAN2(y,x) is calculated as follows.

Let

    u = |y| and v = |x|

and compute

    tan^-1(u/v)

Then find ATAN2(y,x) based on the signs of y and x as follows.
    x    y    ATAN2(y,x)
    +    +    tan^-1(u/v)
    +    -    -tan^-1(u/v)
    -    +    -(tan^-1(u/v) - π)
    -    -    tan^-1(u/v) - π

The reduced argument for ATAN2 is:

    z = (u/v - XHI)/(1 + u/v·XHI)

This is rewritten as:

    z = (u - v·XHI)/(v + u·XHI)

The numerator is calculated to be:

    u - v·XHI = u - VHI·XHI - VLO·XHI
    v = VHI + VLO

    VHI has, at most, 27 significant bits
    VLO has, at most, 35 significant bits
    XHI is tabulated with, at most, 13 significant bits

This guarantees that the numerator of z is calculated exactly.

Error Conditions

1. If both arguments are 0.0, the following message is issued and the result is set to 0.0.

    ATAN2: Both arguments are zero, result = zero

2. If y/x underflows and x is greater than 0.0, the following message is issued and the result is set to 0.0.

    ATAN2: Result underflow


DATAN

Description

The DATAN routine calculates, in radians, the double-precision, D-floating-point arc tangent of its double-precision, D-floating-point argument. That is:

    DATAN(x) = tan^-1(x)

Routines Called

None

Type of Argument

The argument must be a double-precision, D-floating-point value; it can be any such value.

Type of Result

The result returned is a double-precision, D-floating-point value in the range -π/2 to π/2.

Accuracy of Result

    test interval:  -80.000 through 80.000
    MRE:            3.40×10^-19 (61.3 bits)
    RMS:            9.37×10^-20 (63.2 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%    1%   94%    5%    0%

Algorithm Used

DATAN(x) is calculated as follows.

If x < 0.0

    DATAN(x) = -DATAN(|x|)

If x > 0.0

    DATAN(x) = tan^-1(XHI) + tan^-1(z)
    z = (x - XHI)/(1 + x·XHI)

XHI is chosen so that

    |z| ≤ tan(π/32)

tan^-1(XHI) is found by table lookup. It is stored as ATANHI and ATANLO to provide guard bits for improved accuracy. tan^-1(z) is evaluated by means of a polynomial approximation (see "Reference" below).
If x < tan(π/32)

    z = x
    DATAN(x) = tan^-1(z)

If x > 1/tan(π/32)

    z = 1/x
    DATAN(x) = π/2 - tan^-1(z)

If tan(π/32) < x < 1/tan(π/32), an appropriate XHI is obtained from a table. The table contains values for XHI for various ranges of x.

Reference

The polynomial approximation used in the algorithm is formula #4904 from Hart et al., Computer Approximations, (New York, N.Y.: John Wiley and Sons, 1968).

Error Conditions

None


DATAN2

Description

The DATAN2 routine calculates, in radians, the double-precision, D-floating-point polar angle for the two double-precision, D-floating-point coordinates of a point in the x-y plane that are included as the arguments. That is:

    DATAN2(y,x) = tan^-1(y/x)

Routines Called

DATAN2 calls the DATAN and MTHERR routines.

Type of Arguments

The arguments must be double-precision, D-floating-point values; they can be any such values provided both arguments are not zero.

Type of Result

The result returned is a double-precision, D-floating-point value in the range -π to π.

Accuracy of Result

    test interval:  -80.000 through 1.0000 for x
                    -80.000 through 1.0000 for y
    MRE:            5.27×10^-19 (60.7 bits)
    RMS:            9.09×10^-20 (63.3 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%    1%   97%    2%    0%

Algorithm Used

DATAN2(y,x) is calculated as follows.

Let

    u = |y| and v = |x|

and compute

    tan^-1(u/v)

Then find DATAN2(y,x) based on the signs of y and x as follows.

    x    y    DATAN2(y,x)
    +    +    tan^-1(u/v)
    +    -    -tan^-1(u/v)
    -    +    -(tan^-1(u/v) - π)
    -    -    tan^-1(u/v) - π

The reduced argument for DATAN2 is:

    z = (u/v - XHI)/(1 + u/v·XHI)

This is rewritten as:

    z = (u - v·XHI)/(v + u·XHI)

The numerator is calculated to be:

    u - v·XHI = u - VHI·XHI - VLO·XHI
    v = VHI + VLO

    VHI has, at most, 27 significant bits
    VLO has, at most, 35 significant bits
    XHI is tabulated with, at most, 13 significant bits

This guarantees that the numerator of z is calculated exactly.

Error Conditions

1.
If both arguments are 0.0, the following message is issued and the result is set to 0.0.

    DATAN2: Both arguments are zero, result = zero

2. If y/x underflows and x is greater than 0.0, the following message is issued and the result is set to 0.0.

    DATAN2: Result underflow


GATAN

Description

The GATAN routine calculates, in radians, the double-precision, G-floating-point arc tangent of its double-precision, G-floating-point argument. That is:

    GATAN(x) = tan^-1(x)

Routines Called

None

Type of Argument

The argument must be a double-precision, G-floating-point value; it can be any such value.

Type of Result

The result returned is a double-precision, G-floating-point value in the range -π/2 to π/2.

Accuracy of Result

    test interval:  -80.000 through 80.000
    MRE:            2.04×10^-18 (58.8 bits)
    RMS:            7.03×10^-19 (60.3 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%    1%   97%    2%    0%

Algorithm Used

GATAN(x) is calculated as follows.

If x < 0.0

    GATAN(x) = -GATAN(|x|)

If x > 0.0

    GATAN(x) = tan^-1(XHI) + tan^-1(z)
    z = (x - XHI)/(1 + x·XHI)

XHI is chosen so that

    |z| ≤ tan(π/32)

tan^-1(XHI) is found by table lookup. It is stored as ATANHI and ATANLO to provide guard bits for improved accuracy. tan^-1(z) is evaluated by means of a polynomial approximation (see "Reference" below).

If x < tan(π/32)

    z = x
    GATAN(x) = tan^-1(z)

If x > 1/tan(π/32)

    z = 1/x
    GATAN(x) = π/2 - tan^-1(z)

If tan(π/32) < x < 1/tan(π/32), an appropriate XHI is obtained from a table. The table contains values for XHI for various ranges of x.

Reference

The polynomial approximation used in the algorithm is formula #4904 from Hart et al., Computer Approximations, (New York, N.Y.: John Wiley and Sons, 1968).
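The argument reduction used by ATAN, DATAN, and GATAN can be illustrated with a small Python sketch. Neither the XHI table nor the Hart polynomial is reproduced in this manual, so both are stand-ins here and should be read as assumptions: the hypothetical table uses XHI = tan(k·π/16) for k = 1 to 7 (which keeps |z| ≤ tan(π/32) across the middle range), and tan^-1(z) is approximated by a truncated Taylor series rather than formula #4901 or #4904. The function names are likewise hypothetical.

```python
import math

T = math.tan(math.pi / 32)        # bound on |z| after reduction

# Hypothetical table (not the library's): pairs (XHI, tan^-1(XHI)).
XHI_TABLE = [(math.tan(k * math.pi / 16), k * math.pi / 16)
             for k in range(1, 8)]


def atan_small(z):
    """Taylor stand-in for the Hart polynomial, adequate for |z| <= T."""
    z2 = z * z
    return z * (1 - z2 * (1 / 3 - z2 * (1 / 5 - z2 * (1 / 7 - z2 / 9))))


def atan_sketch(x):
    if x < 0.0:
        return -atan_sketch(-x)               # ATAN(x) = -ATAN(|x|)
    if x <= T:
        return atan_small(x)                  # z = x
    if x >= 1.0 / T:
        return math.pi / 2 - atan_small(1.0 / x)   # z = 1/x
    # Middle range: choose the table entry giving the smallest |z|.
    candidates = [((x - xhi) / (1.0 + x * xhi), a) for xhi, a in XHI_TABLE]
    z, atan_xhi = min(candidates, key=lambda p: abs(p[0]))
    return atan_xhi + atan_small(z)
```

With |z| bounded by tan(π/32), about 0.098, the truncated series is accurate to roughly 10^-12, which is why a low-degree polynomial suffices after the table-driven reduction.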
Error Conditions

None


GATAN2

Description

The GATAN2 routine calculates, in radians, the double-precision, G-floating-point polar angle for the two double-precision, G-floating-point coordinates of a point in the x-y plane that are included as the arguments. That is:

    GATAN2(y,x) = tan^-1(y/x)

Routines Called

GATAN2 calls the GATAN and MTHERR routines.

Type of Arguments

The arguments must be double-precision, G-floating-point values; they can be any such values provided both arguments are not zero.

Type of Result

The result returned is a double-precision, G-floating-point value in the range -π to π.

Accuracy of Result

    test interval:  -80.000 through 1.0000 for x
                    -80.000 through 1.0000 for y
    MRE:            3.28×10^-18 (58.1 bits)
    RMS:            7.15×10^-19 (60.3 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%    1%   98%    2%    0%

Algorithm Used

GATAN2(y,x) is calculated as follows.

Let

    u = |y| and v = |x|

and compute

    tan^-1(u/v)

Then find GATAN2(y,x) based on the signs of y and x as follows.

    x    y    GATAN2(y,x)
    +    +    tan^-1(u/v)
    +    -    -tan^-1(u/v)
    -    +    -(tan^-1(u/v) - π)
    -    -    tan^-1(u/v) - π

The reduced argument for GATAN2 is:

    z = (u/v - XHI)/(1 + u/v·XHI)

This is rewritten as:

    z = (u - v·XHI)/(v + u·XHI)

The numerator is calculated to be:

    u - v·XHI = u - VHI·XHI - VLO·XHI
    v = VHI + VLO

    VHI has, at most, 27 significant bits
    VLO has, at most, 35 significant bits
    XHI is tabulated with, at most, 13 significant bits

This guarantees that the numerator of z is calculated exactly.

Error Conditions

1. If both arguments are 0.0, the following message is issued and the result is set to 0.0.

    GATAN2: Both arguments are zero, result = zero

2. If y/x underflows and x is greater than 0.0, the following message is issued and the result is set to 0.0.
    GATAN2: Result underflow


Chapter 7
Hyperbolic Routines


SINH

Description

The SINH routine calculates the single-precision, floating-point hyperbolic sine of its single-precision, floating-point argument. That is:

    SINH(x) = sinh(x)

Routines Called

SINH calls the EXP and MTHERR routines.

Type of Argument

The argument must be a single-precision, floating-point value in the range -88.722 to 88.722.

Type of Result

The result returned is a single-precision, floating-point value; it may be any such value.

Accuracy of Result

    test interval:  0.00000 through 88.721
    MRE:            2.61×10^-8 (25.2 bits)
    RMS:            4.24×10^-9 (27.8 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%    4%   85%   11%    0%

Algorithm Used

SINH(x) is calculated as follows.

The table below gives the value of SINH(x) depending upon the range of values for |x|.

    range of |x|                        SINH(x)
    0.0 to 2^-13                        x
    2^-13 to 1.0                        x·p4(x^2)
    1.0 to 9.7 = 14·log_e(2)            (e^|x| - e^-|x|)/2 · sgn(x)
    9.7 to 88.03 = 127·log_e(2)         e^|x|/2 · sgn(x)
    88.03 to 88.722 = 128·log_e(2)      e^(|x| - log_e(2)) · sgn(x)
    88.722 to infinity                  infinity · sgn(x)

If z = x^2

    p4(z) = 1 + z·(c1 + z·(c2 + z·(c3 + c4·z)))

    c1 = 1.666666643×10^-1
    c2 = 8.333352593×10^-3
    c3 = 1.983581245×10^-4
    c4 = 2.818523951×10^-6

Error Conditions

If the absolute value of the argument is greater than 88.722, the following message is issued and the result is set to ± machine infinity using the sign of the argument.

    SINH: Result overflow


COSH

Description

The COSH routine calculates the single-precision, floating-point hyperbolic cosine of its single-precision, floating-point argument. That is:

    COSH(x) = cosh(x)

Routines Called

COSH calls the EXP and MTHERR routines.

Type of Argument

The argument must be a single-precision, floating-point value in the range -88.722 to 88.722.

Type of Result

The result returned is a single-precision, floating-point value greater than or equal to 1.0.
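The piecewise scheme given for SINH can be sketched in Python as follows. This is a hedged illustration, not the library code: it runs in IEEE double precision, so the 88.722 overflow threshold is kept only to mirror the table, the names p4 and sinh_sketch are hypothetical, and the MTHERR message is replaced by returning a signed infinity.

```python
import math

# c1..c4 of p4(z) as listed for SINH.
C1, C2, C3, C4 = 1.666666643e-1, 8.333352593e-3, 1.983581245e-4, 2.818523951e-6
LN2 = math.log(2.0)


def p4(z):
    """p4(z) = 1 + z*(c1 + z*(c2 + z*(c3 + c4*z)))."""
    return 1.0 + z * (C1 + z * (C2 + z * (C3 + C4 * z)))


def sinh_sketch(x):
    ax = abs(x)
    sign = math.copysign(1.0, x)
    if ax <= 2.0 ** -13:
        return x                                  # sinh(x) ~ x
    if ax <= 1.0:
        return x * p4(x * x)                      # polynomial range
    if ax <= 14 * LN2:                            # about 9.7
        return sign * (math.exp(ax) - math.exp(-ax)) / 2.0
    if ax <= 127 * LN2:                           # about 88.03
        return sign * math.exp(ax) / 2.0          # e^-|x| is negligible
    if ax <= 128 * LN2:                           # about 88.722
        return sign * math.exp(ax - LN2)          # avoids overflow in e^|x|
    return sign * math.inf                        # manual: overflow message
```

The last finite range computes e^(|x| - log_e(2)) instead of e^|x|/2 because, in the single-precision format the table is written for, e^|x| itself would overflow there even though the halved result is still representable.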
Accuracy of Result

    test interval:  0.00000 through 88.721
    MRE:            2.12×10^-8 (25.5 bits)
    RMS:            4.49×10^-9 (27.7 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%    4%   82%   14%    0%

Algorithm Used

COSH(x) is calculated as follows.

The table below gives the value of COSH(x) depending upon the range of values for |x|.

    range of |x|                        COSH(x)
    0.0 to 2^-14                        1.0
    2^-14 to 9.7 = 14·log_e(2)          (e^|x| + e^-|x|)/2
    9.7 to 88.03 = 127·log_e(2)         e^|x|/2
    88.03 to 88.722 = 128·log_e(2)      e^(|x| - log_e(2))
    88.722 to infinity                  infinity

Error Conditions

If the absolute value of the argument is greater than 88.722, the following message is issued and the result is set to ± machine infinity using the sign of the argument.

    COSH: Result overflow


DSINH

Description

The DSINH routine calculates the double-precision, D-floating-point hyperbolic sine of its double-precision, D-floating-point argument. That is:

    DSINH(x) = sinh(x)

Routines Called

DSINH calls the DEXP and MTHERR routines.

Type of Argument

The argument must be a double-precision, D-floating-point value in the range -88.722 to 88.722.

Type of Result

The result returned is a double-precision, D-floating-point value; it may be any such value.

Accuracy of Result

    test interval:  0.00000 through 88.721
    MRE:            6.82×10^-19 (60.3 bits)
    RMS:            1.27×10^-19 (62.8 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%    6%   83%   11%    0%

Algorithm Used

DSINH(x) is calculated as follows.

The table below gives the value of DSINH(x) depending upon the range of values for |x|.
    range of |x|                        DSINH(x)
    0.0 to 2^-31                        x
    2^-31 to 1.0                        x + x·R(x^2)
    1.0 to 22.0 = 32·log_e(2)           (e^|x| - e^-|x|)/2 · sgn(x)
    22.0 to 88.03 = 127·log_e(2)        e^|x|/2 · sgn(x)
    88.03 to 88.722 = 128·log_e(2)      e^(|x| - log_e(2)) · sgn(x)
    88.722 to infinity                  infinity · sgn(x)

If z = x^2

    R(z) = (rp0 + z·(rp1 + z·(rp2 + z·rp3)))/(q0 + z·(q1 + z·(q2 + z)))

    rp0 =  .35181283430177117881×10^6
    rp1 =  .11563521196851768270×10^5
    rp2 =  .16375798202630751372×10^3
    rp3 =  .78966127417357099479
    q0  = -.21108770058106271242×10^7
    q1  =  .36162723109421836460×10^5
    q2  = -.27773523119650701667×10^3

Error Conditions

If the absolute value of the argument is greater than 88.722, the following message is issued and the result is set to ± machine infinity using the sign of the argument.

    DSINH: Result overflow


DCOSH

Description

The DCOSH routine calculates the double-precision, D-floating-point hyperbolic cosine of its double-precision, D-floating-point argument. That is:

    DCOSH(x) = cosh(x)

Routines Called

DCOSH calls the DEXP and MTHERR routines.

Type of Argument

The argument must be a double-precision, D-floating-point value in the range -88.722 to 88.722.

Type of Result

The result returned is a double-precision, D-floating-point value greater than or equal to 1.0.

Accuracy of Result

    test interval:  0.00000 through 88.721
    MRE:            5.90×10^-19 (60.6 bits)
    RMS:            1.34×10^-19 (62.7 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%    5%   81%   14%    0%

Algorithm Used

DCOSH(x) is calculated as follows.

The table below gives the value of DCOSH(x) depending upon the range of values for |x|.

    range of |x|                        DCOSH(x)
    0.0 to 2^-32                        1.0
    2^-32 to 22.0 = 32·log_e(2)         (e^|x| + e^-|x|)/2
    22.0 to 88.03 = 127·log_e(2)        e^|x|/2
    88.03 to 88.722 = 128·log_e(2)      e^(|x| - log_e(2))
    88.722 to infinity                  infinity

Error Conditions

If the absolute value of the argument is greater than 88.722, the following message is issued and the result is set to ± machine infinity using the sign of the argument.
    DCOSH: Result overflow


GSINH

Description

The GSINH routine calculates the double-precision, G-floating-point hyperbolic sine of its double-precision, G-floating-point argument. That is:

    GSINH(x) = sinh(x)

Routines Called

GSINH calls the GEXP and MTHERR routines.

Type of Argument

The argument must be a double-precision, G-floating-point value in the range -709.782713 to 709.782713.

Type of Result

The result returned is a double-precision, G-floating-point value; it may be any such value.

Accuracy of Result

    test interval:  0.00000 through 88.721
    MRE:            6.40×10^-18 (57.1 bits)
    RMS:            9.44×10^-19 (59.9 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%    3%   87%   10%    0%

Algorithm Used

GSINH(x) is calculated as follows.

The table below gives the value of GSINH(x) depending upon the range of values for |x|.

    range of |x|                        GSINH(x)
    0.0 to 2^-30                        x
    2^-30 to 1.0                        x + x·R(x^2)
    1.0 to 22.0 = 32·log_e(2)           (e^|x| - e^-|x|)/2 · sgn(x)
    22.0 to 709.089565                  e^|x|/2 · sgn(x)
    709.089565 to 709.782713            e^(|x| - log_e(2)) · sgn(x)
    709.782713 to infinity              infinity · sgn(x)

If z = x^2

    R(z) = (rp0 + z·(rp1 + z·(rp2 + z·rp3)))/(q0 + z·(q1 + z·(q2 + z)))

    rp0 =  .35181283430177117881×10^6
    rp1 =  .11563521196851768270×10^5
    rp2 =  .16375798202630751372×10^3
    rp3 =  .78966127417357099479
    q0  = -.21108770058106271242×10^7
    q1  =  .36162723109421836460×10^5
    q2  = -.27773523119650701667×10^3

Error Conditions

If the absolute value of the argument is greater than 709.782713, the following message is issued and the result is set to ± machine infinity, using the sign of the argument.

    GSINH: Result overflow


GCOSH

Description

The GCOSH routine calculates the double-precision, G-floating-point hyperbolic cosine of its double-precision, G-floating-point argument. That is:

    GCOSH(x) = cosh(x)

Routines Called

GCOSH calls the GEXP and MTHERR routines.
Type of Argument

The argument must be a double-precision, G-floating-point value in the range -709.782713 to 709.782713.

Type of Result

The result returned is a double-precision, G-floating-point value greater than or equal to 1.0.

Accuracy of Result

    test interval:  0.00000 through 88.721
    MRE:            4.84×10^-18 (57.5 bits)
    RMS:            1.00×10^-18 (59.8 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%    3%   84%   13%    0%

Algorithm Used

GCOSH(x) is calculated as follows.

The table below gives the value of GCOSH(x) depending upon the range of values for |x|.

    range of |x|                        GCOSH(x)
    0.0 to 2^-30                        1.0
    2^-30 to 22.0 = 32·log_e(2)         (e^|x| + e^-|x|)/2
    22.0 to 709.089565                  e^|x|/2
    709.089565 to 709.782713            e^(|x| - log_e(2))
    709.782713 to infinity              infinity

Error Conditions

If the absolute value of the argument is greater than 709.782713, the following message is issued and the result is set to ± machine infinity, using the sign of the argument.

    GCOSH: Result overflow


TANH

Description

The TANH routine calculates the single-precision, floating-point hyperbolic tangent of its single-precision, floating-point argument. That is:

    TANH(x) = tanh(x)

Routines Called

TANH calls the EXP routine.

Type of Argument

The argument must be a single-precision, floating-point value; it can be any such value.

Type of Result

The result returned is a single-precision, floating-point value in the range -1.0 to 1.0.

Accuracy of Result

    test interval:  0.00000 through 90.000
    MRE:            2.69×10^-8 (25.1 bits)
    RMS:            5.53×10^-9 (27.4 bits)
    LSB error distribution:
        -2    -1     0    +1    +2
        0%    0%   79%   21%    0%

Algorithm Used

TANH(x) is calculated as follows.

The table below gives the value of TANH(x) depending upon the range of values for |x|.
range of |x|                  TANH(x)
0.0 to 2^-15                  x
2^-15 to loge(3)/2            x + x·R(x^2)
loge(3)/2 to 9.8479016        (1 - 2/(e^(2·|x|) + 1))·sgn(x)
9.8479016 to infinity         1.0·sgn(x)

If g = x^2
R(g) = g·(a + b·g)/(c + g)

a = -.823772813
b = -.383101067x10^-2
c = 2.47131965

Error Conditions
None

DTANH

Description
The DTANH routine calculates the double-precision, D-floating-point hyperbolic tangent of its double-precision, D-floating-point argument. That is:

DTANH(x) = tanh(x)

Routines Called
DTANH calls the EXP routine.

Type of Argument
The argument must be a double-precision, D-floating-point value; it can be any such value.

Type of Result
The result returned is a double-precision, D-floating-point value in the range -1.0 to 1.0.

Accuracy of Result
test interval: 0.00000 through 90.000
MRE: 7.17x10^-19 (60.3 bits)
RMS: 1.75x10^-19 (62.3 bits)
LSB error distribution:
  -2    -1     0    +1    +2
  0%    0%   70%   30%    0%

Algorithm Used
DTANH(x) is calculated as follows. The table below gives the value of DTANH(x) depending upon the range of values for |x|.

range of |x|                      DTANH(x)
0.0 to 2^-32·√3                   x
2^-32·√3 to loge(3)/2             x + x·R(x^2)
loge(3)/2 to 22.1807100           (1 - 2/(e^(2·|x|) + 1))·sgn(x)
22.1807100 to infinity            1.0·sgn(x)

If g = x^2
R(g) = g·(rp0 + g·(rp1 + rp2·g))/(q0 + g·(q1 + g·(q2 + g)))

rp0 = -.161341190239962281x10^4
rp1 = -.992259296722360833x10^2
rp2 = -.964374927772254698
q0 = .484023570719886887x10^4
q1 = .22337720718962312926x10^4
q2 = .112744743805349493x10^3

Error Conditions
None

GTANH

Description
The GTANH routine calculates the double-precision, G-floating-point hyperbolic tangent of its double-precision, G-floating-point argument. That is:

GTANH(x) = tanh(x)

Routines Called
GTANH calls the GEXP routine.

Type of Argument
The argument must be a double-precision, G-floating-point value; it can be any such value.

Type of Result
The result returned is a double-precision, G-floating-point value in the range -1.0 to 1.0.
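The TANH and DTANH range tables above share one structure: the argument itself for tiny values, a rational correction for small values, an exponential form in the middle range, and ±1.0 beyond a cutoff. A minimal Python sketch of that structure, using the single-precision TANH thresholds and coefficients as printed, might look like the following. It is an illustration in IEEE double arithmetic, not the library's code; the name `tanh_sketch` is hypothetical, and `math.exp` stands in for the EXP routine.

```python
import math

# Constants as printed in the TANH entry (single-precision routine).
A = -0.823772813
B = -0.383101067e-2
C = 2.47131965
TINY = 2.0 ** -15            # below this, tanh(x) is taken to be x
MID = math.log(3.0) / 2.0    # switch from rational form to exponential form
BIG = 9.8479016              # at or above this, the result is +1.0 or -1.0


def tanh_sketch(x: float) -> float:
    """Evaluate tanh(x) by the four-range scheme in the TANH table."""
    ax = abs(x)
    if ax < TINY:
        return x
    if ax < MID:
        g = x * x
        r = g * (A + B * g) / (C + g)   # R(g) from the TANH entry
        return x + x * r
    sign = 1.0 if x > 0 else -1.0
    if ax < BIG:
        return (1.0 - 2.0 / (math.exp(2.0 * ax) + 1.0)) * sign
    return sign
```

Comparing `tanh_sketch(x)` with `math.tanh(x)` at a few points in each range is a quick way to see that the pieces join smoothly at the range boundaries.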
Accuracy of Result
test interval: 0.00000 through 90.000
MRE: 6.44x10^-18 (57.1 bits)
RMS: 1.33x10^-18 (59.4 bits)
LSB error distribution:
  -2    -1     0    +1    +2
  0%    0%   80%   20%    0%

Algorithm Used
GTANH(x) is calculated as follows. The table below gives the value of GTANH(x) depending upon the range of values for |x|.

range of |x|                      GTANH(x)
0.0 to 2^-32·√3                   x
2^-32·√3 to loge(3)/2             x + x·R(x^2)
loge(3)/2 to 22.1807100           (1 - 2/(e^(2·|x|) + 1))·sgn(x)
22.1807100 to infinity            1.0·sgn(x)

If g = x^2
R(g) = g·(rp0 + g·(rp1 + rp2·g))/(q0 + g·(q1 + g·(q2 + g)))

rp0 = -.161341190239962281x10^4
rp1 = -.992259296722360833x10^2
rp2 = -.964374927772254698
q0 = .484023570719886887x10^4
q1 = .22337720718962312926x10^4
q2 = .112744743805349493x10^3

Error Conditions
None

Chapter 8
Random Number Generating Routines

RAN

Description
The RAN routine returns pseudo-random numbers between 0.0 and 1.0, but not including 0.0 or 1.0. The period of the sequence is 2147483647; that is, the numbers repeat every 2147483647 calls. RAN uses a pure multiplicative congruential random number generator with prime modulus. The seed value can be supplied by the system or supplied by a call to the SETRAN subroutine. (See SETRAN, p. 8-6.)

Routines Called
RAN does not call any routines; but you can call the SETRAN subroutine to provide a seed value and the SAVRAN subroutine (see SAVRAN, p. 8-7) to determine the last seed used by RAN.

Type of Argument
The argument is a dummy value that is not used.

Type of Result
The result returned is a single-precision, floating-point value that is greater than 0.0 and less than 1.0.

Accuracy of Result
The independence of successive random numbers generated by multiplicative congruential methods can be measured by the spectral test. For this generator, with multiplier 630360016 and modulus 2147483647, the spectral test yields the following results.
n    mu(n)    bits
2    2.446    15
3    .4766     9
4    3.715     8
5    4.944     6
6    .8183     5

mu(n) measures how densely n-tuples of random numbers cover an n-dimensional square. bits is the number of independent bits in successive n-tuples of numbers returned by RAN. For example, successive pairs of random numbers can be considered to be independent in their first 15 bits. The remaining 12 bits are not independent.

Algorithm Used
RAN(n) is calculated as follows. Using a seed value supplied from a call to the SETRAN subroutine or the default seed value 524287 (= 2^19 - 1), the value returned is calculated by:

RAN(n) = seed/2^31, truncated

On subsequent calls to RAN, a new seed is calculated from the previous seed value by:

seed = seed·630360016 mod (2^31 - 1)

and the random number is then generated.

References
A full description of the spectral test is given in R.R. Coveyou and R.D. MacPherson, Journal of the ACM 14 (1967), pp. 100-119 and in D.E. Knuth, Seminumerical Algorithms (Reading, Mass.: Addison-Wesley, 1981), Section 3.3.4.

Error Conditions
None

RANS

Description
The RANS routine returns pseudo-random numbers between 0.0 and 1.0, but not including 0.0 or 1.0. The period of the sequence is 2484877906816; that is, the numbers repeat every 2484877906816 calls. RANS is based on the same multiplicative random number generator as RAN (p. 8-3). In addition, it shuffles the numbers using a 128-word table.

Routines Called
RANS calls the RAN and SAVRAN routines.

Type of Argument
The argument is a dummy value that is not used.

Type of Result
The result returned is a single-precision, floating-point value that is greater than 0.0 and less than 1.0.

Accuracy of Result
Not applicable

Algorithm Used
RANS(n) is calculated as follows. On the initial reference to RANS, RAN is called 128 times to generate S(1), S(2), ..., S(128) (uniform random deviates in (0,1)) and a new seed X(0).
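The recurrence given in the RAN algorithm above is a multiplicative congruential generator with modulus 2^31 - 1. A minimal Python sketch of it follows, assuming that the seed is advanced before each value is formed (the text does not make the ordering explicit) and ignoring the single-precision truncation of the result. The class name `RanSketch` and its method are hypothetical.

```python
MODULUS = 2**31 - 1        # prime modulus 2147483647
MULTIPLIER = 630360016     # multiplier from the RAN algorithm
DEFAULT_SEED = 2**19 - 1   # default seed 524287


class RanSketch:
    """Illustrative generator; not the library's implementation."""

    def __init__(self, seed: int = DEFAULT_SEED) -> None:
        self.seed = seed

    def next(self) -> float:
        # Assumed ordering: advance the seed, then scale it into (0, 1).
        self.seed = (self.seed * MULTIPLIER) % MODULUS
        return self.seed / 2**31
```

Because the modulus is prime and the seed is never a multiple of it, the seed stays in the range 1 to 2^31 - 2, so the returned value is always strictly between 0.0 and 1.0.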
X(0) is obtained from a call to the SAVRAN subroutine (see SAVRAN, p. 8-7) after S(128) has been generated. Then:

X(i+1) = 630360016·X(i) mod (2^31 - 1)
j = (X(i+1) mod 128) + 1
S(j) = X(i+1)/2^31
t = S(j)
RANS(n) = t

Error Conditions
None

SETRAN

Description
The SETRAN subroutine provides the internal integer seed value for the RAN routine. SETRAN is used to reset RAN to return the same sequence of random numbers again, or to set RAN to an arbitrary value (such as the time of day) so that it will return an entirely new sequence.

Routines Called
SETRAN does not call any routines; but you can call the SAVRAN subroutine to save and return the last seed value used by RAN.

Type of Argument
The argument must be an integer value in the range 0 to 2^31. If the argument is 0, the default seed value for RAN is used.

Type of Result
Not applicable

Accuracy of Result
Not applicable

Algorithm Used
SETRAN(n) is calculated as follows. Using the value supplied, SETRAN computes:

seed = |seed| mod (2147483647)

Error Conditions
None

SAVRAN

Description
The SAVRAN subroutine saves and returns the last seed used by the RAN routine.

Routines Called
None

Type of Argument
The argument must be an integer variable in which the seed value will be stored.

Type of Result
The result returned is an integer value between 1 and 2147483647.

Accuracy of Result
Not applicable

Algorithm Used
Not applicable

Error Conditions
None

Chapter 9
Absolute Value Routines

IABS

Description
The IABS routine returns the integer absolute value of its integer argument. That is:

IABS(n) = |n|

Routines Called
None

Type of Argument
The argument must be an integer value; it can be any such value.

Type of Result
The result returned is an integer value greater than or equal to 0.

Accuracy of Result
The result is exact.

Algorithm Used
IABS(n) is calculated as follows.
If n ≥ 0    IABS(n) = n
If n < 0    IABS(n) = -n

Error Conditions
If the argument is the "most negative integer" (400000000000 octal), overflow occurs and the result is set to machine infinity.

ABS

Description
The ABS routine returns the single-precision, floating-point absolute value of its single-precision, floating-point argument. That is:

ABS(x) = |x|

Routines Called
None

Type of Argument
The argument must be a single-precision, floating-point value; it can be any such value.

Type of Result
The result returned is a single-precision, floating-point value greater than or equal to 0.0.

Accuracy of Result
The result is exact.

Algorithm Used
ABS(x) is calculated as follows.

If x ≥ 0.0    ABS(x) = x
If x < 0.0    ABS(x) = -x

Error Conditions
None

DABS

Description
The DABS routine returns the double-precision, D-floating-point absolute value of its double-precision, D-floating-point argument. That is:

DABS(x) = |x|

Routines Called
None

Type of Argument
The argument must be a double-precision, D-floating-point value; it can be any such value.

Type of Result
The result returned is a double-precision, D-floating-point value greater than or equal to 0.0.

Accuracy of Result
The result is exact.

Algorithm Used
DABS(x) is calculated as follows.

If x ≥ 0.0    DABS(x) = x
If x < 0.0    DABS(x) = -x

Error Conditions
None

GABS

Description
The GABS routine returns the double-precision, G-floating-point absolute value of its double-precision, G-floating-point argument. That is:

GABS(x) = |x|

Routines Called
None

Type of Argument
The argument must be a double-precision, G-floating-point value; it can be any such value.

Type of Result
The result returned is a double-precision, G-floating-point value greater than or equal to 0.0.

Accuracy of Result
The result is exact.

Algorithm Used
GABS(x) is calculated as follows.
If x ≥ 0.0    GABS(x) = x
If x < 0.0    GABS(x) = -x

Error Conditions
None

CABS

Description
The CABS routine returns the single-precision, floating-point absolute value of its complex, single-precision, floating-point argument. That is:

CABS(z) = |z|

Routines Called
CABS calls the SQRT and MTHERR routines.

Type of Argument
The argument must be a complex, single-precision, floating-point value; it can be any such value.

Type of Result
The result returned is a single-precision, floating-point value greater than or equal to 0.0.

Accuracy of Result
test interval: -1.00000x10^18 through 1.00000x10^18 real
               -1.00000x10^18 through 1.00000x10^18 imaginary
MRE: 1.84x10^-8 (25.7 bits)
RMS: 5.36x10^-9 (27.5 bits)
LSB error distribution:
  -2    -1     0    +1    +2
  0%   14%   65%   21%    0%

Algorithm Used
CABS(z) is calculated as follows.

Let z = x + i·y
v = MAX(|x|, |y|)
w = MIN(|x|, |y|)

Then
CABS(z) = v·√(1.0 + (w/v)^2)

Error Conditions
If the argument is so large that it causes an overflow, the following message is issued and the result is set to +machine infinity.

CABS: Result overflow

CDABS

Description
The CDABS routine calculates the double-precision, D-floating-point absolute value of its complex, double-precision, D-floating-point argument. That is:

CDABS(z) = |z|
z = location of input value

Routines Called
CDABS calls the DSQRT and MTHERR routines.

Type of Argument
The argument must be a two-element, double-precision vector that contains the input value (z). z must be a complex, double-precision, D-floating-point value; it can be any such value.

Type of Result
The result returned is a double-precision, D-floating-point value greater than or equal to 0.0.
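Scaling by the larger component, as in the CABS algorithm above, keeps the intermediate square from overflowing when |x| or |y| is large, since w/v never exceeds 1.0. A minimal Python sketch of the same formula follows. The name `cabs_sketch` and the explicit check for a zero argument are assumptions added for the illustration, and `math.sqrt` stands in for the SQRT routine.

```python
import math


def cabs_sketch(z: complex) -> float:
    """Magnitude of z computed as v * sqrt(1 + (w/v)^2)."""
    v = max(abs(z.real), abs(z.imag))   # larger component magnitude
    w = min(abs(z.real), abs(z.imag))   # smaller component magnitude
    if v == 0.0:
        return 0.0                      # assumed guard: avoids dividing by zero
    ratio = w / v                       # always in the range 0.0 to 1.0
    return v * math.sqrt(1.0 + ratio * ratio)
```

With this form, an argument such as `complex(1e200, 1e200)` yields a finite result, whereas squaring either component directly would overflow an IEEE double.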
Accuracy of Result
test interval: -1.00000x10^18 through 1.00000x10^18 real
               -1.00000x10^18 through 1.00000x10^18 imaginary
MRE: 6.32x10^-19 (60.5 bits)
RMS: 1.89x10^-19 (62.2 bits)
LSB error distribution:
  -2    -1     0    +1    +2
  0%    4%   56%   38%    2%

Algorithm Used
CDABS(z) is calculated as follows.

Let z = x + i·y
v = MAX(|x|, |y|)
w = MIN(|x|, |y|)

Then
CDABS(z) = v·√(1.0 + (w/v)^2)

Error Conditions
If the argument is so large that overflow occurs, the following message is issued and the result is set to +machine infinity.

CDABS: Result overflow

CGABS

Description
The CGABS routine calculates the double-precision, G-floating-point absolute value of its complex, double-precision, G-floating-point argument. That is:

CGABS(z) = |z|
z = location of input value

Routines Called
CGABS calls the GSQRT and MTHERR routines.

Type of Argument
The argument must be a two-element, double-precision vector that contains the input value (z). z must be a complex, double-precision, G-floating-point value; it can be any such value.

Type of Result
The result returned is a double-precision, G-floating-point value greater than or equal to 0.0.

Accuracy of Result
test interval: -1.00000x10^18 through 1.00000x10^18 real
               -1.00000x10^18 through 1.00000x10^18 imaginary
MRE: 4.88x10^-18 (57.5 bits)
RMS: 1.51x10^-18 (59.2 bits)
LSB error distribution:
  -2    -1     0    +1    +2
  0%    4%   56%   38%    2%

Algorithm Used
CGABS(z) is calculated as follows.

Let z = x + i·y
v = MAX(|x|, |y|)
w = MIN(|x|, |y|)

Then
CGABS(z) = v·√(1.0 + (w/v)^2)

Error Conditions
If the argument is so large that overflow occurs, the following message is issued and the result is set to +machine infinity.

CGABS: Result overflow

Chapter 10
Data Type Conversion Routines

IFIX

Description
The IFIX routine converts and truncates its single-precision, floating-point argument to an integer value.
Routines Called
None

Type of Argument
The argument must be a single-precision, floating-point value less than 2^35.

Type of Result
The result returned is an integer value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
IFIX(x) is calculated by means of the FIX machine instruction. This instruction converts and truncates the argument to an integer.

Error Conditions
If the argument is greater than 2^35, an overflow occurs and the result is set to machine infinity.

INT

Description
The INT routine converts and truncates its single-precision, floating-point argument to an integer value.

Routines Called
None

Type of Argument
The argument must be a single-precision, floating-point value less than 2^35.

Type of Result
The result returned is an integer value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
INT(x) is calculated by means of the FIX machine instruction. This instruction converts and truncates the argument to an integer.

Error Conditions
If the argument is greater than 2^35, an overflow occurs and the result is set to machine infinity.

IDINT

Description
The IDINT routine converts and truncates its double-precision, D-floating-point argument to an integer value.

Routines Called
None

Type of Argument
The argument must be a double-precision, D-floating-point value; it can be any such value.

Type of Result
The result returned is an integer value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
IDINT(x) is calculated as follows. The routine, working on the magnitude of the argument, copies the exponent field to a scratch register. It then clears the exponent field of the magnitude of the argument, and uses the copy of the exponent to control a shift to leave the integer in the location of the result. If necessary, the routine negates the result.
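The shift-based truncation described for IDINT can be illustrated in Python. The sketch below is an analogy only: it uses `math.frexp` on an IEEE double rather than the D-floating-point word layout, treats the 53-bit fraction as an integer field, and raises `OverflowError` where the library would set machine infinity. The name `idint_sketch` and the 35-bit magnitude limit (taken from the 2^35 bound quoted for the other conversion routines) are assumptions of the illustration.

```python
import math

FRACTION_BITS = 53   # fraction width of an IEEE double (not of D-floating-point)
MAGNITUDE_BITS = 35  # assumed magnitude bits available in the integer result


def idint_sketch(x: float) -> int:
    """Truncate x toward zero by separating exponent and fraction, then shifting."""
    magnitude = abs(x)
    fraction, exponent = math.frexp(magnitude)   # magnitude == fraction * 2**exponent
    if exponent <= 0:
        return 0                                 # |x| < 1 truncates to zero
    if exponent > MAGNITUDE_BITS:
        raise OverflowError("significant bits would be lost on the left")
    field = int(fraction * (1 << FRACTION_BITS))  # fraction as an exact integer field
    result = field >> (FRACTION_BITS - exponent)  # exponent controls the shift
    return -result if x < 0 else result
```

Working on the magnitude and negating at the end, as the routine is described as doing, gives truncation toward zero for negative arguments rather than rounding toward minus infinity.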
Error Conditions
If the shift results in a loss of significant bits on the left, an overflow occurs and the result is set to machine infinity.

GFX.n

Description
The GFX.n routine converts and truncates its double-precision, G-floating-point argument to an integer value. n is an even octal number from 0 through 14 that designates a register (AC).

Routines Called
None

Calling Sequence
GFX.n is not called like most of the other routines in the library (see Section 1.4.1). It is called by:

EXTEND n, GFX.n

Type of Argument
The argument must be a double-precision, G-floating-point value less than 2^35. It must be stored in the AC specified in the routine name.

Type of Result
The result returned is an integer value; it may be any such value. It is returned in the AC specified in the routine name.

Accuracy of Result
The result is exact.

Algorithm Used
GFX.n(x) is calculated by means of the GFIX machine instruction. This instruction converts and truncates the argument to an integer.

Error Conditions
If the argument is greater than 2^35, an overflow occurs and the result is set to machine infinity.

REAL

Description
The REAL routine converts and rounds its integer argument into a single-precision, floating-point value.

Routines Called
None

Type of Argument
The argument must be an integer value; it can be any such value.

Type of Result
The result returned is a single-precision, floating-point value less than 2^35.

Accuracy of Result
The result is rounded with an error bound of half a least significant bit.

Algorithm Used
REAL(n) is calculated by means of the FLTR machine instruction. This instruction converts and rounds the argument to a single-precision, floating-point value.

Error Conditions
None

FLOAT

Description
The FLOAT routine converts and rounds its integer argument to a single-precision, floating-point value.
Routines Called
None

Type of Argument
The argument must be an integer value; it can be any such value.

Type of Result
The result returned is a single-precision, floating-point value less than 2^35.

Accuracy of Result
The result is rounded with an error bound of half a least significant bit.

Algorithm Used
FLOAT(n) is calculated by means of the FLTR machine instruction. This instruction converts and rounds the argument to a single-precision, floating-point value.

Error Conditions
None

SNGL

Description
The SNGL routine converts and rounds its double-precision, D-floating-point argument to a single-precision, floating-point value.

Routines Called
None

Type of Argument
The argument must be a double-precision, D-floating-point value; it can be any such value.

Type of Result
The result returned is a single-precision, floating-point value; it may be any such value.

Accuracy of Result
The result is accurate to half a least significant bit because of rounding.

Algorithm Used
SNGL(x) is calculated as follows. The routine tests the most significant bit of the low word of the magnitude of the argument. If it is 0, the high word is returned. If it is 1, the low bit of the high word of the magnitude is tested. If it is 0, it is made 1 and negated if necessary. If it is 1, the high word of the magnitude is incremented and negated if necessary.

Error Conditions
If overflow occurs, the result is set to machine infinity.

GSN.n

Description
The GSN.n routine converts and rounds its double-precision, G-floating-point argument to a single-precision, floating-point value. n is an even octal number from 0 through 14 that designates a register (AC).

Routines Called
None

Calling Sequence
GSN.n is not called like most of the other routines in the library (see Section 1.4.1).
It is called by:

EXTEND n, GSN.n

Type of Argument
The argument must be a double-precision, G-floating-point value; it can be any such value. It must be stored in the AC specified in the routine name.

Type of Result
The result returned is a single-precision, floating-point value; it may be any such value. It is returned in the AC specified in the routine name.

Accuracy of Result
The result is exact to half a least significant bit because of rounding.

Algorithm Used
GSN.n(x) is calculated as follows. The routine tests the most significant bit of the low word of the magnitude of the argument. If it is 0, the high word is returned. If it is 1, the low bit of the high word of the magnitude is tested. If it is 0, it is made 1 and negated if necessary. If it is 1, the high word of the magnitude is incremented and negated if necessary.

Error Conditions
1. If overflow occurs, the result is set to machine infinity.
2. If underflow occurs, the result is set to 0.0.

DFLOAT

Description
The DFLOAT routine converts its integer argument to a double-precision, D-floating-point value.

Routines Called
None

Type of Argument
The argument must be an integer value; it can be any such value.

Type of Result
The result returned is a double-precision, D-floating-point value less than 2^35.

Accuracy of Result
The result is exact.

Algorithm Used
DFLOAT(n) is calculated by moving the value of the argument to the locations used by a double-precision result. See Chapter 1 for a discussion of the location of the result.

Error Conditions
None

DBLE

Description
The DBLE routine converts its single-precision, floating-point argument to a double-precision, D-floating-point value.

Routines Called
None

Type of Argument
The argument must be a single-precision, floating-point value; it can be any such value.
Type of Result
The result returned is a double-precision, D-floating-point value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
DBLE(x) is calculated by moving the value of the argument to the locations used by a double-precision result. (See Chapter 1 for a discussion of the location of the result.) The low-order word is set to 0.

Error Conditions
None

GTOD

Description
The GTOD routine converts its double-precision, G-floating-point argument to a double-precision, D-floating-point value.

Routines Called
GTOD calls the MTHERR routine.

Type of Argument
The argument must be a double-precision, G-floating-point value; it can be any such value.

Type of Result
The result returned is a double-precision, D-floating-point value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
GTOD(x) is calculated by converting the double-precision, G-floating-point value to double-precision, D-floating-point and setting the low-order three bits to 0.

Error Conditions
1. If the resulting exponent is too small to be represented as a double-precision, D-floating-point number, the following message is issued and the result is set to 0.0.

GTOD: Result underflow

2. If the resulting exponent is too large to be represented as a double-precision, D-floating-point number, the following message is issued and the result is set to +machine infinity.

GTOD: Result overflow

GTODA

Description
The GTODA subroutine converts an array of double-precision, G-floating-point values to an array of double-precision, D-floating-point values. It is called as:

GTODA(x,y,i)
x = input array
y = array used for result
i = number of elements to convert

Routines Called
GTODA calls the MTHERR routine.

Type of Arguments
GTODA is a subroutine that is called with three arguments. The first and second arguments must be double-precision arrays.
The third argument must be an integer value representing the number of elements to be converted. The first array (x) contains the input values; the second array (y) will contain the results. The input values must be double-precision, G-floating-point values; they can be any such values.

Type of Result
The result returned is an array of double-precision, D-floating-point values; they may be any such values. They are returned in the second array (y) supplied in the call.

Accuracy of Result
The result is exact for each value converted.

Algorithm Used
GTODA(x) is calculated as follows. Using the number specified in the third argument, GTODA converts each double-precision, G-floating-point value to a double-precision, D-floating-point value and sets the low-order three bits to 0. Each converted value is stored in the second array.

Error Conditions
1. For each resulting exponent that is too small to be represented as a double-precision, D-floating-point number, the following message is issued and the result is set to 0.0.

GTODA: Result underflow

2. For each resulting exponent that is too large to be represented as a double-precision, D-floating-point number, the following message is issued and the result is set to +machine infinity.

GTODA: Result overflow

GFL.n

Description
The GFL.n routine converts its integer argument to a double-precision, G-floating-point value. n is an even octal number from 0 through 14 that designates a register (AC).

Routines Called
None

Calling Sequence
GFL.n is not called like most of the routines in the library (see Section 1.4.1). It is called by:

EXTEND n, GFL.n

Type of Argument
The argument must be an integer value; it can be any such value. It must be stored in the AC specified in the routine name.

Type of Result
The result returned is a double-precision, G-floating-point value less than 2^35. It is returned in the AC specified in the routine name.
Accuracy of Result
The result is exact.

Algorithm Used
GFL.n(n) is calculated by moving the value of the argument to the locations used by a double-precision result (see Chapter 1).

Error Conditions
None

GDB.n

Description
The GDB.n routine converts its single-precision, floating-point argument to a double-precision, G-floating-point value. n is an even octal number from 0 through 14 that designates a register (AC).

Routines Called
None

Calling Sequence
GDB.n is not called like most of the routines in the library (see Section 1.4.1). It is called by:

EXTEND n, GDB.n

Type of Argument
The argument must be a single-precision, floating-point value; it can be any such value. It must be stored in the AC specified in the routine name.

Type of Result
The result returned is a double-precision, G-floating-point value; it may be any such value. It is returned in the AC specified in the routine name.

Accuracy of Result
The result is exact.

Algorithm Used
GDB.n(x) is calculated as follows. The routine uses the GDBLE machine instruction to convert the argument and move it to the locations used for double-precision results.

Error Conditions
None

DTOG

Description
The DTOG routine converts its double-precision, D-floating-point argument to a double-precision, G-floating-point value.

Routines Called
None

Type of Argument
The argument must be a double-precision, D-floating-point value; it can be any such value.

Type of Result
The result returned is a double-precision, G-floating-point value; it may be any such value.

Accuracy of Result
The result is rounded with an error bound of half a least significant bit.

Algorithm Used
DTOG(x) is calculated by converting the double-precision, D-floating-point value to a double-precision, G-floating-point value and rounding the converted value.
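GTOD is described earlier as clearing the low-order three bits of its result, which suggests that a D-floating-point fraction carries three more bits than a G-floating-point fraction. Under that assumption (the widths of 62 and 59 fraction bits used here are assumed, not stated in this section), the rounding step of DTOG could be sketched in Python as follows. The function name, the integer representation of the fraction, and the treatment of the exponent as a plain integer are all hypothetical; exponent re-biasing and sign handling are left out.

```python
D_FRACTION_BITS = 62   # assumed width of a D-floating-point fraction
G_FRACTION_BITS = 59   # assumed width of a G-floating-point fraction


def round_d_fraction_to_g(fraction: int, exponent: int):
    """Round a normalized 62-bit fraction to 59 bits, bumping the exponent on carry."""
    drop = D_FRACTION_BITS - G_FRACTION_BITS
    rounded = (fraction + (1 << (drop - 1))) >> drop   # add half an LSB, then truncate
    if rounded >> G_FRACTION_BITS:                     # rounding carried out of the top bit
        rounded >>= 1
        exponent += 1
    return rounded, exponent
```

Adding half of the discarded unit before truncating is what gives the error bound of half a least significant bit quoted under Accuracy of Result.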
Error Conditions
None

DTOGA

Description
The DTOGA subroutine converts an array of double-precision, D-floating-point values to an array of double-precision, G-floating-point values. It is called as:

DTOGA(x,y,i)
x = input array
y = array used for result
i = number of elements to convert

Routines Called
None

Type of Arguments
DTOGA is a subroutine that is called with three arguments. The first and second arguments must be double-precision arrays. The third argument must be an integer value representing the number of elements to be converted. The first array (x) contains the input values; the second array (y) will contain the result. The input values must be double-precision, D-floating-point values; they can be any such values.

Type of Result
The result returned is an array of double-precision, G-floating-point values; they may be any such values. They are returned in the second array (y) supplied in the call.

Accuracy of Result
Each element of the result is rounded with an error bound of half a least significant bit.

Algorithm Used
DTOGA(x) is calculated as follows. Using the number specified in the third argument, DTOGA converts each double-precision, D-floating-point value to a double-precision, G-floating-point value and rounds the converted value. Each converted value is stored in the second array.

Error Conditions
None

CMPL.I

Description
The CMPL.I routine converts its two integer arguments into a complex, single-precision, floating-point value.

Routines Called
None

Type of Arguments
Both arguments must be integer values; they can be any such values.

Type of Result
The result returned is a complex, single-precision, floating-point value; it may be any such value.

Accuracy of Result
The result is rounded with an error bound of half a least significant bit for each part (real and imaginary).

Algorithm Used
CMPL.I(n,m) is calculated as follows.
The two arguments are converted to single-precision, floating-point values using the FLTR machine instruction. These values are then moved to the locations where the result is stored as a complex value (see Chapter 1). The first argument is used as the real part of the complex number and the second argument as the imaginary part.

Error Conditions
None

CMPLX

Description
The CMPLX routine converts two single-precision arguments into one complex, single-precision, floating-point value.

Routines Called
None

Type of Arguments
Both arguments must be single-precision, floating-point values; they can be any such values.

Type of Result
The result returned is a complex, single-precision, floating-point value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
CMPLX(x,y) is calculated by moving the arguments to the locations used for a complex result (see Chapter 1). The first argument is used as the real part of the complex number and the second argument as the imaginary part.

Error Conditions
None

CMPL.D

Description
The CMPL.D routine converts its two double-precision, D-floating-point arguments into a complex, single-precision, floating-point value.

Routines Called
None

Type of Arguments
The arguments must be double-precision, D-floating-point values; they can be any such values.

Type of Result
The result returned is a complex, single-precision, floating-point value; it may be any such value.

Accuracy of Result
The result is accurate to half a least significant bit for each part because of rounding.

Algorithm Used
CMPL.D(x,y) is calculated by converting the arguments to single-precision and then moving them to the locations used for the real and imaginary parts of the complex result (see Chapter 1). The first argument is used as the real part of the complex number and the second argument as the imaginary part.
Error Conditions
If overflow occurs on the conversions, the result is set to machine infinity for either or both of the parts of the result.

CMPL.G

Description
The CMPL.G routine converts its two double-precision, G-floating-point arguments into a complex, single-precision, floating-point value.

Routines Called
None

Type of Arguments
The arguments must be double-precision, G-floating-point values; they can be any such values.

Type of Result
The result returned is a complex, single-precision, floating-point value; it may be any such value.

Accuracy of Result
The result is accurate to half a least significant bit for each part because of rounding.

Algorithm Used
CMPL.G(x,y) is calculated by converting the arguments to single-precision and then moving them to the locations used for the real and imaginary parts of the complex result (see Chapter 1). The first argument is used as the real part of the complex number and the second argument as the imaginary part.

Error Conditions
1. If overflow occurs on the conversions, the result is set to machine infinity for either or both of the parts of the result.
2. If underflow occurs on the conversions, the result is set to 0.0 for either or both parts of the result.

CMPL.C

Description
The CMPL.C routine creates a complex, single-precision, floating-point value from the real parts of two complex, single-precision, floating-point values.

Routines Called
None

Type of Arguments
The arguments must be complex, single-precision, floating-point values; they can be any such values.

Type of Result
The result returned is a complex, single-precision, floating-point value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
CMPL.C(z,g) is calculated by moving the arguments to the locations used for a complex result (see Chapter 1).
The first argument is used as the real part of the complex number and the second argument as the imaginary part.

Error Conditions
None

Chapter 11 Rounding and Truncation Routines

NINT

Description
The NINT routine rounds its single-precision, floating-point argument to the nearest integer.

Routines Called
NINT calls the MTHERR routine.

Type of Argument
The argument must be a single-precision, floating-point value; it can be any such value.

Type of Result
The result returned is an integer value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
NINT(x) is calculated as follows.

Let j = INT(|x| + .5)

If j < 2^35 and
    If x ≥ 0.0    NINT(x) = j
    If x < 0.0    NINT(x) = -j
If j = 2^35 and
    If x < 0.0    NINT(x) = -j
Otherwise, overflow occurs and
    If x > 0.0    NINT(x) = 2^35 - 1
    If x < 0.0    NINT(x) = -2^35

Error Conditions
If x is greater than or equal to 2^35 or less than -2^35, the result overflows. When overflow occurs, the following message is issued and the result is set to +machine infinity if x is greater than 0.0 or to -machine infinity if x is less than 0.0.

    NINT: Result overflow

IDNINT

Description
The IDNINT routine rounds its double-precision, D-floating-point argument to the nearest integer.

Routines Called
IDNINT calls the MTHERR routine.

Type of Argument
The argument must be a double-precision, D-floating-point value; it can be any such value.

Type of Result
The result returned is an integer value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
IDNINT(x) is calculated as follows.

Let j = INT(|x| + .5)

If j < 2^35 and
    If x ≥ 0.0    IDNINT(x) = j
    If x < 0.0    IDNINT(x) = -j
If j = 2^35 and
    If x < 0.0    IDNINT(x) = -j
Otherwise, overflow occurs and
    If x > 0.0    IDNINT(x) = 2^35 - 1
    If x < 0.0    IDNINT(x) = -2^35

Error Conditions
If x is greater than or equal to 2^35 or less than -2^35, the result overflows.
When overflow occurs, the following message is issued and the result is set to +machine infinity if x is greater than 0.0 or to -machine infinity if x is less than 0.0.

    IDNINT: Result overflow

IGNIN.

Description
The IGNIN. routine rounds its double-precision, G-floating-point argument to the nearest integer.

Routines Called
IGNIN. calls the MTHERR routine.

Type of Argument
The argument must be a double-precision, G-floating-point value; it can be any such value.

Type of Result
The result returned is an integer value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
IGNIN.(x) is calculated as follows.

Let j = INT(|x| + .5)

If j < 2^35 and
    If x ≥ 0.0    IGNIN.(x) = j
    If x < 0.0    IGNIN.(x) = -j
If j = 2^35 and
    If x < 0.0    IGNIN.(x) = -j
Otherwise, overflow occurs and
    If x > 0.0    IGNIN.(x) = 2^35 - 1
    If x < 0.0    IGNIN.(x) = -2^35

Error Conditions
If x is greater than or equal to 2^35 or less than -2^35, the result overflows. When overflow occurs, the following message is issued and the result is set to +machine infinity if x is greater than 0.0 or to -machine infinity if x is less than 0.0.

    IGNIN.: Result overflow

ANINT

Description
The ANINT routine rounds its single-precision, floating-point argument to the nearest single-precision, floating-point whole number.

Routines Called
None

Type of Argument
The argument must be a single-precision, floating-point value; it can be any such value.

Type of Result
The result returned is a single-precision, floating-point whole value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
ANINT(x) is calculated as follows.
If |x| ≥ 2^26    ANINT(x) = x because x is an integer
If |x| < 2^26
    If x > 0.0    ANINT(x) = ((|x| + 2^26) rounded) - 2^26
    If x < 0.0    ANINT(x) = -(((|x| + 2^26) rounded) - 2^26)

Error Conditions
None

DNINT

Description
The DNINT routine rounds its double-precision, D-floating-point argument to the nearest double-precision, D-floating-point whole number.

Routines Called
None

Type of Argument
The argument must be a double-precision, D-floating-point value; it can be any such value.

Type of Result
The result returned is a double-precision, D-floating-point whole value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
DNINT(x) is calculated as follows.

If |x| ≥ 2^61    DNINT(x) = x because x is an integer
If |x| < 2^61
    If x > 0.0    DNINT(x) = ((|x| + 2^61) rounded) - 2^61
    If x < 0.0    DNINT(x) = -(((|x| + 2^61) rounded) - 2^61)

Error Conditions
None

GNINT.

Description
The GNINT. routine rounds its double-precision, G-floating-point argument to the nearest double-precision, G-floating-point whole number.

Routines Called
None

Type of Argument
The argument must be a double-precision, G-floating-point value; it can be any such value.

Type of Result
The result returned is a double-precision, G-floating-point whole value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
GNINT.(x) is calculated as follows.

If |x| ≥ 2^58    GNINT.(x) = x because x is an integer
If |x| < 2^58
    If x > 0.0    GNINT.(x) = ((|x| + 2^58) rounded) - 2^58
    If x < 0.0    GNINT.(x) = -(((|x| + 2^58) rounded) - 2^58)

Error Conditions
None

AINT

Description
The AINT routine truncates its single-precision, floating-point argument to a single-precision, floating-point whole number.

Routines Called
None

Type of Argument
The argument must be a single-precision, floating-point value; it can be any such value.
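The add-and-subtract step that ANINT, DNINT, and GNINT. use for rounding can be tried out in Python. This is only a sketch under stated assumptions: Python floats are IEEE binary64, so the constant 2^52 stands in for the 2^26, 2^61, and 2^58 constants of the PDP-10 formats, and ties round to even here, which may differ from the library's rounding. The name `anint_sketch` is hypothetical.

```python
import math

BIG = 2.0 ** 52  # at or above this magnitude every binary64 value is a whole number

def anint_sketch(x: float) -> float:
    """Round x to the nearest whole number by adding and subtracting a large constant."""
    if abs(x) >= BIG:
        return x  # already a whole number
    # Between 2^52 and 2^53 the spacing of binary64 values is exactly 1.0, so the
    # sum is rounded to a whole number; subtracting BIG leaves |x| rounded.
    magnitude = (abs(x) + BIG) - BIG
    return math.copysign(magnitude, x)

for value in (2.3, -2.7, 0.4, 1e20):
    print(value, anint_sketch(value))
```

The constant is chosen so that the sum has no fraction bits left; that is why each routine uses a different power of two for its own format.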
Type of Result
The result returned is a single-precision, floating-point whole value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
AINT(x) is calculated as follows.

If |x| ≥ 2^26    AINT(x) = x because x is an integer
If |x| < 2^26
    If x > 0.0    AINT(x) = ((|x| + 2^26) truncated) - 2^26
    If x < 0.0    AINT(x) = -(((|x| + 2^26) truncated) - 2^26)

Error Conditions
None

DINT

Description
The DINT routine truncates its double-precision, D-floating-point argument to a double-precision, D-floating-point whole number.

Routines Called
None

Type of Argument
The argument must be a double-precision, D-floating-point value; it can be any such value.

Type of Result
The result returned is a double-precision, D-floating-point whole value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
DINT(x) is calculated as follows.

If |x| ≥ 2^61    DINT(x) = x because x is an integer
If |x| < 1.0     DINT(x) = 0.0
Otherwise        DINT(x) = sgn(x)·(|x| with fraction bits replaced by zeroes)

Error Conditions
None

GINT.

Description
The GINT. routine truncates its double-precision, G-floating-point argument to a double-precision, G-floating-point whole number.

Routines Called
None

Type of Argument
The argument must be a double-precision, G-floating-point value; it can be any such value.

Type of Result
The result returned is a double-precision, G-floating-point whole value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
GINT.(x) is calculated as follows.
If |x| ≥ 2^58    GINT.(x) = x because x is an integer
If |x| < 1.0     GINT.(x) = 0.0
Otherwise        GINT.(x) = sgn(x)·(|x| with fraction bits replaced by zeroes)

Error Conditions
None

Chapter 12 Product, Remainder, and Positive Difference Routines

DPROD

Description
The DPROD routine multiplies two single-precision, floating-point numbers and returns a double-precision, D-floating-point product. That is:

    DPROD(x,y) = x·y

Routines Called
DPROD calls the MTHERR routine.

Type of Arguments
Both arguments must be single-precision, floating-point values; they can be any such values.

Type of Result
The result returned is a double-precision, D-floating-point value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
DPROD(x,y) is calculated as follows.

Let x = DBLE(x)
    y = DBLE(y)
DPROD(x,y) = x·y

Error Conditions
1. If overflow occurs, the following message is issued and the result is set to ±machine infinity.

    DPROD: Result overflow

2. If underflow occurs, the following message is issued and the result is set to 0.0.

    DPROD: Result underflow

GPROD.

Description
The GPROD. routine multiplies two single-precision, floating-point numbers and returns a double-precision, G-floating-point product. That is:

    GPROD.(x,y) = x·y

Routines Called
GPROD. calls the MTHERR routine.

Type of Arguments
Both arguments must be single-precision, floating-point values; they can be any such values.

Type of Result
The result returned is a double-precision, G-floating-point value; it may be any such value.

Accuracy of Result
The result is exact.

Algorithm Used
GPROD.(x,y) is calculated as follows.

Let x = GDB.O(x)
    y = GDB.O(y)
GPROD.(x,y) = x·y

Error Conditions
None

MOD

Description
The MOD routine returns the integer remainder of the quotient of its integer arguments.
That is:

    MOD(i,j) = i - [i/j]·j

Routines Called
None

Type of Arguments
Both arguments must be integer; the second argument cannot equal zero. If the first argument is negative, the result is negative.

Type of Result
The result returned is an integer value in the range -|j| to |j|.

Accuracy of Result
The result is exact.

Algorithm Used
MOD(i,j) is calculated as follows.

MOD(i,j) = (|i| - [|i|/j]·j)·sgn(i)
[|i|/j] = the greatest integer in |i|/j

Error Conditions
None

AMOD

Description
The AMOD routine returns the single-precision, floating-point remainder of the quotient of its single-precision, floating-point arguments. That is:

    AMOD(x,y) = x - [x/y]·y

Routines Called
AMOD calls the MTHERR routine.

Type of Arguments
Both arguments must be single-precision, floating-point values; the second argument cannot equal zero. If the first argument is negative, the result will be negative.

Type of Result
The result returned is a single-precision, floating-point value in the range -|y| to |y|.

Accuracy of Result
The result is exact.

Algorithm Used
AMOD(x,y) is calculated as follows.

AMOD(x,y) = (|x| - [|x|/y]·y)·sgn(x)
[|x|/y] = largest integer in |x|/y

Error Conditions
Underflow may occur if y is too small a number. If underflow occurs, the following message is issued and the result is set to 0.0.

    AMOD: Result underflow

DMOD

Description
The DMOD routine returns the double-precision, D-floating-point remainder of the quotient of its double-precision, D-floating-point arguments. That is:

    DMOD(x,y) = x - [x/y]·y

Routines Called
DMOD calls the MTHERR routine.

Type of Arguments
Both arguments must be double-precision, D-floating-point values; the second argument cannot equal zero. If the first argument is negative, the result will be negative.

Type of Result
The result returned is a double-precision, D-floating-point value in the range -|y| to |y|.
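The remainder formula shared by MOD, AMOD, and DMOD takes the whole part of the quotient toward zero, so a nonzero result has the sign of the first argument. Python's `%` operator follows the sign of the divisor instead, so a sketch has to spell the formula out. The names `mod_trunc` and `amod_trunc` are hypothetical; `math.fmod` is used as the floating-point stand-in because it also returns a result with the sign of its first argument.

```python
import math

def mod_trunc(i: int, j: int) -> int:
    """Integer remainder i - [i/j]*j with the quotient truncated toward zero."""
    quotient = abs(i) // abs(j)        # whole part of |i| / |j|
    if (i < 0) != (j < 0):
        quotient = -quotient           # restore the sign of i / j
    return i - quotient * j

def amod_trunc(x: float, y: float) -> float:
    """Floating-point analogue; the result has the sign of x."""
    return math.fmod(x, y)

print(mod_trunc(-7, 3), (-7) % 3)      # truncated remainder vs. Python's floored remainder
print(amod_trunc(-7.5, 2.0))
```

As in the routines themselves, the second argument must not be zero; the sketch raises an exception in that case.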
Accuracy of Result
The result is exact.

Algorithm Used
DMOD(x,y) is calculated as follows.

DMOD(x,y) = (|x| - [|x|/y]·y)·sgn(x)
[|x|/y] = largest integer in |x|/y

Error Conditions
Underflow may occur if y is too small a number. If underflow occurs, the following message is issued and the result is set to 0.0.

    DMOD: Result underflow

GMOD

Description
The GMOD routine returns the double-precision, G-floating-point remainder of the quotient of its double-precision, G-floating-point arguments. That is:

    GMOD(x,y) = x - [x/y]·y

Routines Called
GMOD calls the MTHERR routine.

Type of Arguments
Both arguments must be double-precision, G-floating-point values; the second argument cannot equal zero. If the first argument is negative, the result will be negative.

Type of Result
The result returned is a double-precision, G-floating-point value in the range -|y| to |y|.

Accuracy of Result
The result is exact.

Algorithm Used
GMOD(x,y) is calculated as follows.

GMOD(x,y) = (|x| - [|x|/y]·y)·sgn(x)
[|x|/y] = largest integer in |x|/y

Error Conditions
Underflow may occur if y is too small a number. If underflow occurs, the following message is issued and the result is set to 0.0.

    GMOD: Result underflow

IDIM

Description
The IDIM routine returns the integer difference between its integer arguments, provided that the difference is positive. If the difference is negative, IDIM returns zero. That is:

    IDIM(i,j) = i-j

Routines Called
IDIM calls the MTHERR routine.

Type of Arguments
Both arguments must be integer values; they can be any such values.

Type of Result
The result returned is an integer value greater than or equal to 0.

Accuracy of Result
The result is exact.

Algorithm Used
IDIM is calculated as follows.
If i ≤ j    IDIM(i,j) = 0
If i > j    IDIM(i,j) = i-j

Error Conditions
If overflow occurs during subtraction, the following message is issued and the result is set to machine infinity.

    IDIM: Result overflow

DIM

Description
The DIM routine returns the single-precision, floating-point difference between its single-precision, floating-point arguments, provided that the difference is positive. If the difference is negative, DIM returns zero. That is:

    DIM(x,y) = x-y

Routines Called
DIM calls the MTHERR routine.

Type of Arguments
Both arguments must be single-precision, floating-point values; they can be any such values.

Type of Result
The result returned is a single-precision, floating-point value greater than or equal to 0.0.

Accuracy of Result
The result is rounded with an error bound of half a least significant bit.

Algorithm Used
DIM(x,y) is calculated as follows.

If x ≤ y    DIM(x,y) = 0.0
If x > y    DIM(x,y) = x-y

Error Conditions
1. If overflow occurs during subtraction, the following message is issued and the result is set to machine infinity.

    DIM: Result overflow

2. If underflow occurs during subtraction, the following message is issued and the result is set to 0.0.

    DIM: Result underflow

DDIM

Description
The DDIM routine returns the double-precision, D-floating-point difference between its double-precision, D-floating-point arguments, provided that the difference is positive. If the difference is negative, DDIM returns zero. That is:

    DDIM(x,y) = x-y

Routines Called
DDIM calls the MTHERR routine.

Type of Arguments
Both arguments must be double-precision, D-floating-point values; they can be any such values.

Type of Result
The result returned is a double-precision, D-floating-point value greater than or equal to 0.0.

Accuracy of Result
The result is rounded with an error bound of half a least significant bit.

Algorithm Used
DDIM(x,y) is calculated as follows.
If x ≤ y    DDIM(x,y) = 0.0
If x > y    DDIM(x,y) = x-y

Error Conditions
1. If overflow occurs during subtraction, the following message is issued and the result is set to machine infinity.

    DDIM: Result overflow

2. If underflow occurs during subtraction, the following message is issued and the result is set to 0.0.

    DDIM: Result underflow

GDIM

Description
The GDIM routine returns the double-precision, G-floating-point difference between its double-precision, G-floating-point arguments, provided that the difference is positive. If the difference is negative, GDIM returns zero. That is:

    GDIM(x,y) = x-y

Routines Called
GDIM calls the MTHERR routine.

Type of Arguments
Both arguments must be double-precision, G-floating-point values; they can be any such values.

Type of Result
The result returned is a double-precision, G-floating-point value greater than or equal to 0.0.

Accuracy of Result
The result is rounded with an error bound of half a least significant bit.

Algorithm Used
GDIM(x,y) is calculated as follows.

If x ≤ y    GDIM(x,y) = 0.0
If x > y    GDIM(x,y) = x-y

Error Conditions
1. If overflow occurs during subtraction, the following message is issued and the result is set to machine infinity.

    GDIM: Result overflow

2. If underflow occurs during subtraction, the following message is issued and the result is set to 0.0.

    GDIM: Result underflow

Chapter 13 Transfer of Sign Routines

ISIGN

Description
The ISIGN routine transfers the sign of its integer second argument to its integer first argument, ignoring the sign of the first argument. That is:

    ISIGN(i,j) = |i|·sgn(j)

Routines Called
ISIGN calls the MTHERR routine.

Type of Arguments
Both arguments must be integer values; they can be any such values.

Type of Result
The result returned is an integer value; it has the same magnitude as the first argument.

Accuracy of Result
The result is exact.
Algorithm Used
ISIGN(i,j) is calculated as follows.

ISIGN(i,j) = |i|·sgn(j)
If j ≥ 0    ISIGN(i,j) = |i|
If j < 0    ISIGN(i,j) = -|i|

Error Conditions
If i = -2^35 and j > 0, overflow occurs. If overflow occurs, the following message is issued and the result is set to machine infinity.

    ISIGN: Result overflow

SIGN

Description
The SIGN routine transfers the sign of its single-precision, floating-point second argument to its single-precision, floating-point first argument, ignoring the sign of the first argument. That is:

    SIGN(x,y) = |x|·sgn(y)

Routines Called
None

Type of Arguments
Both arguments must be single-precision, floating-point values; they can be any such values.

Type of Result
The result returned is a single-precision, floating-point value; it has the same magnitude as the first argument.

Accuracy of Result
The result is exact.

Algorithm Used
SIGN(x,y) is calculated as follows.

SIGN(x,y) = |x|·sgn(y)
If y ≥ 0.0    SIGN(x,y) = |x|
If y < 0.0    SIGN(x,y) = -|x|

Error Conditions
None

DSIGN

Description
The DSIGN routine transfers the sign of its double-precision, D-floating-point second argument to its double-precision, D-floating-point first argument, ignoring the sign of the first argument. That is:

    DSIGN(x,y) = |x|·sgn(y)

Routines Called
None

Type of Arguments
Both arguments must be double-precision, D-floating-point values; they can be any such values.

Type of Result
The result returned is a double-precision, D-floating-point value; it has the same magnitude as the first argument.

Accuracy of Result
The result is exact.

Algorithm Used
DSIGN(x,y) is calculated as follows.
DSIGN(x,y) = |x|·sgn(y)
If y ≥ 0.0    DSIGN(x,y) = |x|
If y < 0.0    DSIGN(x,y) = -|x|

Error Conditions
None

GSIGN

Description
The GSIGN routine transfers the sign of its double-precision, G-floating-point second argument to its double-precision, G-floating-point first argument, ignoring the sign of the first argument. That is:

    GSIGN(x,y) = |x|·sgn(y)

Routines Called
None

Type of Arguments
Both arguments must be double-precision, G-floating-point values; they can be any such values.

Type of Result
The result returned is a double-precision, G-floating-point value; it has the same magnitude as the first argument.

Accuracy of Result
The result is exact.

Algorithm Used
GSIGN(x,y) is calculated as follows.

GSIGN(x,y) = |x|·sgn(y)
If y ≥ 0.0    GSIGN(x,y) = |x|
If y < 0.0    GSIGN(x,y) = -|x|

Error Conditions
None

Chapter 14 Maximum/Minimum Routines

MAX0

Description
The MAX0 routine finds the integer maximum of a series of integer arguments.

Routines Called
None

Type of Arguments
All the arguments must be integer values; they can be any such values. There can be as many arguments as desired.

Type of Result
The result returned is an integer value; it is the largest value in the series.

Accuracy of Result
The result is exact.

Algorithm Used
MAX0(i, ... j) is calculated as follows. The MAX0 routine compares each argument in succession with the current largest argument, which is held in a register. Each time an argument exceeds the current largest argument, the register is updated. This loop continues until the final argument is processed. The contents of the register are then returned as the result.

Error Conditions
None

MAX1

Description
The MAX1 routine finds the integer maximum of a series of single-precision, floating-point arguments.
Routines Called
None

Type of Arguments
All the arguments must be single-precision, floating-point values; they can be any such values. There can be as many arguments as desired.

Type of Result
The result returned is the largest value in the series converted to integer format.

Accuracy of Result
The result is exact except for possible overflow during the conversion to integer.

Algorithm Used
MAX1(x, ... y) is calculated as follows. The MAX1 routine compares each argument in succession with the current largest argument, which is held in a register. Each time an argument exceeds the current largest argument, the register is updated. This loop continues until the final argument is processed. The contents of the register are then converted to integer format and returned as the result.

Error Conditions
Overflow can occur during conversion to integer. If overflow occurs, the result is set to ±machine infinity.

AMAX0

Description
The AMAX0 routine finds the single-precision, floating-point maximum of a series of integer arguments.

Routines Called
None

Type of Arguments
All the arguments must be integer; they can be any such values. There can be as many arguments as desired.

Type of Result
The result returned is the largest value in the series converted to single-precision, floating-point format.

Accuracy of Result
The result is exact unless a rounding error occurs during conversion, in which case the error could be half a least significant bit.

Algorithm Used
AMAX0(i, ... j) is calculated as follows. The AMAX0 routine compares each argument in succession with the current largest argument, which is held in a register. Each time an argument exceeds the current largest argument, the register is updated. This loop continues until the final argument is processed. The contents of the register are then converted to single-precision, floating-point format and returned as the result.
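The scan described for the maximum routines is a running-maximum loop, followed in the mixed-type routines by a conversion of the final value. A minimal Python sketch, in which a local variable stands in for the register and the function names `running_max` and `amax0_sketch` are hypothetical:

```python
def running_max(first, *rest):
    """Keep the largest value seen so far in `largest` (the 'register')."""
    largest = first
    for argument in rest:
        if argument > largest:   # update only when the argument exceeds the current largest
            largest = argument
    return largest

def amax0_sketch(first: int, *rest: int) -> float:
    """Integer arguments, floating-point result: convert once, after the scan."""
    return float(running_max(first, *rest))

print(amax0_sketch(3, -8, 12, 5))
```

Converting once at the end, rather than converting every argument first, means at most one rounding step affects the result, which is consistent with the accuracy statement above.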
Error Conditions
None

AMAX1

Description
The AMAX1 routine finds the single-precision, floating-point maximum of a series of single-precision, floating-point arguments.

Routines Called
None

Type of Arguments
All the arguments must be single-precision, floating-point values; they can be any such values. There can be as many arguments as desired.

Type of Result
The result returned is a single-precision, floating-point value; it is the largest value in the series.

Accuracy of Result
The result is exact.

Algorithm Used
AMAX1(x, ... y) is calculated as follows. The AMAX1 routine compares each argument in succession with the current largest argument, which is held in a register. Each time an argument exceeds the current largest argument, the register is updated. This loop continues until the final argument is processed. The contents of the register are then returned as the result.

Error Conditions
None

DMAX1

Description
The DMAX1 routine finds the double-precision, D-floating-point maximum of a series of double-precision, D-floating-point arguments.

Routines Called
None

Type of Arguments
All the arguments must be double-precision, D-floating-point values; they can be any such values. There can be as many arguments as desired.

Type of Result
The result returned is a double-precision, D-floating-point value; it is the largest value in the series.

Accuracy of Result
The result is exact.

Algorithm Used
DMAX1(x, ... y) is calculated as follows. The DMAX1 routine compares each argument in succession with the current largest argument, which is held in two registers. Each time an argument exceeds the current largest argument, the registers are updated. This loop continues until the final argument is processed. The contents of the registers are then returned as the result.
Error Conditions
None

GMAX1

Description
The GMAX1 routine finds the double-precision, G-floating-point maximum of a series of double-precision, G-floating-point arguments.

Routines Called
None

Type of Arguments
All the arguments must be double-precision, G-floating-point values; they can be any such values. There can be as many arguments as desired.

Type of Result
The result returned is a double-precision, G-floating-point value; it is the largest value in the series.

Accuracy of Result
The result is exact.

Algorithm Used
GMAX1(x, ... y) is calculated as follows. The GMAX1 routine compares each argument in succession with the current largest argument, which is held in two registers. Each time an argument exceeds the current largest argument, the registers are updated. This loop continues until the final argument is processed. The contents of the registers are then returned as the result.

Error Conditions
None

MIN0

Description
The MIN0 routine finds the integer minimum of a series of integer arguments.

Routines Called
None

Type of Arguments
All the arguments must be integer values; they can be any such values. There can be as many arguments as desired.

Type of Result
The result returned is an integer value; it is the smallest value in the series.

Accuracy of Result
The result is exact.

Algorithm Used
MIN0(i, ... j) is calculated as follows. The MIN0 routine compares each argument in succession with the current smallest argument, which is held in a register. Each time an argument is less than the current smallest argument, the register is updated. This loop continues until the final argument is processed. The contents of the register are then returned as the result.

Error Conditions
None

MIN1

Description
The MIN1 routine finds the integer minimum of a series of single-precision, floating-point arguments.
Routines Called
None

Type of Arguments
All the arguments must be single-precision, floating-point values; they can be any such values. There can be as many arguments as desired.

Type of Result
The result returned is the smallest value in the series converted to integer format.

Accuracy of Result
The result is exact except for possible overflow during the conversion to integer.

Algorithm Used
MIN1(x, ... y) is calculated as follows. The MIN1 routine compares each argument in succession with the current smallest argument, which is held in a register. Each time an argument is smaller than the current smallest argument, the register is updated. This loop continues until the final argument is processed. The contents of the register are then converted to integer and returned as the result.

Error Conditions
Overflow can occur during conversion to integer. If overflow occurs, the result is set to ±machine infinity.

AMIN0

Description
The AMIN0 routine finds the single-precision, floating-point minimum of a series of integer arguments.

Routines Called
None

Type of Arguments
All the arguments must be integer; they can be any such values. There can be as many arguments as desired.

Type of Result
The result returned is the smallest value in the series converted to single-precision, floating-point format.

Accuracy of Result
The result is exact unless a rounding error occurs during conversion, in which case the error could be half a least significant bit.

Algorithm Used
AMIN0(i, ... j) is calculated as follows. The AMIN0 routine compares each argument in succession with the current smallest argument, which is held in a register. Each time an argument is smaller than the current smallest argument, the register is updated. This loop continues until the final argument is processed. The contents of the register are then converted to single-precision, floating-point format and returned as the result.
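For the routine that returns an integer from floating-point arguments, the scan is followed by a conversion that can overflow. The Python sketch below makes three assumptions that this chapter does not spell out: the integer is a 36-bit two's-complement value (range -2^35 to 2^35 - 1, as in the NINT description), the conversion truncates toward zero, and "machine infinity" is the nearest end of that range. The name `min1_sketch` is hypothetical.

```python
INT_MAX = 2 ** 35 - 1    # largest 36-bit two's-complement integer (assumption)
INT_MIN = -(2 ** 35)     # smallest 36-bit two's-complement integer (assumption)

def min1_sketch(first: float, *rest: float) -> int:
    """Scan for the smallest argument, then convert it to a clamped integer."""
    smallest = first
    for argument in rest:
        if argument < smallest:   # update only when the argument is smaller
            smallest = argument
    truncated = int(smallest)     # int() truncates toward zero
    # On overflow, clamp to the nearest end of the integer range.
    return max(INT_MIN, min(INT_MAX, truncated))

print(min1_sketch(2.9, -1.7, 4.0))
```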
Error Conditions
None

AMIN1

Description
The AMIN1 routine finds the single-precision, floating-point minimum of a series of single-precision, floating-point arguments.

Routines Called
None

Type of Arguments
All the arguments must be single-precision, floating-point values; they can be any such values. There can be as many arguments as desired.

Type of Result
The result returned is a single-precision, floating-point value; it is the smallest value in the series.

Accuracy of Result
The result is exact.

Algorithm Used
AMIN1(x, ... y) is calculated as follows. The AMIN1 routine compares each argument in succession with the current smallest argument, which is held in a register. Each time an argument is smaller than the current smallest argument, the register is updated. This loop continues until the final argument is processed. The contents of the register are then returned as the result.

Error Conditions
None

DMIN1

Description
The DMIN1 routine finds the double-precision, D-floating-point minimum of a series of double-precision, D-floating-point arguments.

Routines Called
None

Type of Arguments
All the arguments must be double-precision, D-floating-point values; they can be any such values. There can be as many arguments as desired.

Type of Result
The result returned is a double-precision, D-floating-point value; it is the smallest value in the series.

Accuracy of Result
The result is exact.

Algorithm Used
DMIN1(x, ... y) is calculated as follows. The DMIN1 routine compares each argument in succession with the current smallest argument, which is held in two registers. Each time an argument is less than the current smallest argument, the registers are updated. This loop continues until the final argument is processed. The contents of the registers are then returned as the result.
Error Conditions
None

GMIN1

Description
The GMIN1 routine finds the double-precision, G-floating-point minimum of a series of double-precision, G-floating-point arguments.

Routines Called
None

Type of Arguments
All the arguments must be double-precision, G-floating-point values; they can be any such values. There can be as many arguments as desired.

Type of Result
The result returned is a double-precision, G-floating-point value; it is the smallest value in the series.

Accuracy of Result
The result is exact.

Algorithm Used
GMIN1(x, ... y) is calculated as follows. The GMIN1 routine compares each argument in succession with the current smallest argument, which is held in two registers. Each time an argument is less than the current smallest argument, the registers are updated. This loop continues until the final argument is processed. The contents of the registers are then returned as the result.

Error Conditions
None

Chapter 15 Miscellaneous Complex Routines

REAL.C

Description
The REAL.C routine returns the real part of a complex number. That is:

    REAL.C(z) = REAL.C(x+i·y) = x

Routines Called
None

Type of Argument
The argument must be a complex value; it can be any such value.

Type of Result
The result returned is a single-precision, floating-point value.

Accuracy of Result
The result is exact.

Algorithm Used
REAL.C(z) is calculated by copying the real part of the argument to the return location.

Error Conditions
None

AIMAG

Description
The AIMAG routine returns the imaginary part of a complex number. That is:

    AIMAG(z) = AIMAG(x+i·y) = y

Routines Called
None

Type of Argument
The argument must be a complex value; it can be any such value.

Type of Result
The result returned is a single-precision, floating-point value; it is the imaginary part of the number.

Accuracy of Result
The result is exact.
Algorithm Used
AIMAG(z) is calculated by copying the imaginary part of the argument to the return location.

Error Conditions
None

CONJ

Description
The CONJ routine finds the conjugate of a complex number. That is:
CONJ(z) = conj(x+i·y) = x-i·y

Routines Called
None

Type of Argument
The argument must be a complex value; it can be any such value.

Type of Result
The result returned is a complex value; it is the conjugate of the argument value.

Accuracy of Result
The result is exact.

Algorithm Used
CONJ(z) is calculated as follows.
Let z = x+i·y
conj(x+i·y) = x+(-i·y)
CONJ(z) = x-i·y

Error Conditions
None

CFM

Description
The CFM subroutine finds the complex, single-precision, floating-point product of two complex, single-precision, floating-point values. That is:
CFM(z,g) = z·g

Routines Called
CFM calls the MTHERR routine.

Type of Arguments
CFM is a subroutine with two arguments; both must be complex, single-precision, floating-point values. They can be any such values.

Type of Result
The result returned is a complex, single-precision, floating-point value.

Accuracy of Result
test interval: -10000. through 10000. for z (real)
               -10000. through 10000. for z (imaginary)
               -10000. through 10000. for g (real)
               -10000. through 10000. for g (imaginary)
MRE: 1.20x10^-5 (16.4 bits) real
     1.47x10^-6 (19.4 bits) imaginary
RMS: 2.64x10^-7 (21.9 bits) real
     5.81x10^-8 (24.0 bits) imaginary
LSB error distribution:
    -4+   -3   -2   -1    0   +1   +2   +3  +4+
     2%   1%   1%  14%  64%  15%   1%   1%   2%   real
     1%   1%   1%  15%  64%  14%   1%   1%   2%   imaginary

Algorithm Used
CFM(z,g) is calculated as follows.
Let z = a+i·b
Let g = c+i·d
Then
CFM(z,g) = (a+i·b)·(c+i·d)
CFM(z,g) = (a·c-b·d)+i·(b·c+a·d)

Error Conditions
1. If either part of the result overflows, the following message is issued and that part of the result is set to machine infinity.
   CMATH: Complex overflow
2. If either part of the result underflows, the following message is issued and that part of the result is set to 0.0.
   CMATH: Complex underflow

CFDV

Description
The CFDV subroutine finds the complex, single-precision, floating-point quotient of two complex, single-precision, floating-point values. That is:
CFDV(z,g) = z/g

Routines Called
CFDV calls the MTHERR routine.

Type of Arguments
CFDV is a subroutine with two arguments; both must be complex, single-precision, floating-point values. They can be any such values.

Type of Result
The result returned is a complex, single-precision, floating-point value; it may be any such value.

Accuracy of Result
test interval: -10000. through 10000. for z (real)
               -10000. through 10000. for z (imaginary)
               -10000. through 10000. for g (real)
               -10000. through 10000. for g (imaginary)
MRE: 2.87x10^-7 (21.7 bits) real
     7.60x10^-7 (20.3 bits) imaginary
RMS: 1.33x10^-8 (26.2 bits) real
     2.30x10^-8 (25.4 bits) imaginary
LSB error distribution:
    -4+   -3   -2   -1    0   +1   +2   +3  +4+
     1%   1%   3%  22%  49%  21%   2%   0%   1%   real
     1%   1%   3%  21%  50%  20%   3%   1%   1%   imaginary

Algorithm Used
CFDV(z,g) is calculated as follows.
Let z = a+i·b
Let g = c+i·d
Then
CFDV(z,g) = (a+i·b)/(c+i·d)
CFDV(z,g) = ((a·c+b·d)+i·(b·c-a·d))/(c^2+d^2)

Error Conditions
1. If either part of the result underflows, the following message is issued and that part of the result is set to 0.0.
   CMATH: Complex underflow
2. If either part of the result overflows, that part of the result is set to machine infinity.

Appendix A
ELEFUNT Test Results

This appendix contains the results of the ELEFUNT tests of W. J. Cody, Argonne National Laboratory. For each test, the test interval, maximum relative error (MRE), and root mean square (RMS) relative error are given. Note that it is not meaningful to compare these test results with the test results given for each routine under the heading "Accuracy of Result."
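The two statistics named above, and the bit counts printed beside them, can be illustrated with a short Python sketch. The helper names `relative_error_stats` and `error_to_bits` are hypothetical, and this is not ELEFUNT's own code. The assumption that a bit count is the negative base-2 logarithm of the relative error is inferred from the printed figures (an MRE of about 1.231e-8 is listed as 26.3 bits), not stated in the manual.

```python
import math


def relative_error_stats(computed, reference):
    """Return (MRE, RMS) of the relative errors of paired samples.

    Hypothetical helper for illustration; reference values must be nonzero.
    """
    rel = [abs((c - r) / r) for c, r in zip(computed, reference)]
    mre = max(rel)                                        # maximum relative error
    rms = math.sqrt(sum(e * e for e in rel) / len(rel))   # root mean square relative error
    return mre, rms


def error_to_bits(err):
    """Express a relative error as a count of correct bits (assumed: -log2)."""
    return -math.log2(err)


# Check the assumption against a printed pair: MRE of about 1.231e-8.
print(round(error_to_bits(0.1231e-7), 1))
```

Under this reading, a larger bit count means a smaller error, so the RMS figure in each entry carries more bits than the MRE figure.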
ACOS(x) vs Taylor Series
    test interval: -1.0000 through -0.7500
    MRE: 0.1231x10^-7 (26.3 bits)
    RMS: 0.2868x10^-8 (28.4 bits)

ACOS(x) vs Taylor Series
    test interval: 0.7500 through 1.0000
    MRE: 0.1488x10^-7 (26.0 bits)
    RMS: 0.1330x10^-8 (29.5 bits)

ACOS(x) vs Taylor Series
    test interval: -0.1250 through 0.1250
    MRE: 0.1030x10^-7 (26.5 bits)
    RMS: 0.2647x10^-8 (28.5 bits)

ALOG(x·x) vs 2·log_e x
    test interval: 0.1600x10^2 through 0.2400x10^3
    MRE: 0.1466x10^-7 (26.0 bits)
    RMS: 0.2292x10^-8 (28.7 bits)

ALOG(x) vs Taylor Series expansion of ALOG(1+y)
    test interval: 1-0.1953x10^-2 through 1+0.1953x10^-2
    MRE: 0.2466x10^-7 (25.3 bits)
    RMS: 0.6614x10^-8 (27.2 bits)

ALOG(x) vs ALOG(17x/16)-ALOG(17/16)
    test interval: 0.7071 through 0.9375
    MRE: 0.2264x10^-7 (25.4 bits)
    RMS: 0.6426x10^-8 (27.2 bits)

ALOG10(x) vs ALOG10(11x/10)-ALOG10(11/10)
    test interval: 0.3162 through 0.9000
    MRE: 0.3863x10^-7 (24.6 bits)
    RMS: 0.1122x10^-7 (26.4 bits)

ASIN(x) vs Taylor Series
    test interval: 0.7500 through 1.0000
    MRE: 0.1478x10^-7 (26.0 bits)
    RMS: 0.3245x10^-8 (28.2 bits)

ASIN(x) vs Taylor Series
    test interval: -0.1250 through 0.1250
    MRE: 0.1190x10^-7 (26.3 bits)
    RMS: 0.6733x10^-9 (30.5 bits)

ATAN(x) vs truncated Taylor Series
    test interval: -0.6250x10^-1 through 0.6250x10^-1
    MRE: 0.8032x10^-8 (26.9 bits)
    RMS: 0.1796x10^-9 (32.4 bits)

ATAN(x) vs ATAN(1/16)+ATAN((x-1/16)/(1+x/16))
    test interval: 0.6250x10^-1 through 0.2679
    MRE: 0.1488x10^-7 (26.0 bits)
    RMS: 0.6219x10^-8 (27.3 bits)

2·ATAN(x) vs ATAN(2x/(1-x·x))
    test interval: 0.2679 through 0.4142
    MRE: 0.1423x10^-7 (26.1 bits)
    RMS: 0.6597x10^-8 (27.2 bits)

2·ATAN(x) vs ATAN(2x/(1-x·x))
    test interval: 0.4142 through 1.0000
    MRE: 0.1484x10^-7 (26.0 bits)
    RMS: 0.3894x10^-8 (27.9 bits)

COS(x) vs 4·COS(x/3)^3-3·COS(x/3)
    test interval: 0.2199x10^2 through 0.2356x10^2
    MRE: 0.2070x10^-7 (25.5 bits)
    RMS: 0.6463x10^-8 (27.2 bits)

COSH(x) vs C·(COSH(x+1)+COSH(x-1))
    test interval: 3.0000 through 0.8803x10^2
    MRE: 0.2219x10^-7 (25.4 bits)
    RMS: 0.7007x10^-8 (27.1 bits)

COSH(x) vs Taylor Series expansion of COSH(x)
    test interval: 0.0000 through 0.5000
    MRE: 0.1490x10^-7 (26.0 bits)
    RMS: 0.5491x10^-8 (27.4 bits)

COT(x) vs (COT(x/2)^2-1)/(2·COT(x/2))
    test interval: 0.1885x10^2 through 0.1963x10^2
    MRE: 0.2975x10^-7 (25.0 bits)
    RMS: 0.8629x10^-8 (26.8 bits)

DACOS(x) vs Taylor Series
    test interval: -1.0000 through -0.7500
    MRE: 0.3582x10^-18 (61.3 bits)
    RMS: 0.1211x10^-18 (62.8 bits)

DACOS(x) vs Taylor Series
    test interval: -0.1250 through 0.1250
    MRE: 0.3000x10^-18 (61.5 bits)
    RMS: 0.1224x10^-18 (62.8 bits)

DACOS(x) vs Taylor Series
    test interval: 0.7500 through 1.0000
    MRE: 0.4337x10^-18 (61.0 bits)
    RMS: 0.1682x10^-18 (62.4 bits)

DASIN(x) vs Taylor Series
    test interval: -0.1250 through 0.1250
    MRE: 0.4334x10^-18 (61.0 bits)
    RMS: 0.1715x10^-18 (62.3 bits)

DASIN(x) vs Taylor Series
    test interval: 0.7500 through 1.0000
    MRE: 0.4326x10^-18 (61.0 bits)
    RMS: 0.1168x10^-18 (62.9 bits)

DATAN(x) vs truncated Taylor Series
    test interval: -0.6250x10^-1 through 0.6250x10^-1
    MRE: 0.4326x10^-18 (61.0 bits)
    RMS: 0.1370x10^-18 (62.7 bits)

DATAN(x) vs DATAN(1/16)+DATAN((x-1/16)/(1+x/16))
    test interval: 0.6250x10^-1 through 0.2679
    MRE: 0.4333x10^-18 (61.0 bits)
    RMS: 0.1755x10^-18 (62.3 bits)

2·DATAN(x) vs DATAN(2x/(1-x·x))
    test interval: 0.2679 through 0.4142
    MRE: 0.6610x10^-18 (60.4 bits)
    RMS: 0.1987x10^-18 (62.1 bits)

2·DATAN(x) vs DATAN(2x/(1-x·x))
    test interval: 0.4142 through 1.0000
    MRE: 0.4319x10^-18 (61.0 bits)
    RMS: 0.1167x10^-18 (62.9 bits)

DCOS(x) vs 4·DCOS(x/3)^3-3·DCOS(x/3)
    test interval: 0.2199x10^2 through 0.2356x10^2
    MRE: 0.6523x10^-18 (60.4 bits)
    RMS: 0.1960x10^-18 (62.2 bits)

DCOSH(x) vs Taylor Series expansion of DCOSH(x)
    test interval: 0.0000 through 0.5000
    MRE: 0.4337x10^-18 (61.0 bits)
    RMS: 0.1550x10^-18 (62.5 bits)

DCOSH(x) vs C·(DCOSH(x+1)+DCOSH(x-1))
    test interval: 3.0000 through 0.8803x10^2
    MRE: 0.8440x10^-18 (60.0 bits)
    RMS: 0.2805x10^-18 (61.6 bits)

DCOT(x) vs (DCOT(x/2)^2-1)/(2·DCOT(x/2))
    test interval: 0.1885x10^2 through 0.1963x10^2
    MRE: 0.9064x10^-18 (59.9 bits)
    RMS: 0.2632x10^-18 (61.7 bits)

DEXP(x-0.0625) vs DEXP(x)/DEXP(0.0625)
    test interval: -0.2841 through 0.3466
    MRE: 0.4336x10^-18 (61.0 bits)
    RMS: 0.1689x10^-18 (62.4 bits)

DEXP(x-2.8125) vs DEXP(x)/DEXP(2.8125)
    test interval: -3.4660 through -0.4505x10^2
    MRE: 0.6394x10^-18 (60.4 bits)
    RMS: 0.1670x10^-18 (62.4 bits)

DEXP(x-2.8125) vs DEXP(x)/DEXP(2.8125)
    test interval: -6.9310 through 0.8792x10^2
    MRE: 0.6350x10^-18 (60.4 bits)
    RMS: 0.1808x10^-18 (62.3 bits)

DEXP3. (x^1.0 vs x)
    test interval: 0.5000 through 1.0000
    The result is exact.

DEXP3. (XSQ^1.5 vs XSQ·x)
    test interval: 0.5000 through 1.0000
    MRE: 0.4336x10^-18 (61.0 bits)
    RMS: 0.1585x10^-18 (62.4 bits)

DEXP3. (XSQ^1.5 vs XSQ·x)
    test interval: 1.0000 through 0.5541x10^13
    MRE: 0.4330x10^-18 (61.0 bits)
    RMS: 0.1678x10^-18 (62.4 bits)

DEXP3. (x^y vs XSQ^(y/2))
    test interval: 0.1000x10^-1 through 0.1000x10^2 for x
                   -0.1942x10^2 through 0.1942x10^2 for y
    MRE: 0.5499x10^-18 (60.7 bits)
    RMS: 0.1196x10^-18 (62.9 bits)

DLOG(x) vs Taylor Series expansion of DLOG(1+y)
    test interval: 1-0.9537x10^-6 through 1+0.9537x10^-6
    MRE: 0.5605x10^-18 (60.6 bits)
    RMS: 0.1922x10^-18 (62.2 bits)

DLOG(x) vs DLOG(17x/16)-DLOG(17/16)
    test interval: 0.7071 through 0.9375
    MRE: 0.9228x10^-18 (59.9 bits)
    RMS: 0.3347x10^-18 (61.4 bits)

DLOG(x·x) vs 2·DLOG(x)
    test interval: 0.1600x10^2 through 0.2400x10^3
    MRE: 0.4306x10^-18 (61.0 bits)
    RMS: 0.7895x10^-19 (63.5 bits)

DLOG10(x) vs DLOG10(11x/10)-DLOG10(11/10)
    test interval: 0.3162 through 0.9000
    MRE: 0.1476x10^-17 (59.2 bits)
    RMS: 0.3747x10^-18 (61.2 bits)

DSIN(x) vs 3·DSIN(x/3)-4·DSIN(x/3)^3
    test interval: 0.0000 through 1.5710
    MRE: 0.5378x10^-18 (60.7 bits)
    RMS: 0.1802x10^-18 (62.3 bits)

DSIN(x) vs 3·DSIN(x/3)-4·DSIN(x/3)^3
    test interval: 0.1885x10^2 through 0.2042x10^2
    MRE: 0.6115x10^-18 (60.5 bits)
    RMS: 0.1960x10^-18 (62.2 bits)

DSINH(x) vs Taylor Series expansion of DSINH(x)
    test interval: 0.0000 through 0.5000
    MRE: 0.4336x10^-18 (61.0 bits)
    RMS: 0.8776x10^-19 (63.3 bits)

DSINH(x) vs C·(DSINH(x+1)+DSINH(x-1))
    test interval: 3.0000 through 0.8803x10^2
    MRE: 0.8643x10^-18 (60.0 bits)
    RMS: 0.2736x10^-18 (61.7 bits)

DSQRT(x·x)-x
    test interval: 0.7071 through 1.0000
    MRE: 0.3064x10^-18 (61.5 bits)
    RMS: 0.7383x10^-19 (63.6 bits)

DSQRT(x·x)-x
    test interval: 1.0000 through 1.4140
    The result is exact.

DTAN(x) vs 2·DTAN(x/2)/(1-DTAN(x/2)^2)
    test interval: 0.1885x10^2 through 0.1963x10^2
    MRE: 0.1262x10^-17 (59.5 bits)
    RMS: 0.3402x10^-18 (61.4 bits)

DTAN(x) vs 2·DTAN(x/2)/(1-DTAN(x/2)^2)
    test interval: 2.7490 through 3.5340
    MRE: 0.1216x10^-17 (59.5 bits)
    RMS: 0.2492x10^-18 (61.8 bits)

DTAN(x) vs 2·DTAN(x/2)/(1-DTAN(x/2)^2)
    test interval: 0.0000 through 0.7854
    MRE: 0.1094x10^-17 (59.7 bits)
    RMS: 0.3331x10^-18 (61.4 bits)

DTANH(x) vs (DTANH(x-1/8)+DTANH(1/8))/(1+DTANH(x-1/8)·DTANH(1/8))
    test interval: 0.1250 through 0.5493
    MRE: 0.8436x10^-18 (60.0 bits)
    RMS: 0.2150x10^-18 (62.0 bits)

DTANH(x) vs (DTANH(x-1/8)+DTANH(1/8))/(1+DTANH(x-1/8)·DTANH(1/8))
    test interval: 0.6743 through 0.2253x10^2
    MRE: 0.4952x10^-18 (60.8 bits)
    RMS: 0.1966x10^-18 (62.1 bits)

EXP(x-0.0625) vs EXP(x)/EXP(0.0625)
    test interval: -0.2841 through 0.3466
    MRE: 0.1489x10^-7 (26.0 bits)
    RMS: 0.5801x10^-8 (27.4 bits)

EXP(x-2.8125) vs EXP(x)/EXP(2.8125)
    test interval: -3.4660 through -0.6931x10^2
    MRE: 0.1489x10^-7 (26.0 bits)
    RMS: 0.5879x10^-8 (27.3 bits)

EXP(x-2.8125) vs EXP(x)/EXP(2.8125)
    test interval: 6.9310 through 0.8792x10^2
    MRE: 0.2108x10^-7 (25.5 bits)
    RMS: 0.5768x10^-8 (27.4 bits)

EXP3. (x^1.0 vs x)
    test interval: 0.5000 through 1.0000
    The result is exact.

EXP3. (XSQ^1.5 vs XSQ·x)
    test interval: 0.5000 through 1.0000
    MRE: 0.1487x10^-7 (26.0 bits)
    RMS: 0.5433x10^-8 (27.5 bits)

EXP3. (XSQ^1.5 vs XSQ·x)
    test interval: 1.0000 through 0.5541x10^13
    MRE: 0.1461x10^-7 (26.0 bits)
    RMS: 0.5347x10^-8 (27.5 bits)

EXP3. (x^y vs XSQ^(y/2))
    test interval: 0.1000x10^-1 through 0.1000x10^2 for x
                   -0.1942x10^2 through 0.1942x10^2 for y
    MRE: 0.2065x10^-7 (25.5 bits)
    RMS: 0.3572x10^-8 (28.0 bits)

GACOS(x) vs Taylor Series
    test interval: -1.0000 through -0.7500
    MRE: 0.2869x10^-17 (58.3 bits)
    RMS: 0.1515x10^-17 (59.2 bits)

GACOS(x) vs Taylor Series
    test interval: 0.7500 through 1.0000
    MRE: 0.3443x10^-17 (58.0 bits)
    RMS: 0.4924x10^-18 (60.8 bits)

GACOS(x) vs Taylor Series
    test interval: -0.1250 through 0.1250
    MRE: 0.2399x10^-17 (58.5 bits)
    RMS: 0.1297x10^-17 (59.4 bits)

GASIN(x) vs Taylor Series
    test interval: 0.7500 through 1.0000
    MRE: 0.3457x10^-17 (58.0 bits)
    RMS: 0.1452x10^-17 (59.3 bits)

GASIN(x) vs Taylor Series
    test interval: -0.1250 through 0.1250
    MRE: 0.3462x10^-17 (58.0 bits)
    RMS: 0.4997x10^-18 (60.8 bits)

GATAN(x) vs truncated Taylor Series
    test interval: -0.6250x10^-1 through 0.6250x10^-1
    MRE: 0.3389x10^-17 (58.0 bits)
    RMS: 0.3674x10^-18 (61.2 bits)

GATAN(x) vs GATAN(1/16)+GATAN((x-1/16)/(1+x/16))
    test interval: 0.6250x10^-1 through 0.2679
    MRE: 0.3899x10^-17 (57.8 bits)
    RMS: 0.1436x10^-17 (59.3 bits)

2·GATAN(x) vs GATAN(2x/(1-x·x))
    test interval: 0.2679 through 0.4142
    MRE: 0.3308x10^-17 (58.1 bits)
    RMS: 0.1601x10^-17 (59.1 bits)

2·GATAN(x) vs GATAN(2x/(1-x·x))
    test interval: 0.4142 through 1.0000
    MRE: 0.4360x10^-17 (57.7 bits)
    RMS: 0.9839x10^-18 (59.8 bits)

GCOS(x) vs 4·GCOS(x/3)^3-3·GCOS(x/3)
    test interval: 0.2199x10^2 through 0.2356x10^2
    MRE: 0.4779x10^-17 (57.5 bits)
    RMS: 0.1515x10^-17 (59.2 bits)

GCOSH(x) vs C·(GCOSH(x+1)+GCOSH(x-1))
    test interval: 3.0000 through 0.7091x10^3
    MRE: 0.4770x10^-17 (57.5 bits)
    RMS: 0.1712x10^-17 (59.0 bits)

GCOSH(x) vs Taylor Series expansion of GCOSH(x)
    test interval: 0.0000 through 0.5000
    MRE: 0.3469x10^-17 (58.0 bits)
    RMS: 0.1234x10^-17 (59.5 bits)

GCOT(x) vs (GCOT(x/2)^2-1)/(2·GCOT(x/2))
    test interval: 0.1885x10^2 through 0.1963x10^2
    MRE: 0.7609x10^-17 (56.9 bits)
    RMS: 0.2096x10^-17 (58.7 bits)

GEXP(x-2.8125) vs GEXP(x)/GEXP(2.8125)
    test interval: 6.9310 through 0.7090x10^3
    MRE: 0.4706x10^-17 (57.6 bits)
    RMS: 0.1391x10^-17 (59.3 bits)

GEXP(x-2.8125) vs GEXP(x)/GEXP(2.8125)
    test interval: -3.4660 through -0.6682x10^3
    MRE: 0.4690x10^-17 (57.6 bits)
    RMS: 0.1395x10^-17 (59.3 bits)

GEXP(x-0.0625) vs GEXP(x)/GEXP(0.0625)
    test interval: -0.2841 through 0.3466
    MRE: 0.3469x10^-17 (58.0 bits)
    RMS: 0.1384x10^-17 (59.3 bits)

GEXP3. (x^1.0 vs x)
    test interval: 0.5000 through 1.0000
    The result is exact.

GEXP3. (XSQ^1.5 vs XSQ·x)
    test interval: 0.5000 through 1.0000
    MRE: 0.3464x10^-17 (58.0 bits)
    RMS: 0.1334x10^-17 (59.4 bits)

GEXP3. (XSQ^1.5 vs XSQ·x)
    test interval: 1.0000 through 0.4479x10^103
    MRE: 0.3464x10^-17 (58.0 bits)
    RMS: 0.1347x10^-17 (59.4 bits)

GEXP3. (x^y vs XSQ^(y/2))
    test interval: 1.0000 through 0.1000x10^2 for x
                   -0.1543x10^3 through 0.1543x10^3 for y
    MRE: 0.3371x10^-16 (54.7 bits)
    RMS: 0.4759x10^-17 (57.5 bits)

GLOG(x) vs Taylor Series expansion of GLOG(1+y)
    test interval: 1-0.1907x10^-5 through 1+0.1907x10^-5
    MRE: 0.5771x10^-17 (57.3 bits)
    RMS: 0.1557x10^-17 (59.2 bits)

GLOG(x) vs GLOG(17x/16)-GLOG(17/16)
    test interval: 0.7071 through 0.9375
    MRE: 0.3501x10^-17 (58.0 bits)
    RMS: 0.1488x10^-17 (59.2 bits)

GLOG(x·x) vs 2·GLOG(x)
    test interval: 0.1600x10^2 through 0.2400x10^3
    MRE: 0.3393x10^-17 (58.0 bits)
    RMS: 0.4781x10^-18 (60.9 bits)

GLOG10(x) vs GLOG10(11x/10)-GLOG10(11/10)
    test interval: 0.3162 through 0.9000
    MRE: 0.9112x10^-17 (56.6 bits)
    RMS: 0.2560x10^-17 (58.4 bits)

GSIN(x) vs 3·GSIN(x/3)-4·GSIN(x/3)^3
    test interval: 0.0000 through 1.5710
    MRE: 0.3794x10^-17 (57.9 bits)
    RMS: 0.1394x10^-17 (59.3 bits)

GSIN(x) vs 3·GSIN(x/3)-4·GSIN(x/3)^3
    test interval: 0.1885x10^2 through 0.2042x10^2
    MRE: 0.5320x10^-17 (57.4 bits)
    RMS: 0.1719x10^-17 (59.0 bits)

GSINH(x) vs C·(GSINH(x+1)+GSINH(x-1))
    test interval: 3.0000 through 0.7091x10^3
    MRE: 0.5035x10^-17 (57.5 bits)
    RMS: 0.1730x10^-17 (59.0 bits)

GSINH(x) vs Taylor Series expansion of GSINH(x)
    test interval: 0.0000 through 0.5000
    MRE: 0.3459x10^-17 (58.0 bits)
    RMS: 0.2973x10^-18 (61.5 bits)

GSQRT(x·x)-x
    test interval: 0.7071 through 1.0000
    MRE: 0.2450x10^-17 (58.5 bits)
    RMS: 0.6269x10^-18 (60.5 bits)

GSQRT(x·x)-x
    test interval: 1.0000 through 1.4140
    The result is exact.

GTAN(x) vs 2·GTAN(x/2)/(1-GTAN(x/2)^2)
    test interval: 2.7490 through 3.5340
    MRE: 0.6827x10^-17 (57.0 bits)
    RMS: 0.2028x10^-17 (58.8 bits)

GTAN(x) vs 2·GTAN(x/2)/(1-GTAN(x/2)^2)
    test interval: 0.1885x10^2 through 0.1963x10^2
    MRE: 0.9834x10^-17 (56.5 bits)
    RMS: 0.2760x10^-17 (58.3 bits)

GTAN(x) vs 2·GTAN(x/2)/(1-GTAN(x/2)^2)
    test interval: 0.0000 through 0.7854
    MRE: 0.9663x10^-17 (56.5 bits)
    RMS: 0.2678x10^-17 (58.4 bits)

GTANH(x) vs (GTANH(x-1/8)+GTANH(1/8))/(1+GTANH(x-1/8)·GTANH(1/8))
    test interval: 0.1250 through 0.5493
    MRE: 0.4684x10^-17 (57.6 bits)
    RMS: 0.1608x10^-17 (59.1 bits)

GTANH(x) vs (GTANH(x-1/8)+GTANH(1/8))/(1+GTANH(x-1/8)·GTANH(1/8))
    test interval: 0.6743 through 0.2149x10^2
    MRE: 0.3750x10^-17 (57.9 bits)
    RMS: 0.1621x10^-17 (59.1 bits)

SIN(x) vs 3·SIN(x/3)-4·SIN(x/3)^3
    test interval: 0.0000 through 1.5710
    MRE: 0.1934x10^-7 (25.6 bits)
    RMS: 0.5980x10^-8 (27.3 bits)

SIN(x) vs 3·SIN(x/3)-4·SIN(x/3)^3
    test interval: 0.1885x10^2 through 0.2042x10^2
    MRE: 0.2736x10^-7 (25.1 bits)
    RMS: 0.6923x10^-8 (27.1 bits)

SINH(x) vs C·(SINH(x+1)+SINH(x-1))
    test interval: 3.0000 through 0.8803x10^2
    MRE: 0.3020x10^-7 (25.0 bits)
    RMS: 0.7083x10^-8 (27.1 bits)

SINH(x) vs Taylor Series expansion of SINH(x)
    test interval: 0.0000 through 0.5000
    MRE: 0.1479x10^-7 (26.0 bits)
    RMS: 0.1143x10^-8 (29.7 bits)

SQRT(x·x)-x
    test interval: 0.7071 through 1.0000
    The result is exact.

SQRT(x·x)-x
    test interval: 1.0000 through 1.4140
    The result is exact.
TAN(x) vs 2·TAN(x/2)/(1-TAN(x/2)^2)
    test interval: 0.1885x10^2 through 0.1963x10^2
    MRE: 0.3059x10^-7 (25.0 bits)
    RMS: 0.1039x10^-7 (26.5 bits)

TAN(x) vs 2·TAN(x/2)/(1-TAN(x/2)^2)
    test interval: 2.7490 through 3.5340
    MRE: 0.2940x10^-7 (25.0 bits)
    RMS: 0.7439x10^-8 (27.0 bits)

TAN(x) vs 2·TAN(x/2)/(1-TAN(x/2)^2)
    test interval: 0.0000 through 0.7854
    MRE: 0.2994x10^-7 (25.0 bits)
    RMS: 0.1074x10^-7 (26.5 bits)

TANH(x) vs (TANH(x-1/8)+TANH(1/8))/(1+TANH(x-1/8)·TANH(1/8))
    test interval: 0.1250 through 0.5493
    MRE: 0.2020x10^-7 (25.6 bits)
    RMS: 0.6944x10^-8 (27.1 bits)

TANH(x) vs (TANH(x-1/8)+TANH(1/8))/(1+TANH(x-1/8)·TANH(1/8))
    test interval: 0.6743 through 0.1040x10^2
    MRE: 0.2156x10^-7 (25.5 bits)
    RMS: 0.6360x10^-8 (27.2 bits)

Appendix B
Using the Common Math Library with MACRO Programs

The Math Library was designed to be used mainly by compiler-level languages. The object-time systems of such languages have facilities to handle error conditions that may occur when a routine from the Math Library is executed. MACRO programmers must include such facilities in their programs.

There are two facilities necessary for use of the Math Library: a trap handler and an error handler.

The trap handler is needed, since under certain circumstances the Math Library executes floating-point instructions which may overflow or underflow. In these cases, the library routines expect that the result will be set to the largest possible number for floating overflow, or set to zero for underflow. The central processor does not set the results; the overflows and underflows must be detected by the APR trapping system and interpreted by the trap handler. If the overflow/underflow settings are not done properly, the math routine in question will very likely return mathematically incorrect results.

The error handler is a general error printout routine. It is called by the Math Library when the arguments passed to a Math Library routine are out of range or otherwise incorrect.
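The result setting that the library expects from a trap handler can be modeled in a few lines. This Python sketch is only an illustration under stated assumptions: the names `LARGEST`, `SMALLEST`, and `fix_up_result` are hypothetical, the range limits are placeholder values rather than the actual machine limits, and keeping the sign on overflow is an assumption. The real trap handling works at the APR trap level, which this sketch does not attempt to reproduce.

```python
# Placeholder limits standing in for the machine's floating-point range.
# These values are assumptions for illustration, not the hardware limits.
LARGEST = 1.0e38      # stands in for "the largest possible number"
SMALLEST = 1.0e-38    # nonzero magnitudes below this count as underflow here


def fix_up_result(value):
    """Return the value a library routine expects after a trapped operation.

    Overflow is replaced by the largest possible number (sign kept, by
    assumption); underflow is replaced by zero; other values pass through.
    """
    if abs(value) > LARGEST:                       # floating overflow
        return LARGEST if value > 0 else -LARGEST
    if value != 0.0 and abs(value) < SMALLEST:     # floating underflow
        return 0.0
    return value


print(fix_up_result(1.0e30 * 1.0e30))     # overflow case
print(fix_up_result(1.0e-30 * 1.0e-30))   # underflow case
print(fix_up_result(2.5))                 # in-range value is unchanged
```

A routine that relies on this convention can continue computing with the substituted value, which is why results become mathematically incorrect if the substitution is not made.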
Provided with the Math Library are modules for handling APR traps and properly setting the results (MTHTRP) and for providing error handling and reporting (MTHDUM). A MACRO program must initialize these modules before using any other components of the Math Library, as follows: PUSHJ PUSHJ P,%TRPIN## P,%ERINI## ;INITIALIZE TRAP HANDLER ;INITIALIZE ERROR HANDLER 8-1 Index A ABS routine, 9-4 Absolute value complex, 9-7 double-precision D-floating-point, 9-8 . double-precision G-floating-point, 9-9 double-precision, D-floating-point, 9-5 G-floating-point, 9-6 integer, 9-3 single-precision, 9-4 Accuracy tests, 1-14 ACOS routine, 6-4 AIMAG routine, 15-4 AINT routine, 11-9 ALOG routine, 3-3 ALOG10 routine, 3-5 AMAXO routine, 14-5 AMAX1 routine, 14-6 AMINO routine, 14-11 AMINI routine, 14-12 AMOD routine, 12-6 ANINT routine, 11-6 Arc cosine double-precision, D-floating-point,6-7 G-floating-point, 6-11 single-precision, 6-4 Arc sine double-precision, D-floating-point, 6-5 G-floating-point, 6-9 single-precision, 6-3 Arc tangent double-precision, D-floating-point, 6-17 G-floating-point, 6-21 single-precision, 6-13 ASIN routine, 6-3 ATAN routine, 6-13 ATAN2 routine, 6-15 Average relative error, 1-14 B Base-10 logarithm, double-precision, D-floating-point, 3-9 G-floating-point, 3-13 single-precision, 3-5 c CABS routine, 9-7 Calling sequence, 1-13 CCOS routine, 5-21 CDABS routine, 9-8 CDCOS routine, 5-25 CDEXP routine, 4-11 CDLOG routine, 3-17 CDSIN routine, 5-23 CDSQRT routine, 2-11 CEXP routine, 4-9 CEXP2. routine, 4-22 CEXP3. routine, 4-34 CFDV routine, 15-7 CFM routine, 15-6 CGABS routine, 9-9 CGCOS routine, 5-29 CGEXP routine, 4-13 CGLOG routine, 3-19 CGSIN routine, 5-27 CGSQRT routine, 2-13 CLOG routine, 3-15 CMPL.C routine, 10--23 CMPL.D routine, 10--21 CMPL.G routine, 10--22 Index-1 CMPL.I routine, 10-19 CMPLX routine, 10-20 Cody, W. 
J., 1-15, A-I Cody and Waite, Software Manual for Elementary Functions, 5-32, 5-34, 5-36,5-38,5-40 Complex, absolute value, 9-7 conjugate, 15-5 conversion, complex to complex, 10-23 cosine, 5-21 data types, 1-12 division, 15-7 double-precision D-floating-point, 1-12 absolute value, 9-8 cosine, 5-25 exponential, 4-11 natural logarithm, 3-17 sine, 5-23 square root, 2-11 double-precision G-floating-point, 1-12 absolute value, 9-9 cosine, 5-29 exponential, 4-13 natural logarithm, 3-19 sine, 5-27 square root, 2-13 exponential, 4-9 exponentiation, complex to complex, 4-34 complex to integer, 4-22 multiplication, 15-6 natural logarithm, 3-15 number, imaginary part, 15-4 real part, 15-3 product, 15-6 quotient, 15-7 sine, 5-19 square root, 2-9 Computer Approximations, Hart et.al., 3-4, 3-6, 6-14, 6-18, 6-22 CONJ routine, 15-5 Conjugate complex, 15-5 Conversion complex to complex, 10-23 double-precision, D-floating-point to complex, 10-20 D-floating-point to G-floating-point, 10-17, 10-18 D-floating-point to integer, 10-5 2-lndex Conversion (Cont.) D-floating-point to single-precision, 10-9 G~floating-point to complex, 10-22 G-floating-point to D-floating-point, 10-13, 10-14 G-floating-point to integer, 10-6 G-floating-point to single-precision, 10-10 integer, to complex, 10-19, to double-precision D-floating-point, 10-11 to double-precision G-floating-point, 10-15 to single-precision, 10-7, 10-8 single-precision, to complex, 10-20 to double-precision D-floating-point, 10-12 to double-precision G-floating-point, 10-16 to integer, 10-3, 10-4 COS routine, 5-7 COSD routine, 5-9 COSH routine, 7-4 Cosine, complex, 5-21 double-precision D-floating-point, 5-25 double-precision G-floating-point, 5-29 double-precision, D-floating-point, 5-13 G-floating-point, 5-17 single-precision, 5-7, 5-9 COTAN routine, 5-33 Cotangent, double-precision, D-floating-point, 5-37 G-floating-point, 5--41 single-precision, 5-33 Coveyan, R. R. and MacPherson, R. 
D., Journal of the ACM, #14, 8-4 CSIN routine, 5-19 CSQRT routine, 2-9 D DABS routine, 9-5 DACOS routine, 6-7 DASIN routine, 6-5 DATAN routine, 6-17 DATAN2 routine, 6-19 Data types, 1-10 complex, 1-12 double-precision, D-floating-point, 1-11 G-floating-point, 1-11 integer, 1-10 single-precision, 1-10 DBLE routine, 10-12 DCOS routine, 5-13 DCOSH routine, 7-7 DCOTAN routine, 5-37 DDIM routine, 12-11 DEXP routine, 4-5 DEXP2. routine, 4-18 DEXP3. routine, 4-28 DFLOAT routine, 10-11 D-floating-point, absolute value, 9-5 arc cosine, 6-7 arc sine, 6-5 arc tangent, 6-17 base-l0 logarithm, 3-9 conversion, to complex, 10-21 to G-floating-point, 10-17, 10-18 to integer, 10-5 to single-precision, 10-9 cosine, 5-13 cotangent, 5-37 data type, 1-11 exponential, 4-5 exponentiation, to D-floating-point, 4-28 to integer, 4-18 hyperbolic cosine, 7-7 hyperbolic sine, 7-5 hyperbolic tangent, 7-12 maximum of a series, 14-7 minimum of a series, 14-13 natural logarithm, 3-7 polar angle of two points, 6-19 positive difference, 12-11 product, 12-3 remainder, 12-7 rounding, to D-floating-point, 11-7 to integer, 11-4 sine, 5-11 square root, 2-5 tangent, 5-35 transfer of sign, 13-5 truncation, 11-10 DIM routine, 12-10 DINT routine, 11-10 Division, complex, 15-7 DLOG routine, 3-7 DLOGI0 routine, 3-9 DMAXI routine, 14-7 DMINI routine, 14-13 DMOD routine, 12-7 DNINT routine, 11-7 Double precision, data types, 1-11 D-floating-point, 1-11 absolute value, 9-5 arc cosine, 6-7 arc sine, 6-5 arc tangent, 6-17 base-l0 logarithm, 3-9 conversion, to complex, 10-21 to G-floating-point, 10-17, 10-18 to integer, 10-5 to single-precision, 10-9 cosine, 5-13 cotangent, 5-37 exponential, 4-5 exponentiation, to D-floating-point, 4-28 to integer, 4-18 hyperbolic cosine, 7-7 hyperbolic sine, 7-5 hyperbolic tangent, 7-12 maximum of a series, 14-7 minimum of a series, 14-13 natural logarithm, 3-7 polar angle of two points, 6-19 positive difference, 12-11 product, 12-3 remainder, 12-7 rounding, to D-floating-point, 
11-7 to integer, 11-4 sine, 5-11 square root, 2-5 tangent, 5-35 transfer of sign, 13-5 truncation, 11-10 G-floating-point, 1-11 absolute value, 9-6 arc cosine, 6-11 arc sine, 6-9 arc tangent, 6-21 base-l0 logarithm, 3-13 conversion, to complex, 10-22 Index-3 Double Precision (Cont.) to D-floating-point, 10-13, 10-14 to integer, 10-6 to single-precision, 10-10 cosine, 5-17 cotangent, 5-41 exponential, 4-7 exponentiation, to G-floating-point, 4-31 to integer, 4-20 hyperholic cosine, 7-10 hyperbolic sine, 7-8 hyperbolic tangent, 7-13 maximum of a series, 14-8 minimum of a series, 14-14 natural logarithm, 3-11 polar angle of two points, 6-23 positive difference, 12-12 product, 12-4 remainder, 12-8 rounding, to G-floating-point, 11-8 to integer, 11-5 sine, 5-15 square root, 2-7 tangent, 5-39 transfer of sign, 13-6 truncation, 11-11 DPROD routine, 12-3 DSIGN routine, 13-5 DSIN routine, 5-11 DSINH routine, 7-5 DSQRT routine, 2-5 DTAN routine, 5-35 DTANH routine, 7-12 DTOG routine, 10-17 DTOGA routine, 10-18 E ELEFUNT tests, 1-15, A-I Entry points, 1-13 Error, maximum relative (MRE), 1-14 average relative (RMS), 1-14 EXP routine, 4-3 EXPI. routine, 4-15 EXP2. routine, 4-16 EXP3. routine, 4-25 Exponential, complex, 4-9 double-precision D-floating-point, 4-11 double-precision G-floating-point, 4-13 4--lndex Exponential (Cont.) 
double-precision, D-floating-point, 4-5 G-floating-point, 4-7 single-precision, 4-3 Exponentiation, complex to complex, 4-34 complex to integer, 4-22 D-floating-point to D-floating-point, 4-28 D-floating-point to integer, 4-18 G-floating-point to G-floating-point, 4-31 G-floating-point to integer, 4-20 integer to integer, 4-15 single-precision to integer, 4-16 single-precision to single-precision, 4-25 F FLOAT routine, 10-8 Functions, math library, 1-3 G GABS routine, 9-6 GACOS routine, 6-11 GASIN routine, 6-9 GATAN routine, 6-21 GATAN2 routine, 6-23 GCOS routine, 5-17 GCOSH routine, 7-10 GCOTAN routine, 5-41 GDB.n routine, 10-16 GDIM routine, 12-12 GEXP routine, 4-7 GEXP2. routine, 4-20 GEXP3. routine, 4-31 GFL.n routine, 10-15 G-floating-point, absolute value, 9-6 arc cosine, 6-11 arc sine, 6-9 arc tangent, 6-21 base-10 logarithm, 3-13 conversion, to complex, 10-22 to D-floating-point, 10-13, 10-14 to integer, 10-6 to single-precision, 10-10 cosine, 5-17 cotangent, 5-41 data type, 1-11 exponential, 4-7 G-floating-point (Cont.) exponentiation, to G-floating-point, 4-31 to integer, 4-20 hyperbolic cosine, 7-10 hyperbolic sine, 7-8 hyperbolic tangent, 7-13 maximum of a series, 14-8 minimum of a series, 14-14 natural logarithm, 3-11 polar angle of two points, 6-23 positive difference, 12-12 product, 12-4 remainder, 12-8 rounding, 11-8 to G-floating-point, 11-8 to integer, 11-5 sine, 5-15 square root, 2-7 tangent, 5-39 transfer of sign, 13-6 truncation, 11-11 GFX.n routine, 10-6 GINT. routine, 11-11 GLOG routine, 3-11 GLOG 10 routine, 3-13 GMAX1 routine, 14-8 GMIN1 routine, 14-14 GMOD routine, 12-8 GNINT. routine, 11-8 GPROD. 
routine, 12-4 GSIGN routine, 13-6 GSIN routine, 5-15 GSINH routine, 7-8 GSN.n routine, 10-10 GSQRT routine, 2-7 GTAN routine, 5-39 GTANH routine, 7-13 GTOD routine, 10-13 GTODA routine, 10-14 H Hart et.al., Computer Approximations, 3-4,3-6,6-14,6-18,6-22 Hyperbolic cosine, double-precision, D-floating-point, 7-7 G-floating-point, 7-10 single-precision, 7-4 Hyperbolic sine, double-precision, D-floating-point, 7-5 G-floating-point,7-8 Hyperbolic sine (Cont.) single-precision, 7-3 Hyperbolic tangent, double-precision, D-floating-point, 7-12 G-floating-point, 7-13 single-precision, 7-11 I lABS routine, 9-3 IDIM routine, 12-9 IDINT routine, 10-5 IDNINT routine, 11-4 IFIX routine, 10-3 IGNIN. routine, 11-5 Imaginary part of a complex number, 15-4 INT routine, 10-4 Integer, absolute value, 9-3 conversion, to complex, 10-19 to D-floating-point, 10-11 to G-floating-point, 10-15 to single-precision, 10-7, 10-8 data type, 1-10 exponentiation, 4-15 maximum, 14-3, 14-4 minimum, 14-9, 14-10 positive difference, 12-9 remainder, 12-5 transfer of sign, 13-3 ISIGN routine, 13-3 J Journal of the ACM, #14, Coveyan, R. R. and MacPherson, R. D., 8-4 K Knuth, D. 
E., Seminumerical Algorithms, 8-4 L Logarithm, see natural logarithm, base-10 logarithm LSB (least significant bit) error distribution, 1-15 M MACRO programs, using the math library with, B-1 Index-5 Math library, functions, 1-3 restrictions, 1-8 with MACRO programs, B-1 Mathematical names, 1-9 Mathematical symbols, 1-9 MAXO routine, 14-3 MAXI routine, 14-4 Maximum of a series, double-precision, D-floating-point, 14-7 G-floating-point, 14-8 integer, 14-3, 14-4 single-precision, 14-5, 14-6 Maximum relative error, 1-14 MINO routine, 14-9 MINI routine, 14-10 Minimum of a series, double-precision, D-floating-point, 14-13 G-floating-point, 14-14 integer, 14-9, 14-10 single-precision, 14-11, 14-12 MOD routine, 12-5 MRE (maximum relative error), 1-14 Multiplication, complex, 15-6 N Names, mathematical, 1-9 Natural logarithm complex, 3-15 double-precision D-floating-point, 3-17 double-precision G-floating-point, 3-19 double-precision, D-floating-point, 3-7 G-floating-point, 3-11 single-precision, 3-3 Newton-Raphson method, 2-4, 2-6, 2-8 NINT routine, 11-3 p Polar angle of two points, double-precision, D-floating-point, 6-19 G-floating-point, 6-23 single-precision, 6-15 Positi ve difference, double-precision, D-floating-point, 12--11 G-floating-point, 12-12 integer, 12-9 single-precision, 12-10 6-lndex Precision, 1-10 Product, complex, 15-6 double-precision, D-floating-point, 12-3 G-floating-point, 12-4 Q Quotient, complex, 15-7 R RAN routine, 8-3 Random number generator, 8-3 spectral test with, 8-3 with shuffiing, 8-5 Random number seed, saving, 8-7 setting, 8-6 RANS routine, 8-5 REAL routine, 10-7 REAL.C routine, 15-3 Real part of a complex number, 15-3 Register usage, 1-13 Relative error average (RMS), 1-1.4 maximum (MRE), 1-14 Remainder, double-precision, D-floating-point, 12-7 G-floating-point, 12-8 integer, 12-5 single-precision, 12-6 Restrictions, math library, 1-8 Return location, 1-13 RMS (root mean square), 1-14 Root mean square (RMS), 1-14 Rounding, 
double-precision, D-floating-point, to D-floating-point, 11-7 to integer, 11-4 G-floating-point, to G-floating-point, 11-8 to1nteger, 11-5 single-precision, to integer, 11-3 to single-precision, 11-6 s Saving random number seed, 8-7 SAVRAN routine, 8-7 Seminumerical algorithms, Knuth, D. E., 8-4 SETRAN routine, S--6 Setting random number seed, S--6 SIGN routine, 13-4 Sign, transfer, double-precision, D-floating-point, 13-5 G-floating-point, 13-6 integer, 13-3 single-precision, 13-4 SIN routine, 5--3 SIND routine, 5--5 Sine, complex, 5--19 double-precision D-floating-point, 5-23 double-precision G-floating-point, 5-27 double-precision, D-floating-point, 5-11 G-floating-point, 5-15 single-precision, 5-3, 5-5 Single-precision, absolute value, 9--4 arc cosine, 6-4 arc sine, 6-3 arc tangent, 6-13 base-10 logarithm, &-5 conversion, to complex, 10--20 to D-floating-point, 10--12 to G-floating-point, 10--16 to integer, 10--3, 10-4 cosine, 5-7, 5-9 cotangent, 5--33 data type, 1-10 exponential, 4-3 exponentiation, to integer, 4-16 to single-precision, 4-25 hyperbolic cosine, 7-4 hyperbolic sine, 7-3 hyperbolic tangent, 7-11 maximum of a series, 14-5, 14-6 minimum of a series, 14-11, 14-12 natural logarithm, 3-3 polar angle of two points, 6-15 positive difference, 12-10 remainder, 12-6 Single-precision (Cont.) 
  rounding,
    to integer, 11-3
    to single-precision, 11-6
  sine, 5-3, 5-5
  square root, 2-3
  tangent, 5-31
  transfer of sign, 13-4
  truncation, 11-9
SINH routine, 7-3
SNGL routine, 10-9
Software Manual for Elementary Functions, Cody and Waite, 5-32, 5-34, 5-36, 5-38, 5-40
Spectral test with random number generator, 8-3
SQRT routine, 2-3
Square root,
  complex, 2-9
    double-precision D-floating-point, 2-11
    double-precision G-floating-point, 2-13
  double-precision,
    D-floating-point, 2-5
    G-floating-point, 2-7
  single-precision, 2-3
Symbols, mathematical, 1-9

T
TAN routine, 5-31
Tangent,
  double-precision,
    D-floating-point, 5-35
    G-floating-point, 5-39
  single-precision, 5-31
TANH routine, 7-11
Test interval, 1-14
Tests, accuracy, 1-14
Transfer of sign,
  double-precision,
    D-floating-point, 13-5
    G-floating-point, 13-6
  integer, 13-3
  single-precision, 13-4
Truncation,
  double-precision,
    D-floating-point, 11-10
    G-floating-point, 11-11
  single-precision, 11-9

TOPS-10/TOPS-20 Common Math Library Reference Manual
AA-M400A-TK

READER'S COMMENTS

NOTE: This form is for document comments only. DIGITAL will use comments submitted on this form at the company's discretion. If you require a written reply and are eligible to receive one under Software Performance Report (SPR) service, submit your comments on an SPR form.

Did you find this manual understandable, usable, and well-organized? Please make suggestions for improvement.

Did you find errors in this manual? If so, specify the error and the page number.

Please indicate the type of reader that you most nearly represent.
  [ ] Assembly language programmer
  [ ] Higher-level language programmer
  [ ] Occasional programmer (experienced)
  [ ] User with little programming experience
  [ ] Student programmer
  [ ] Other (please specify) ____________________

Name ____________________ Date ____________________
Organization ____________________ Telephone ____________________
Street ____________________
City ____________________ State __________ Zip Code or Country __________

Do Not Tear - Fold Here and Tape

No Postage Necessary if Mailed in the United States

BUSINESS REPLY MAIL
FIRST CLASS PERMIT NO. 33 MAYNARD MASS.
POSTAGE WILL BE PAID BY ADDRESSEE

SOFTWARE PUBLICATIONS
200 FOREST STREET MR01-2/L12
MARLBOROUGH, MA 01752

Do Not Tear - Fold Here and Tape