AA-MFO6A-TE
1988
640 pages
Document: ULTRIX-32 Supplementary Documents General User
Order Number: AA-MFO6A-TE
Revision: 0
Pages: 640
OCR Text
ULTRIX-32 Supplementary Documents
General User

Order No. AA-MFO6A-TE

ULTRIX-32 Operating System, Version 3.0

Digital Equipment Corporation

Copyright © 1984, 1988 by Digital Equipment Corporation.

The information in this document is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation. Digital Equipment Corporation assumes no responsibility for any errors that may appear in this document.

The software described in this document is furnished under a license and may be used or copied only in accordance with the terms of such license. No responsibility is assumed for the use or reliability of software on equipment that is not supplied by DIGITAL or its affiliated companies.

The following are trademarks of Digital Equipment Corporation: DEC, DECUS, MASSBUS, PDP, ULTRIX, ULTRIX-11, ULTRIX-32, UNIBUS, VAX, VMS, VT.

UNIX is a trademark of AT&T Bell Laboratories.

Information herein is derived from copyrighted material as permitted under a license agreement with AT&T Bell Laboratories.

This software and documentation is based in part on the Fourth Berkeley Software Distribution under license from the Regents of the University of California. We acknowledge the Electrical Engineering and Computer Science Departments at the Berkeley Campus of the University of California for their role in its development.

This software and documentation is based in part on the Fourth Berkeley Software Distribution under license from The Regents of the University of California. Digital Equipment Corporation acknowledges the following individuals and institutions for their role in its development:

"The UNIX Time-Sharing System": reprinted by permission. Copyright © 1974, Association for Computing Machinery, Inc. This is a revised version of an article that appeared in Communications of the ACM, 17, No. 7 (July 1974), pp. 365-375.
That article was a revised version of a paper presented at the Fourth ACM Symposium on Operating Systems Principles, IBM Thomas J. Watson Research Center, Yorktown Heights, New York, October 15-17, 1973. Acknowledgements: for their help and support, R.H. Canaday, R. Morris, M.D. McIlroy, and J.F. Ossanna.

"Advanced Editing on UNIX" acknowledgement: Ted Dolotta for his ideas and assistance.

"An Introduction to the UNIX Shell" acknowledgements: Dennis Ritchie, John Mashey and Joe Maranzano for their help and support.

"LEARN - Computer-Aided Instruction on UNIX" acknowledgements: for their help and support, M.E. Bittrich, J.L. Blue, S.I. Feldman, P.A. Fox, M.J. McAlpin, E.Z. Rothkopf, Don Jackowski, and Tom Plum.

"A System for Typesetting Mathematics" acknowledgements: J.F. Ossanna, A.V. Aho, and S.C. Johnson, for their ideas and assistance.

"A TROFF Tutorial" acknowledgements: J.F. Ossanna, Jim Blinn, Ted Dolotta, Doug McIlroy, Mike Lesk and Joel Sturman, for their help and support.

The document "The C Programming Language - Reference Manual" is reprinted, with minor changes, from "The C Programming Language," by Brian W. Kernighan and Dennis M. Ritchie, Prentice-Hall, Inc., 1978.

"Make - A Program for Maintaining Computer Programs" acknowledgements: S.C. Johnson, and H. Gajewska, for their ideas and assistance.

"YACC: Yet Another Compiler-Compiler" acknowledgements: B.W. Kernighan, P.J. Plauger, S.I. Feldman, C. Imagna, M.E. Lesk, A. Snyder, C.B. Haley, D.M. Ritchie, M.O. Harris and Al Aho, for their ideas and assistance.

"Lex - A Lexical Analyzer Generator" acknowledgements: S.C. Johnson, A.V. Aho, and Eric Schmidt, for their help as originators of much of Lex, as well as debuggers of it.

The document "RATFOR - A Preprocessor for a Rational Fortran" is a revised and expanded version of the one published in Software - Practice and Experience, October 1975. The Ratfor described here is the one in use on UNIX and GCOS at AT&T Bell Laboratories. Acknowledgements: Dennis Ritchie, and Stuart Feldman, for their ideas and assistance.

"The M4 Macro Processor" acknowledgements: Rick Becker, John Chambers, Doug McIlroy, and Jim Weythman, for their help and support.

"BC - An Arbitrary Precision Desk-Calculator Language" acknowledgement: The compiler is written in YACC; its original version was written by S.C. Johnson.

"A Dial-Up Network of UNIX™ Systems" acknowledgements: G.L. Chesson, A.S. Cohen, J. Lions, and P.F. Long, for their suggestions and assistance.

Copyright © 1979, 1980 Regents of the University of California. Permission to copy these documents or any portion thereof as necessary for licensed use of the software is granted to licensees of this software, provided this copyright notice and statement of permission are included.

The document "Writing Tools - The STYLE and DICTION Programs" is copyrighted © 1979 by AT&T Bell Laboratories. Holders of a UNIX™/32V software license are permitted to copy this document, or any portion of it, as necessary for licensed use of the software, provided this copyright notice and statement of permission are included.

The document "The Programming Language EFL" is copyrighted © 1979 by AT&T Bell Laboratories. EFL has been approved for general release, so that one may copy it subject only to the restriction of giving proper acknowledgement to AT&T Bell Laboratories.

The documents "A Portable Fortran 77 Compiler" and "Fsck - The UNIX File System Check Program" are modifications of earlier documents which are copyrighted © 1979 by AT&T Bell Laboratories. Holders of a UNIX™/32V software license are permitted to copy these documents, or any portion of them, as necessary for licensed use of the software, provided this copyright notice and statement of permission are included.
This manual reflects system enhancements made at Berkeley and sponsored in part by NSF Grants MCS-7807291, MCS-8005144, and MCS-74-07644-A04; DOE Contract DE-AT03-76SF00034 and Project Agreement DE-AS03-79ER10358; and by Defense Advanced Research Projects Agency (DoD) ARPA Order No. 4031, monitored by Naval Electronics Systems Command under Contract No. N00039-80-K-0649.

"Ex Reference Manual" acknowledgements: Chuck Haley contributed greatly to the early development of ex. Bruce Englar encouraged the redesign which led to ex version 1. Bill Joy wrote versions 1 and 2.0 through 2.7, and created the framework that users see in the present editor. Mark Horton added macros and other features and made the editor work on a large number of terminals and UNIX systems.

"A Guide to the Dungeons of Doom" acknowledgements: Rogue was originally conceived by Glenn Wichman and Michael Toy. Ken Arnold and Michael Toy then smoothed out the user interface, and added many new features. We would like to thank Bob Arnold, Michelle Busch, Andy Hatcher, Kipp Hickman, Mark Horton, Daniel Jensen, Bill Joy, Joe Kalash, Steve Maurer, Marty McNary, Jan Miller, and Scott Nelson for their ideas and assistance.

The document "The FRANZ LISP Manual" is copyrighted © 1980, 1981, 1983 by the Regents of the University of California. (Exceptions: Chapters 13, 14 (first half), 15 and 16 have separate copyrights, as indicated. These are reproduced by permission of the copyright holders.) Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, and the copyright notice of the Regents, University of California, is given. All rights reserved. Work reported herein was supported in part by the U.S. Department of Energy, Contract DE-AT03-76SF00034, Project Agreement DE-AS03-79ER10358, and the National Science Foundation under Grant No. MCS 7807291. MC68000 is a trademark of Motorola Semiconductor Products, Inc.

"The FRANZ LISP Manual" acknowledgements: Richard Fateman, Mike Curry, John Breedlove, Jeff Levinsky, Bill Rowan, Tom London, Keith Sklower, Kipp Hickman, Charles Koester, Mitch Marcus, Don Cohen, John Foderaro, and Kevin Layer.

The document "Berkeley Pascal User's Manual" is copyrighted © 1977, 1979, 1980, 1983 by W.N. Joy, S.L. Graham, C.B. Haley, M.K. McKusick, P.B. Kessler. The financial support of the first and second authors' work by the National Science Foundation under grants MCS74-07644-A04, MCS78-07291, and MCS80-05144, and the first author's work by an IBM Graduate Fellowship are gratefully acknowledged.

"Introduction to the f77 I/O Library" acknowledgement: Peter J. Weinberger originally wrote the I/O Library at AT&T Bell Laboratories.

"Writing Papers with NROFF Using -ME" and "-ME Reference Manual" acknowledgements: Bob Epstein, Bill Joy, Larry Rowe, Ricki Blau, Pamela Humphrey, and Jim Joyce, for their ideas and assistance. UNIX, NROFF, and TROFF are trademarks of AT&T Bell Laboratories.

"Refer - A Bibliography System" acknowledgements: Mike Lesk of AT&T Bell Laboratories wrote the original refer software, including the indexing programs. Al Stanberger of the Forestry Department wrote the first version of addbib, then called bibin. Greg Shenaut of the Linguistics Department wrote the original versions of sortbib and roffbib.

"Screen Updating and Cursor Movement Optimization: A Library Package" acknowledgements: For their help and support, Bill Joy, Doug Merritt, Kurt Shoens, Ken Abrams, Alan Char, Mark Horton, and Joe Kalash.

"Disc Quotas in a UNIX Environment" acknowledgements: Sam Leffler and Kirk McKusick, for their work on the quota code. The current disc quota system is loosely based on a very early scheme implemented at the University of New South Wales and Sydney University.

The document "Fsck - The UNIX File System Check Program" is a revision by Marshall Kirk McKusick; T.J. Kowalski wrote the original paper.
For their help and support, we thank Bill Joy, Sam Leffler, Robert Elz, Dennis Ritchie, Robert Henry, Larry A. Wehr, and Rick B. Brandt. Our sponsors were the National Science Foundation under grant MCS80-05144, and the Defense Advanced Research Projects Agency (DoD) under ARPA Order No. 4031 monitored by Naval Electronic System Command under Contract No. N00039-82-C-0235.

"A Fast File System for UNIX" acknowledgements: William N. Joy, Samuel J. Leffler, Robert S. Fabry, Marshall Kirk McKusick, Robert Elz, Michael Powell, Peter Kessler, Robert Henry, and Dennis Ritchie. This work was done under grants from the National Science Foundation under grant MCS80-05144, and the Defense Advanced Research Projects Agency (DoD) under ARPA Order No. 4031 monitored by Naval Electronic System Command under Contract No. N00039-82-C-0235.

"4.2BSD Networking Implementation Notes" acknowledgements: The internal structure of the system is patterned after the Xerox PUP architecture [Boggs79]. The use of software interrupts for process invocation is based on similar facilities found in the VMS operating system. Many of the ideas are based on Rob Gurwitz's TCP/IP implementation for the 4.1BSD version of UNIX on the VAX [Gurwitz81]. Greg Chesson explained his use of trailer encapsulations in Datakit, instigating their use in our system.

"SENDMAIL - An Internetwork Mail Router" acknowledgements: For their ideas and assistance, Kurt Shoens, Bill Joy, Mark Horton, Eric Schmidt, Kirk McKusick, Marvin Solomon, Mike Stonebraker, and Bob Epstein. A considerable part of this work was done while under the employ of the INGRES Project at the University of California at Berkeley.

BEFORE YOU START

This is the first volume of ULTRIX-32 Supplementary Documents, a three-volume set that contains articles describing the ULTRIX-32 system. The authors are computer scientists and program developers at Bell Laboratories and the University of California at Berkeley.
The articles explain the software tools and utilities available on your ULTRIX-32 system. They constitute most of the lore that enriches this operating system; topics range from getting-started procedures to the details of screen updating and cursor movement facilities.

Each volume in this set contains several parts, and each part begins with an introduction. Each introduction serves as a map that will help you find your way around in the documentation, allowing you to select articles that relate to your interest. Each introduction gives an overview of the material covered in the part and a description of the articles included.

Most readers will not need to read all articles in any part, since many articles cover parallel topics. For example, Part 3 in this first volume contains articles describing several text editors. You should be able to choose one editor after reading the introduction; then you can proceed to the relevant article.

These articles provide authoritative and accurate information that is unavailable elsewhere. However, you should be aware that some of the information in some articles is dated. We include those articles because many of the concepts they develop are still current and important.

At the end of each volume in this set, you will find a master index identifying topics for all three volumes.

Topics in Volume I

This first volume contains articles written for general use. You should find many of the articles helpful no matter how you plan to use your ULTRIX-32 system.

The two articles in Part 1 introduce the entire three-volume set; however, readers who are unfamiliar with operating systems and programming and readers new to the ULTRIX-32 and UNIX systems should begin with Part 2, Getting Started. The articles introduce basic concepts and demonstrate simple procedures.

You will need to use a text editor if you plan to write (create or modify) files.
Part 3, Text Editors, gives comprehensive information on five editors: ed, edit, vi, ex, and sed.

Articles in Part 4, Command Interpreters, introduce the two shells provided with the ULTRIX-32 system: the Bourne Shell and the C Shell. Each shell serves as a set of handles that gives the user access to the ULTRIX-32 utilities.

If you intend to use your ULTRIX-32 system to write and format any kind of document, you will find the articles on Document Preparation in Part 5 essential. Nroff and troff are text formatting utilities. In addition, the ULTRIX-32 software includes separate utilities that cooperate with the formatters to help you typeset mathematical expressions, set up tables, and create bibliographical references in your text.

Part 6 includes articles that tell about a variety of unsupported software.

Table of Contents

BEFORE YOU START

PART 1: OVERVIEW

UNIX/32V - SUMMARY
  WHAT'S NEW: HIGHLIGHTS OF THE UNIX/32V SYSTEM
  HARDWARE
  SOFTWARE
    Basic Software
    Operating System
    User Access Control
    Terminal Handling
    File Manipulation
    Manipulation of Directories and File Names
    Running of Programs
    Status Inquiries
    Backup and Maintenance
    Accounting
    Communication
    Basic Program Development Tools
    UNIX/32V Programmer's Manual
    Computer-Aided Instruction
    Languages
    The C Language
    Fortran
    Other Algorithmic Languages
    Macroprocessing
    Compiler-Compilers
    Text Processing
    Document Preparation
    Document Formatting
    Information Handling
    Graphics
    Novelties, Games, and Things That Didn't Fit Anywhere Else

THE UNIX TIME-SHARING SYSTEM
  INTRODUCTION
  HARDWARE AND SOFTWARE ENVIRONMENT
  THE FILE SYSTEM
    Ordinary Files
    Directories
    Special Files
    Removable File Systems
    Protection
    I/O Calls
  IMPLEMENTATION OF THE FILE SYSTEM
  PROCESSES AND IMAGES
    Processes
    Pipes
    Execution of Programs
    Process Synchronization
    Termination
  THE SHELL
    Standard I/O
    Filters
    Command Separators: Multitasking
    The Shell as a Command: Command Files
    Implementation of the Shell
    Initialization
    Other Programs as Shell
  TRAPS

PART 2: GETTING STARTED

UNIX FOR BEGINNERS - SECOND EDITION
  GETTING STARTED
    Logging In
    Typing Commands
    Strange Terminal Behavior
    Mistakes in Typing
    Read-Ahead
    Stopping a Program
    Logging Out
    Mail
    Writing To Other Users
    On-Line Manual
    Printing Files
    Shuffling Files About
    What's in a Filename
    Using Files Instead of the Terminal
    Pipes
    The Shell
  DOCUMENT PREPARATION
    Formatting Packages
    Supporting Tools
    Hints for Preparing Documents
  PROGRAMMING
    The Shell
    Programming the Shell
    Programming in C
    Other Languages
  UNIX READING LIST
    General
    Document Preparation
    Programming

MAIL REFERENCE MANUAL
  INTRODUCTION
  COMMON USAGE
  MAINTAINING FOLDERS
  MORE ABOUT SENDING MAIL
    Tilde Escapes
    Network Access
    Special Recipients
    Message Lists
    List of Commands
    Custom Options
  COMMAND LINE OPTIONS
  FORMAT OF MESSAGES
  GLOSSARY
  SUMMARY OF COMMANDS, OPTIONS, AND ESCAPES
  CONCLUSION

BC - AN ARBITRARY PRECISION DESK-CALCULATOR LANGUAGE
  INTRODUCTION
  SIMPLE COMPUTATIONS WITH INTEGERS
  BASES
  SCALING
  FUNCTIONS
  SUBSCRIPTED VARIABLES
  CONTROL STATEMENTS
  SOME DETAILS
  APPENDIX
    Notation
    Tokens
    Comments
    Identifiers
    Keywords
    Constants
    Expressions
    Primitive Expressions
    Named Expressions
    Identifiers
    Array-Name
    Scale, Ibase and Obase
    Function Calls
    Function-Name
    Sqrt
    Length
    Scale
    Constants
    Parentheses
    Unary Operators
    Exponentiation Operator
    Multiplicative Operators
    Additive Operators
    Assignment Operators
    Relations
    Storage Classes
    Statements
    Expression Statements
    Compound Statements
    Quoted String Statements
    If Statements
    While Statements
    For Statements
    Break Statements
    Auto Statements
    Define Statements
    Return Statements
    Quit Statements

DC - AN INTERACTIVE DESK CALCULATOR
  SYNOPTIC DESCRIPTION
  DETAILED DESCRIPTION
    Internal Representation of Numbers
    The Allocator
    Internal Arithmetic
    Addition and Subtraction
    Multiplication
    Division
    Remainder
    Square Root
    Exponentiation
    Input Conversion and Base
    Output Commands
    Output Format and Base
    Internal Registers
    Stack Commands
    Subroutine Definitions and Calls
    Internal Registers - Programming DC
    Push-Down Registers and Arrays
    Miscellaneous Commands
  DESIGN CHOICES

PART 3: TEXT EDITORS

EDIT: A TUTORIAL
  INTRODUCTION
  SESSION 1
    Making Contact with UNIX
    Directly Linked Terminals
    Dial-Up Terminals
    Logging In
    Asking for Edit
    The "Command not found" Message
    A Summary
    Entering Text
    Messages from Edit
    Text Input Mode
    Making Corrections
    Writing Text to Disk
    Signing Off
  SESSION 2
    Adding More Text to the File
    Interrupt
    Making Corrections
    Listing What's in the Buffer
    Finding Things in the Buffer
    The Current Line
    Numbering Lines
    Substitute Command
    Another Way To List What's in the Buffer
    Saving the Modified Text
  SESSION 3
    Bringing Text into the Buffer
    Moving Text in the Buffer
    Copying Lines
    Deleting Lines
    A Word or Two of Caution
    Undo to the Rescue
    Moving Around in the Buffer
    Changing Lines
  SESSION 4
    Making Commands Global
    More about Searching and Substituting
    Special Characters
    Issuing UNIX Commands from the Editor
    Filenames and File Manipulation
    The File Command
    Reading Additional Files
    Writing Parts of the Buffer
    Recovering Files
    Other Recovery Techniques
  FURTHER READING AND OTHER INFORMATION
    Using Ex

A TUTORIAL INTRODUCTION TO THE UNIX TEXT EDITOR
  INTRODUCTION
  DISCLAIMER
  GETTING STARTED
  CREATING TEXT - THE APPEND COMMAND "A"
  ERROR MESSAGES - "?"
  WRITING TEXT OUT AS A FILE - THE WRITE COMMAND "W"
  LEAVING ED - THE QUIT COMMAND "Q"
  READING TEXT FROM A FILE - THE EDIT COMMAND "E"
  READING TEXT FROM A FILE - THE READ COMMAND "R"
  PRINTING THE CONTENTS OF THE BUFFER - THE PRINT COMMAND "P"
  THE CURRENT LINE - "DOT" OR "."
  DELETING LINES - THE "D" COMMAND
  MODIFYING TEXT - THE SUBSTITUTE COMMAND "S"
  CONTEXT SEARCHING - "/.../"
  CHANGE AND INSERT - "C" AND "I"
  MOVING TEXT AROUND - THE "M" COMMAND
  THE GLOBAL COMMANDS - "G" AND "V"
  SPECIAL CHARACTERS
  SUMMARY OF COMMANDS AND LINE NUMBERS

ADVANCED EDITING ON UNIX
  INTRODUCTION
  SPECIAL CHARACTERS
    The List Command
    The Substitute Command
    The Undo Command
    The Metacharacter
    The Backslash
    The Dollar Sign
    The Circumflex
    The Star
    The Brackets
    The Ampersand
    Substituting Newlines
    Joining Lines
    Rearranging a Line with ( ... )
  LINE ADDRESSING IN THE EDITOR
    Address Arithmetic
    Repeated Searches
    Default Line Numbers and the Value of Dot
    Semicolon
    Interrupting the Editor
  GLOBAL COMMANDS
    Multi-Line Global Commands
  CUT AND PASTE WITH UNIX COMMANDS
    Changing the Name of a File
    Making a Copy of a File
    Removing a File
    Putting Two or More Files Together
    Adding Something to the End of a File
  CUT AND PASTE WITH THE EDITOR
    Filenames
    Inserting One File into Another
    Writing Out Part of a File
    Moving Lines Around
    Marks
    Copying Lines
    The Temporary Escape
  SUPPORTING TOOLS
    Grep
    Editing Scripts
    Sed

AN INTRODUCTION TO DISPLAY EDITING WITH VI
  GETTING STARTED
    Specifying Terminal Type
    Editing a File
    The Editor's Copy: The Buffer
    Notational Conventions
    Arrow Keys
    Special Characters: ESC, CR and DEL
    Getting Out of the Editor
  MOVING AROUND IN THE FILE
    Scrolling and Paging
    Searching, Goto, and Previous Context
    Moving Around on the Screen
    Moving within a Line
    Summary
    View
  MAKING SIMPLE CHANGES
    Inserting
    Making Small Corrections
    More Corrections: Operators
    Operating on Lines
    Undoing
    Summary
  MOVING ABOUT; REARRANGING AND DUPLICATING TEXT
    Low Level Character Motions
    Higher Level Text Objects
    Rearranging and Duplicating Text
    Summary
  HIGH LEVEL COMMANDS
    Writing, Quitting, Editing New Files
    Escaping to a Shell
    Marking and Returning
  ADJUSTING THE SCREEN
  SPECIAL TOPICS
    Editing on Slow Terminals
    Options, Set, and Editor Startup Files
    Recovering Lost Lines
    Recovering Lost Files
    Continuous Text Input
    Features for Editing Programs
    Filtering Portions of the Buffer
    Commands for Editing LISP
    Macros
  WORD ABBREVIATIONS
  NITTY-GRITTY DETAILS
    Line Representation in the Display
    Counts
    More File Manipulation Commands
    More about Searching for Strings
    More about Input Mode
    Upper Case Only Terminals
    Vi and Ex
    Open Mode: Vi on Hardcopy Terminals and "Glass TTY's"
  APPENDIX: CHARACTER FUNCTIONS

EX REFERENCE MANUAL
  STARTING EX
  FILE MANIPULATION
    Current File
    Alternate File
    Filename Expansion
    Multiple Files and Named Buffers
    Read Only
  EXCEPTIONAL CONDITIONS
    Errors and Interrupts
    Recovering from Hangups and Crashes
    Editing Modes
  COMMAND STRUCTURE
    Command Parameters
    Command Variants
    Flags After Commands
    Comments
    Multiple Commands per Line
    Reporting Large Changes
  COMMAND ADDRESSING
    Addressing Primitives
    Combining Addressing Primitives
  COMMAND DESCRIPTIONS
  REGULAR EXPRESSIONS AND SUBSTITUTE REPLACEMENT PATTERNS
    Regular Expressions
    Magic and Nomagic
    Basic Regular Expression Summary
    Combining Regular Expression Primitives
    Substitute Replacement Patterns
  OPTION DESCRIPTIONS
  LIMITATIONS
  EX CHANGES - VERSION 3.1 TO 3.5
    Update to Ex Reference Manual
    Command Line Options
    Commands
    Options
    Environment Enquiries
    Vi Tutorial Update
    Change in Default Option Settings
    Deleted Features
    Vi Commands
    Macros

SED - A NONINTERACTIVE TEXT EDITOR
  OVERALL OPERATION
    Command-Line Flags
    Order of Application of Editing Commands
    Pattern-Space
    Examples
  ADDRESSES: SELECTING LINES FOR EDITING
    Line-Number Addresses
    Context Addresses
    Number of Addresses
  FUNCTIONS
    Whole-Line Oriented Functions
    Substitute Function
    Input-Output Functions
    Multiple Input-Line Functions
    Hold and Get Functions
    Flow-of-Control Functions
  MISCELLANEOUS FUNCTIONS

PART 4: COMMAND INTERPRETERS

AN INTRODUCTION TO THE UNIX SHELL
  INTRODUCTION
    Simple Commands
    Background Commands
    Input Output Redirection
    Pipelines and Filters
    File Name Generation
    Quoting
    Prompting
    The Shell and Login
    Summary
  SHELL PROCEDURES
    Control Flow - For
    Control Flow - Case
    Here Documents
    Shell Variables
    The Test Command
    Control Flow - While
    Control Flow - If
    Command Grouping
    Debugging Shell Procedures
    The Man Command
  KEYWORD PARAMETERS
    Parameter Transmission
.. Parameter Substitution . . . . . . . . . . . L, 4-17 Command Substitution . . . . . . . . . L s 4-18 Evaluation and Quoting. . 4-19 . . . . . . . . Error Handling . . . . . . . . . . . . s 4-21 Fault Handling. . . . . . . . . . . . . e 4-21 Command Execution . . . . . . . . . . . s, 4-23 Invoking the Shell . . . . . . . . .. e 4-24 APPENDIX A: GRAMMAR . . . . . . s, 4-26 . . . . L. . Lo e APPENDIX B: METACHARACTERS AND RESERVED WORDS . . . . . . . . . . ... . .. 4-27 ... 4-30 ... 4-30 AN INTRODUCTION TO THE C SHELL TERMINAL USAGE OF THE SHELL . The Basic Notion of Commands. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Flag Arguments . . . . . . . . . . . L s, 4-31 Output to Files. . . . . . . . . . . . s 4-31 Metacharacters in the Shell . . . . . . . . . . . . . .. ... Input from Files: Pipelines . . . . . . . . . . . . . . . Filenames . . . . . . . . . . L Quotation . . . . . . L L L L . . . . . . Terminating Commands What Now? . . . . . . . . . . . Shell Variables . . . . . The Shell’s History List. Allases . . . . . . . . . . . . . . . . . . . . . 4-32 4-32 . . s . . e . . . e 4-35 4-38 . . . . . . . .. . . . . . ... 4-39 ... 4-39 . . . . . . . . . .. . . . . . . . . . . . . s, 4-43 . . . . .. .. ..o 4-44 . 4-45 4-40 4-41 s . . . . . . . . . . . . . . ... e Useful Built-In Commands . . . . . . . . . . . . . . . . . . . . . . 4-33 4-35 s Jobs: Background, Foreground, or Suspended Working Directories. ... I ..o L More Redirection: >> and >&. .. e, DETAILS ON THE SHELL FOR TERMINAL USERS . Shell Startup and Termination . . . . . . . . ... e e e 4-48 ..., 4-50 e . . e . ... Table of Contents xxi A GUIDE TO PREPARING DOCUMENTS WITH -M$S COMMANDS FORATM . . . . . . . . . o s . . . . . . AN INTERNAL MEMORANDUM. . . . . oo, 5-14 HEADINGS. .. ... S 5-14 o e . . . . . . . . . . . . . .. ... ... 5-14 . . . . . . . . A SIMPLE LIST . . . . . . . e 5-15 DISPLAYS . . . o e e 5-15 . . . . MULTIPLE INDENTS . . FOOTNOTES. KEEPS . . . . . . s . . . 
o . . . . . o .. e oo e e e s SOME REGISTERS YOU CAN CHANGE . . o 5-15 5-15 . . s e e e . . e 5-15 . . e e DOUBLE COLUMN. EQUATIONS . . . 5-13 A RELEASED PAPER WITH MATHEMATICS . . . . . . . . . e e e s e e . . .. .. 5-15 5-16 ... 5-16 TABLES . . . . e e 5-16 USAGE . . . . s e e e 5-16 A REVISED VERSION OF -MS WRITING PAPERS WITH NROFF USING -ME BASICS OF TEXT PROCESSING. BASIC REQUESTS . . . Paragraphs. . . . . . 5-22 . . . e e e . . . . . . . . . . . . o oo 5-22 . . . . e e s e e e 5-22 Headers and Footers . . . . . . 5-23 Double Spacing. . . . . . . e, . . . . . . . . e e e e e 5-23 . . e e e e e 5-23 e e e Page Layout . . . . . . . . Underlining . . . . . . . . e 5-25 o s e e e s 5-25 DISPLAYS . . . Major Quotes. Lists . . Keeps . . . . . . . ANNOTATIONS Footnotes . . . . . L e e e e e 5-25 5-25 . . . . e e e e e e e Fancier Displays . . e e e e . . . .. s 5-26 . . . . .. e 5-26 . e 5-28 5-28 . . . . . . . . . . . . . . . . . oo . . ... . . . . . e e e e e Delayed Text. . . . . . . . . Indexes . L . . . . FANCIER FEATURES . . . . . . More Paragraphs . . . . . . Section Headings . . . . . . Parts of the Basic Paper . Equations and Tables. . Two-Column Output . Defining Macros o e e e e e e s e e 5-28 5-28 s o e e e 5-29 . . . . L 5-29 . . . . . . . . . . . . . . . . . . . . . . . . . . . ..o 5-33 . . . . . . . . . . . e 5-35 . . . . . . . .. Annotations Inside Keeps . . . . . . . . . . e L .. e e e e e 5-31 ..o, 5-32 e e s e e e 5-35 5-35 TROFF AND THE PHOTOSETTER. . . . . . . . . . . .. .. .. .. ........ 536 . . Point Sizes. . . . e L e Fonts . . e e e e e, 5-36 . 5-38 . . . . o e e e e e e xxil Table of Contents -ME REFERENCE MANUAL PARAGRAPHING. . . . . . . e 5-40 SECTION HEADINGS . . . . . . . . . e, 5-40 HEADERS AND FOOTERS . . . . . . . . DISPLAYS . . . . . ANNOTATIONS . o . . e, 5-41 e e e e, 5-42 . . o COLUMNED OUTPUT . . . . . . . . FONTS AND SIZES ROFF SUPPORT . . . . . e s . . e s e s e s e e 5-43 5-43 . . . . . e 5-44 . . . . 
e e e e, 5-44 PREPROCESSOR SUPPORT . . . . .. 5-45 MISCELLANEOUS . . . . o 5-45 . . . . . . . . . . . . . PREDEFINED STRINGS . . . . . . . . STANDARD PAPERS. . . SPECIAL CHARACTERS AND MARKS. . . . . . . . . o s e e, o oo 5-45 . .. o e, 5-47 . . . . . . . . . . . . .. . . ... e . . .. . ... ... .... 5-47 e e e 5-56 NROFF/TROFF USER’S MANUAL GENERAL EXPLANATION. Form of Input . . . . . . . . . . . . . ., 5-56 . . . . . . Formatter and Device Resolution . . . . . . ... 5-56 . . . . Lo, 5-56 . . . . . . . . . . . . . . . . .. Notation . L . . . . . . . . Character Set . . . . ... L, . . . . . . . . . . . . . . 5-57 e, 5-57 . 5-57 .. . ..., . . . . . . . . . . .. L 5-57 . L, 5-58 . 5-58 Character Size . PAGE CONTROL. . e FONT AND CHARACTER SIZE CONTROL . Fonts . e . Numerical Expressions . . e . Numerical Parameter Input . . e . . . . . . . L s . . . . . . . . . TEXT FILLING, ADJUSTING, AND CENTERING e s 5-59 . 5-60 . . . T Filling and Adjusting . . . . . . . . . .., 5-60 Interrupted Text . . . . . . . . . . s .. 9-60 VERTICAL SPACING. . . . . . . . o . . . Base-Line Spacing . . . . Extra Line-Space. . . . . e . . e e . . . . . LINE LENGTH AND INDENTING . . . . . Blocks of Vertical Space . . . . . . e Copy Mode Input Interpretation. . . . L L L Diversions . . . . . . L L Traps . . . . NUMBER REGISTERS . L . . s . L, 5-61 . 561 e 5-61 . .. 5-61 . . .. 5-62 . .. 5-62 ..o 5-63 s 5-63 e e . . . . . . L . . . . . e s o MACROS, STRINGS, DIVERSION, AND POSITION TRAPS. Arguments . s . . . . . . . . . . . . . . ... . . . . e 5-63 5-64 . . . . . . . o o . 5-65 Table of Contents xxiii Tabs and Leaders. . . . . . . . . . 5-66 Fields 5-66 . . . . . . . . . . .. e s e e, INPUT AND OUTPUT CONVENTIONS AND CHARACTER TRANSLATIONS. Input Character Translations . Ligatures. . . . . . . . . L o . . . . . . . . . 5-66 ..o e e e 5-66 5-66 . . . . . . .. .. .. .. .. e e e e e e e e e e e e . . . . .00 . . . . . . . . . . . . ... ... 5—-66 5-67 5-67 5-67 5-67 . . . . . 
Lo Backspacing, Underlining, Overstriking, Etc.. . . Control Characters . . . . . . . . . . . . Output Translation. . . . . . . . . e e e e Transparent Throughput . . . . . . . . . . . . Comments and Concealed Newlines . . . . . . . . . e . . . . e . LOCAL HORIZONTAL AND VERTICAL MOTIONS, AND THE WIDTH FUNCTION Local Motions . . . . . . . .« . . . . . . . . . . . . Mark Horizontal Place . . . . . . . Width Function e e e e e L . . . . . . . . e . 5-67 e e 5—67 e e 5-68 e 5-68 s e e oo OVERSTRIKE, BRACKET, LINE-DRAWING, AND ZERO-WIDTH FUNCTIONS Overstriking . . . . . . Zero-Width Characters . . . . . . . . . . . Large Brackets . Line Drawing. . . . . . . . . . . . . .. . . . . . . THREE PART TITLES . . HYPHENATION . . . . . . . . . 5-68 e e e e e 5-68 5-68 ..o Lo L e e e e e e e e 5-68 5-68 e . . . . e e e . . . . o s e e e e . . . . . . . . . . o oo o OUTPUT LINE NUMBERING . . . . . . . . o o e e CONDITIONAL ACCEPTANCE OF INPUT . . . . . . . . . . . . . . . ENVIRONMENT SWITCHING . . . . . . . . . o oo o INSERTIONS FROM THE STANDARD INPUT . . . . . . . . . . . . . e e e, e e e e . .. ... e e e . . . .. ... 5-69 5-70 5-70 5-71 5-T71 5-72 INPUT/OUTPUT FILE SWITCHING e . . . MISCELLANEOUS . . . . . . . OUTPUT AND ERROR MESSAGES . . . . . . . . . . o o0 o e . . o e . e . . e o . Introduction . . . . . . .o Page Margins. . . . . . . . . . L Paragraphs and Headings . . . . . . . . . . . . Multiple Column Output . . . . . . . . . . . . Footnote Processing. . . . . . . . . . . . . . . TUTORIAL EXAMPLES The Last Page . . . . s . . . . . . . . . . . . . . . e e 5-72 s oo e e 5-72 5-73 e e 5-74 s e e e e e e e 5-74 e e e 5-74 Lo 5-75 e e e e e e e . . . . . ..o . oL o . e e e e e e s e SUMMARY OF CHANGES TO N/TROFF SINCE OCTOBER 1976 MANUAL. . . e 5-75 e 5-76 e 5-T7 . 5-81 . . . A TROFF TUTORIAL INTRODUCTION . . . . . . . POINT SIZES; LINE SPACING . o o . . o . . . s . . . . . . e e . . FONTS AND SPECIAL CHARACTERS . . . . . . . . . . . INDENTS AND LINE LENGTHS . . . . . . . . 
. o o TABS. . . . . e e e oo 585 5-86 e e e 5-86 . . . . o . . 5-83 5-84 oo, . . e, oo, ... ... . LOCAL MOTIONS: DRAWING LINES AND CHARACTERS. STRINGS. s o . . .. oo . . . o . . .. . . .. e s e e e INTRODUCTION TO MACROS. . . . . . TITLES, PAGES AND NUMBERING . . . NUMBER REGISTERS AND ARITHMETIC. 5-87 5-88 . . . .. . .. e 5-89 . . . . . . . . . . . . . . . . . o . . . . o oo .. 5-90 .. .. .. 5-91 xxiv Table of Contents A TROFF TUTORIAL (continued) MACROS WITH ARGUMENTS . . CONDITIONALS . . . ENVIRONMENTS . . . DIVERSIONS. . . . . . . . . . . . . . . . . . o . . . . . . . e 5-94 o e e e 5-94 . o oottt . . . . . . . LANGUAGE DESIGN. . . . . . . . . ... 5-96 it 5-97 it . . . . e . . . . . . . . 5-98 i 5-98 5-99 .o e, . . . . . . . s, LANGUAGE THEORY . . . . . . . EXPERIENCE . 5-93 . PHOTOCOMPOSITION. THE LANGUAGE. . 5-92 e o APPENDIX A: PHOTOTYPESETTER CHARACTERSET . INTRODUCTION . e, o o e e . o e e . . CONCLUSIONS . . i, 5-101 . . . . . . .. 5-102 . . . . o . oo 5-103 TYPESETTING MATHEMATICS - USER’S GUIDE INTRODUCTION . o o oo oo e, 5-105 DISPLAYED EQUATIONS . . . . . . . o INPUT SPACES o o . . o o o o o OUTPUT SPACES . . . . . o . . . . . . . o o BRACES FOR GROUPING . . . . FRACTIONS o e e . o o o o SQUARE ROOTS . . . . o o o oo SUMMATION, INTEGRAL, ETC.. SIZE AND FONT CHANGES . . DIACRITICAL MARKS . . QUOTED TEXT 5-106 . . . . . . . . . o o o oo 5-106 .\ oo . . . . . . . o o oo o o oo BIG BRACKETS, ETC. . . o o o o oo MATRICES. e . . . . . . o . . . . oo i 5-108 5-108 5-109 e 5-109 5-110 5-110 . o, 5-111 SHORTHAND FOR IN-LINE EQUATIONS DEFINITIONS e 5-107 s e 5-109 o o i 5-106 5-107 © o o 5-107 oo o o o .« o . 5-106 . . o o . . o . . . oo . . . . LINING UP EQUATIONS . PILES . . 5-105 o SUBSCRIPTS AND SUPERSCRIPTS . s 5-105 e oo SYMBOLS, SPECIAL NAMES, GREEK . SPACES, AGAIN . o . . . . . . . . o i i 5-111 . o o o oo oo o LOCAL MOTIONS . . . o o oo o o 5-112 ALARGE EXAMPLE . . . . . o . 5-112 . 5-111 KEYWORDS, PRECEDENCES, ETC. . TROUBLESHOOTING . USE ON UNIX . . 
. . o o i oot 5-112 o 5-114 oo INTRODUCTION . . . . . . o o INPUT COMMANDS . . . . . . . . o oo . . i 5-113 o . . o . . . v . USAGE . . o o e e e e e s s e e 5-115 e 5-116 o e e e e 5-120 Table of Contents xxv REFER - A BIBLIOGRAPHY SYSTEM INTRODUCTION . . . . . . o o o o o o o e DATA ENTRY WITH ADDBIB . . . . . . . .« .« . PRINTING THE BIBLIOGRAPHY. . . . . . . . . . . CITING PAPERS WITH REFER . . . . . . . . . . . REFER’S COMMAND-LINE OPTIONS . . . . . . . . MAKING AN INDEX . . . . . . o o o oo o o REFER BUGS AND SOME SOLUTIONS . . . . . . . INTERNAL DETAILS OF REFER. . . . . . . . . .. CHANGING THE REFER MACROS. . . . . . . . . . s e e e e e e e e 5-133 . o o o o oo 5-134 . . o . ... 5-135 . . o o o« oo .. 5-136 . . . . oo, 5-137 e e 5-137 . . . . . o ... 5-138 e e e e e e e e e 5-139 . .« o o .. 5-141 SOME APPLICATIONS OF INVERTED INDEXES ()N THE UNIX SYSTEM INTRODUCTION . . SEARCHING . . . . . . . . o s e e e Make Keys. . . . . . . . . Hash and Invert . . . . . . Searching and Retrieving . . . . . . . . . . . . . . . . e e e d e e e e e e e e e e e e 5-144 e e 5-144 e e e e 5-147 e e e e e e 5-147 . . . Lo Lo 5-148 SELECTING AND FORMATTING REFERENCES FORTROFF . . . . . . . . . . . .. 5-150 REFERENCE FILES . . . . . . s e COLLECTING REFERENCES AND OTHER REFER OPTIONS . . . e s . . . e . e 5-151 . . . . .. 5-154 UPDATING PUBLICATION LISTS INTRODUCTION . . . . . . . o o o PUBLICATION FORMAT . . . . . . UPDATING AND RE-INDEXING. . PRINTING A PUBLICATION LIST . o . . . oo . . . . . . o o o . . . . . . . . . . e e e e e e e e e e e 5-155 oo e e e 5-155 . . .« o o o o oo 5-157 . . o o o oo 5-161 WRITING TOOLS - THE STYLE AND DICTION PROGRAMS INTRODUCTION . . . . . o e e e e e e e e e e e s s e 5-163 STYLE . . . . o e e e e e 5-163 What is a Sentence? . . . . . . . . . O 5-164 Readability Grades . . . . . . . . . . . . L ..o e e e 5-165 Sentence Length and Structure . . . . . . . . . . . . . .00 00000 5-166 Word Usage . . . . . . . . . . Sentence Openers. . . . . . . 
. e . . . . e e 5-168 DICTION . . EXPLAIN. . RESULTS . . . . . . . . . . . . STYLE . . . . e e e e e e e 5-170 . . . . . . DICTION . . . . . . . o o ACCURACY . o e e e e e 5-167 oo e e e e e e e e . o e e e e e e e e 5-169 e e e e e s e 5-170 . .. ... P 5-170 e o . . 5-171 . o o e e e e e e 5-172 Sentence Identification . . . . . . . . . . L L. e e e e 5-172 Sentence Types. . . . . . . . L e e e e e e 5-172 Word Usage . . . . . . . . . L e e e e e 5-172 xxvi Table of Contents WRITING TOOLS - THE STYLE AND DICTION PROGRAMS (continued) TECHNICAL DETAILS . Finding Sentences . . Details of DICTION CONCLUSIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ... . ... e 5-172 oo . e e e e e e e e s s APPENDIX 1: STYLE ABBREVIATIONS . . . APPENDIX 2: DEFAULT DICTION PATTERNS PART 6: . e e e e e e e 5-172 e 5-173 e e e . . . . . . . . . . .. .. s 5-173 . . . . . . . . . . .. .. ... e e ... 5-175 ... 5-176 MISCELLANEOUS LEARN - COMPUTER-AIDED INSTRUCTION ON UNIX INTRODUCTION . . . . . . o o oo oo EDUCATIONAL ASSUMPTIONS AND DESIGN. SCRIPTS . . . . o . o . . . s . . EXPERIENCE WITH STUDENTS . . . . . . . . . . . . . . . . . . . o s e e e APPENDIX A: HOW TO GET STARTED . . . . . . . . . s e . e . . . e ... e 6-3 ... .. .. 6-4 e s e e e 6-6 THE SCRIPT INTERPRETER CONCLUSIONS . . . . . . . o . .« oo o oo s . o oo o 6-8 o oo oo 6-8 e e e e 6-12 . .. . ... 6-15 s A GUIDE TO THE DUNGEONS OF DOOM INTRODUCTION . . . . o WHAT IS GOING ON HERE?. . . . . . . e . . . e . e s e 6-17 e e e 6-17 . . . .. 6-18 e 6-18 e 6-18 e 6-19 . o e s o o . . WHAT DO ALL THOSE THINGS ON THE SCREEN MEAN?. The Bottom Line. . . . . . . . . . . . . . . . . . . The Rest of the Screen . . . . . . . The Top Line . . . . . . e e . . . s s . . s e . . . e e e e . e e e e e COMMANDS . . e e s e s e e 6-19 ROOMS . . FIGHTING . e e e e s s e e e . . . . e e e e s e e s e s e 6-21 . . . . . e OBJECTS YOU CAN FIND . . . . . . . . Weapons . . . . ATMOr . . L L L Scrolls . . . . Potions. . . 
. . . . o L e e o e e o e e e e e e e e e e e e s e 6-21 e 6-22 6-22 6-22 . . . 6-22 Staves and Wands . e . . . . . . e e e e e L e e e 6-23 . . Food. . . . . . e e e e s e e 6-23 OPTIONS. . . . e s e s 6-23 . . . . . . . . . Using the ‘O’ Command . . . . . . . . . . . . . . . . . . . . . . . Using the ROGUEOPTS Variable Option List. SCORING . . . . . . . . e . . . . s s e 6-23 . . e e . . e e . Setting the Options. e e, Rings . e e . L e e e s e e . e e L . e e e . e e 6-21 e e e e, 6-23 .. e e 6-23 e e e e 6-23 . . . e e e e 6-24 . . s e e e 6-24 Table of Contents xxvii BERKELEY FONT CATALOGUE INTRODUCTION . . . . . . oo APL FONT, 10 POINT ONLY . oo . . . . . . . . . s . o BASKERVILLE FONT, ROMAN, IBOLD, ITALIC, 12 POINT ONLY . . BOCKLIN FONT, 14 AND 28 POINT ONLY. . . . . . . . CHESS, 18 POINT ONLY. . . . . . . . . . CLARENDON, 14 AND 18 POINT ONLY . . . . . . . . . s, 6-27 . 6-29 . . . . .. .. .. ... .. . . . . . . . BODONI FONT, ROMAN, BOLD, ITALIC, 10 POINT ONLY. e o . . . . . . . . . . . . .. . 6-29 6-30 ... 6-31 . 6-32 . . . . . .. ... ... 6-33 COMPUTER MODERN FONTS, ROMAN, ITALIC, ANDBOLD . . . . . . . ... ... 6-34 COUNTDOWN, . . . . . . .. ... 6-35 . 6-35 22 POINT, UPPERCASE LETTERSONLY . CYRILLIC, 12 POINT ONLY . . . . . . . . DELEGATE, ROMAN, ITALIC, ANDBOLD. . . . . . . . . . . . . . . . . . . . ... .. .. .. .... 6-35 . . . . . . . . . . . . . . ... 6-36 . . . . . . . . . . . ... ... 6-38 o 6-38 GACHAM, ROMAN, BOLD, ITALIC, 10 POINT ONLY . . . FIX FIXED WIDTH FONT, 6, 9, 10, 12, 14 POINT. GREEK, 10 POINT ONLY . o HEBREW, 16, 24, AND 36 POINT ONLY . . 10 POINT HERSHEY . . . . . . . . . . HERSHEY FONT. . . . . . . o, . . . . . . . . . . . . . .. . ... . s .... 6-40 s ... 6-41 . . . . . . . . . . . . . ... 6-43 MICROGRAMMA FONT, 10 POINT ONLY . . . . . . . . .. . .. . ... . ... 6-44 MONA FONT, 24 POINT ONLY . . . . 6-44 . . . . . . . . . 6-42 METEOR, ROMAN, BOLD, ITALIC, 8, 10, 12 POINT . NONIE, ROMAN, BOLD, ITALIC, 8, 10, 12 POINT . . . . . . . . . . . . 
. OLD ENGLISH, 8, 14, AND 18 POINT ONLY . . . . . . . . . . . . . ... . . . . .. . . . . PIP FONT, 16 POINT ONLY, NO LOWER CASE . . . . . . PLAYBILL FONT, 18 POINT ONLY. . . . . . . SCRIPT, 18 POINT ONLY . . . . . . . . . . . .. . . . ... SHADOW, 16 POINT ONLY, NO LOWER CASE SIGN, 22 POINT ONLY. . STARE HERSHEY FONT. . . ... e . . . . . . USAGE. . . . . . . . e . . . 6-47 ... 6-48 ... 6-48 e e e 6-48 . . . 6-49 . . . . ... ... . . . . . . o 6-49 . . . . . . . . . .. 6-50 . 6-51 . . . e e . 6-45 .... . e . ... . LEXICAL CONVENTIONS . Identifiers . ... . . . . o TIMES FONTS, ROMAN, ITALIC, AND BOLD. 10 POINTONLY . INTRODUCTION . .. . . s . . s e . . . . . .. s, 6-53 e 6-53 . 6-53 . . . . . . . L L, 6-53 . . . . ., 6-54 Constants . . . . . . . . . . .. R 6-54 Operators . . . . . . . ., . Comments . . . . . . Temporary Symbols Blanks . SEGMENTS . . . . . . . . . . . L 6-54 6-54 L, 6-54 6-54 STATEMENTS . . . . Labels . . . o . . . THE LOCATION COUNTER . . . Null Statements . . . . . . . . . . . 6-55 6-55 6-55 . . . . . . . . . 6-55 Expression Statements . . . . . . . 6-55 . . . L L, . . . Assignment Statements. . . . . . . . . . ...... 6-56 String Statements . . . . L L L L L L L, 6-56 . . . . . . . . . . L, 6-56 . Keyword Statements . . . L . . . . . . . .. .. xxviii Table of Contents UNIX ASSEMBLER REFERENCE MANUAL (continued) e 6-56 Expression Operators . . . . . . . . . .« . o e e e e e e e e e e e e e TYPES . . . o o000 Type Propagation in Expressions . . . . . . . . . . . . 6-57 6-57 6-58 e- 6-59 e e e e byte. . . s e e e e e e e e e e e BVEIL. . o v v e e e e 3 P e e e e e e endif . . . . . globl . . . e e e e e e e e e e e e e e e e e LEXE . . . e e e e e e e e e e e e data. . . . . s e e e e e e e DSS . . e e e e e e e e e e e e e e COMM . . v v v e e e e e e e e 6-59 6-59 6-59 6-59 6-59 6-59 6-59 6-59 6-60 e e e e 6-60 e e e e . . . . . . . . o o e e e . . . . . . . . .. ..o e e e e e e e e e e e e e e . . . . . . . . . e . . . . . . . . . . . ..o . . . . . . 
. . . . . . o000 e e e e e e e e . . . . . . . . . oo . . oo . . . . . . . . . . . . 6-60 6-60 6-61 6-61 6-62 6-62 6-62 6-63 e 6-63 e e . Symbol . . L e e . . . . System Calls . . . . . . 6-63 6-64 e EXPRESSIONS . . . . . o PSEUDO-OPERATIONS e e e e . . . . Sources and Destinations . . . . Simple Machine Instructions . . Branch. . . . . . . Extended Branch Instructions. . Single Operand Instructions. . . Double Operand Instructions . . Miscellaneous Instructions . . . Floating-Point Unit Instructions. e OTHER SYMBOLS . . . . . DIAGNOSTICS . . . . . o o e e e e e e e e e e MACHINE INSTRUCTIONS . . . . e e e e e e e e e e e e e e s s e e e e e e e e e e 6 -64 Introduction 1-1 PART 1: OVERVIEW The first two articles in this volume introduce the entire three-volume set of ULTRIX Supplementary Documents. The article entitled “UNIX/32V — Summary” lists features of the UNIX system released in March 1979. ULTRIX-32 is based on the Berkeley 4.2BSD distribu- tion, which is in turn based on Bell Laboratories UNIX 32V and the UNIX 7th Edition. The second article, “The UNIX Time-Sharing System,” by Ritchie and Thompson, provides an overview and history of UNIX. system. The authors are the original developers of this software This article is suitable for readers who are familiar with computer software and operating systems. Although it describes UNIX as it was implemented in 1974, the article remains an important part of the UNIX documentation. With the exception of some details, it gives an accurate account of many of the concepts and features of ULTRIX-32. The authors convey the spirit of UNIX and ULTRIX-32, though the article includes some information that is no longer current. “The UNIX Time-Sharing System” explains these notable features of UNIX: » A pipe enables related processes to pass information between the related processes. o A filter takes its input from one process and delivers its output to another process. ¢« A shell serves as a user 'interface to the system. 
  • An image is a computer execution environment. A process is the execution
    of an image.
  • A process may create another process. The creating process is the parent;
    the created process is the child.

The article also tells how to:

  • Execute procedures in background, leaving your terminal free to perform
    other functions while the background procedures run.
  • Create user interfaces that serve as alternatives to the shells.
  • Set up restricted environments for some users.
  • Detect and deal with hardware and software errors.

Be sure to read the last part of “The UNIX Time-Sharing System” if you want to
know about the early stages of UNIX development. Ritchie and Thompson explain
their original goals and design considerations, and they identify important
steps in the evolution of the software system that forms the basis of
ULTRIX-32.


UNIX/32V — Summary
March 9, 1979

A. What’s new: highlights of the UNIX†/32V System

32-bit world. UNIX/32V handles 32-bit addresses and 32-bit data. Devices are
addressable to 2^31 bytes, files to 2^30 bytes.

Portability. Code of the operating system and most utilities has been
extensively revised to minimize its dependence on particular hardware. UNIX/32V
is highly compatible with UNIX version 7.

Fortran 77. F77 compiler for the new standard language is compatible with C at
the object level. A Fortran structurer, STRUCT, converts old, ugly Fortran into
RATFOR, a structured dialect usable with F77.

Shell. Completely new SH program supports string variables, trap handling,
structured programming, user profiles, settable search path, multilevel file
name generation, etc.

Document preparation. TROFF phototypesetter utility is standard. NROFF (for
terminals) is now highly compatible with TROFF. MS macro package provides
canned commands for many common formatting and layout situations. TBL provides
an easy to learn language for preparing complicated tabular material. REFER
fills in bibliographic citations from a data base.
UNIX-to-UNIX file copy. UUCP performs spooled file transfers between any two
machines.

Data processing. SED stream editor does multiple editing functions in parallel
on a data stream of indefinite length. AWK report generator does free-field
pattern selection and arithmetic operations.

Program development. MAKE controls re-creation of complicated software,
arranging for minimal recompilation.

Debugging. ADB does postmortem and breakpoint debugging.

C language. The language now supports definable data types, generalized
initialization, block structure, long integers, unions, explicit type
conversions. The LINT verifier does strong type checking and detection of
probable errors and portability problems even across separately compiled
functions.

Lexical analyzer generator. LEX converts specification of regular expressions
and semantic actions into a recognizing subroutine. Analogous to YACC.

Graphics. Simple graph-drawing utility, graphic subroutines, and generalized
plotting filters adapted to various devices are now standard.

Standard input-output package. Highly efficient buffered stream I/O is
integrated with formatted input and output.

Other. The operating system and utilities have been enhanced and freed of
restrictions in many other ways too numerous to relate.

† UNIX is a Trademark of Bell Laboratories.

B. Hardware

The UNIX/32V operating system runs on a DEC VAX-11/780* with at least the
following equipment:

    memory: 256K bytes or more.
    disk: RP06, RM03, or equivalent.
    tape: any 9-track MASSBUS-compatible tape drive.

The following equipment is strongly recommended:

    communications controller such as DZ11 or DL11.
    full duplex 96-character ASCII terminals.
    extra disk for system backup.

The system is normally distributed on 9-track tape. The minimum memory and disk
space specified is enough to run and maintain UNIX/32V, and to keep all source
on line.
More memory will be needed to handle a large number of users, big data bases,
diversified complements of devices, or large programs. The resident code
occupies 40-55K bytes depending on configuration; system data also occupies
30-65K bytes.

C. Software

Most of the programs available as UNIX/32V commands are listed. Source code and
printed manuals are distributed for all of the listed software except games.
Almost all of the code is written in C. Commands are self-contained and do not
require extra setup information, unless specifically noted as “interactive.”
Interactive programs can be made to run from a prepared script simply by
redirecting input. Most programs intended for interactive use (e.g., the
editor) allow for an escape to command level (the Shell). Most file processing
commands can also go from standard input to standard output (“filters”). The
piping facility of the Shell may be used to connect such filters directly to
the input or output of other programs.

1. Basic Software

This includes the time-sharing operating system with utilities, and a compiler
for the programming language C—enough software to write and run new
applications and to maintain or modify UNIX/32V itself.

1.1. Operating System

□ UNIX    The basic resident code on which everything else depends. Supports
          the system calls, and maintains the file system. A general
          description of UNIX design philosophy and system facilities appeared
          in the Communications of the ACM, July, 1974. A more extensive survey
          is in the Bell System Technical Journal for July-August 1978.
          Capabilities include:
          ○ Reentrant code for user processes.
          ○ “Group” access permissions for cooperative projects, with
            overlapping memberships.
          ○ Alarm-clock timeouts.
          ○ Timer-interrupt sampling and interprocess monitoring for debugging
            and measurement.
          ○ Multiplexed I/O for machine-to-machine communication.

□ DEVICES All I/O is logically synchronous. I/O devices are simply files in
          the file system.
          Normally, invisible buffering makes all physical record structure and
          device characteristics transparent and exploits the hardware’s
          ability to do overlapped I/O. Unbuffered physical record I/O is
          available for unusual applications. Drivers for these devices are
          available:
          ○ Asynchronous interfaces: DZ11, DL11. Support for most common ASCII
            terminals.
          ○ Automatic calling unit interface: DN11.
          ○ Printer/plotter: Versatec.
          ○ Magnetic tape: TE16.
          ○ Pack type disk: RP06, RM03; minimum-latency seek scheduling.
          ○ Physical memory of VAX-11, or mapped memory in resident system.
          ○ Null device.
          ○ Recipes are supplied to aid the construction of drivers for:
              Asynchronous interface: DH11.
              Synchronous interface: DU11.
              DECtape: TC11.
              Fixed head disk: RS11, RS03 and RS04.
              Cartridge-type disk: RK05.
              Phototypesetter: Graphic Systems System/1 through DR11C.

□ BOOT    Procedures to get UNIX/32V started.

* VAX is a Trademark of Digital Equipment Corporation.

1.2. User Access Control

□ LOGIN   Sign on as a new user.
          ○ Verify password and establish user’s individual and group (project)
            identity.
          ○ Adapt to characteristics of terminal.
          ○ Establish working directory.
          ○ Announce presence of mail (from MAIL).
          ○ Publish message of the day.
          ○ Execute user-specified profile.
          ○ Start command interpreter or other initial program.

□ PASSWD  Change a password.
          ○ User can change his own password.
          ○ Passwords are kept encrypted for security.

□ NEWGRP  Change working group (project). Protects against unauthorized
          changes to projects.

1.3. Terminal Handling

□ TABS    Set tab stops appropriately for specified terminal type.

□ STTY    Set up options for optimal control of a terminal. Insofar as they
          are deducible from the input, these options are set automatically by
          LOGIN.
          ○ Half vs. full duplex.
          ○ Carriage return+line feed vs. newline.
          ○ Interpretation of tabs.
          ○ Parity.
          ○ Mapping of upper case to lower.
          ○ Raw vs. edited input.
          ○ Delays for tabs, newlines and carriage returns.

1.4. File Manipulation

□ CAT     Concatenate one or more files onto standard output. Particularly
          used for unadorned printing, for inserting data into a pipeline, and
          for buffering output that comes in dribs and drabs. Works on any file
          regardless of contents.

□ CP      Copy one file to another, or a set of files to a directory. Works on
          any file regardless of contents.

□ PR      Print files with title, date, and page number on every page.
          ○ Multicolumn output.
          ○ Parallel column merge of several files.

□ LPR     Off-line print. Spools arbitrary files to the line printer.

□ CMP     Compare two files and report if different.

□ TAIL    Print last n lines of input.
          ○ May print last n characters, or from n lines or characters to end.

□ SPLIT   Split a large file into more manageable pieces. Occasionally
          necessary for editing (ED).

□ DD      Physical file format translator, for exchanging data with foreign
          systems, especially IBM 370’s.

□ SUM     Sum the words of a file.

1.5. Manipulation of Directories and File Names

□ RM      Remove a file. Only the name goes away if any other names are linked
          to the file.
          ○ Step through a directory deleting files interactively.
          ○ Delete entire directory hierarchies.

□ LN      “Link” another name (alias) to an existing file.

□ MV      Move a file or files. Used for renaming files.

□ CHMOD   Change permissions on one or more files. Executable by files’ owner.

□ CHOWN   Change owner of one or more files.

□ CHGRP   Change group (project) to which a file belongs.

□ MKDIR   Make a new directory.

□ RMDIR   Remove a directory.

□ CD      Change working directory.

□ FIND    Prowl the directory hierarchy finding every file that meets
          specified criteria.
          ○ Criteria include: name matches a given pattern, creation date in
            given range, date of last use in given range, given permissions,
            given owner, given special file characteristics, boolean
            combinations of above.
          ○ Any directory may be considered to be the root.
          ○ Perform specified command on each file found.

1.6. Running of Programs

□ SH      The Shell, or command language interpreter.
          ○ Supply arguments to and run any executable program.
          ○ Redirect standard input, standard output, and standard error files.
          ○ Pipes: simultaneous execution with output of one process connected
            to the input of another.
          ○ Compose compound commands using: if ... then ... else, case
            switches, while loops, for loops over lists, break, continue and
            exit, parentheses for grouping.
          ○ Initiate background processes.
          ○ Perform Shell programs, i.e., command scripts with substitutable
            arguments.
          ○ Construct argument lists from all file names satisfying specified
            patterns.
          ○ Take special action on traps and interrupts.
          ○ User-settable search path for finding commands.
          ○ Executes user-settable profile upon login.
          ○ Optionally announces presence of mail as it arrives.
          ○ Provides variables and parameters with default setting.

□ TEST    Tests for use in Shell conditionals.
          ○ String comparison.
          ○ File nature and accessibility.
          ○ Boolean combinations of the above.

□ EXPR    String computations for calculating command arguments.
          ○ Integer arithmetic.
          ○ Pattern matching.

□ WAIT    Wait for termination of asynchronously running processes.

□ READ    Read a line from terminal, for interactive Shell procedure.

□ ECHO    Print remainder of command line. Useful for diagnostics or prompts
          in Shell programs, or for inserting data into a pipeline.

□ SLEEP   Suspend execution for a specified time.

□ NOHUP   Run a command immune to hanging up the terminal.

□ NICE    Run a command in low (or high) priority.

□ KILL    Terminate named processes.

□ CRON    Schedule regular actions at specified times.
          ○ Actions are arbitrary programs.
          ○ Times are conjunctions of month, day of month, day of week, hour
            and minute. Ranges are specifiable for each.

□ AT      Schedule a one-shot action for an arbitrary time.

□ TEE     Pass data between processes and divert a copy into one or more
          files.

1.7. Status Inquiries

□ LS      List the names of one, several, or all files in one or more
          directories.
          ○ Alphabetic or temporal sorting, up or down.
          ○ Optional information: size, owner, group, date last modified, date
            last accessed, permissions, i-node number.

□ FILE    Try to determine what kind of information is in a file by consulting
          the file system index and by reading the file itself.

□ DATE    Print today’s date and time. Has considerable knowledge of calendric
          and horological peculiarities.
          ○ May set UNIX/32V’s idea of date and time.

□ DF      Report amount of free space on file system devices.

□ DU      Print a summary of total space occupied by all files in a hierarchy.

□ QUOT    Print summary of file space usage by user id.

□ WHO     Tell who’s on the system.
          ○ List of presently logged in users, ports and times on.
          ○ Optional history of all logins and logouts.

□ PS      Report on active processes.
          ○ List your own or everybody’s processes.
          ○ Tell what commands are being executed.
          ○ Optional status information: state and scheduling info, priority,
            attached terminal, what it’s waiting for, size.

□ IOSTAT  Print statistics about system I/O activity.

□ TTY     Print name of your terminal.

□ PWD     Print name of your working directory.

1.8. Backup and Maintenance

□ MOUNT   Attach a device containing a file system to the tree of directories.
          Protects against nonsense arrangements.

□ UMOUNT  Remove the file system contained on a device from the tree of
          directories. Protects against removing a busy device.

□ MKFS    Make a new file system on a device.

□ MKNOD   Make an i-node (file system entry) for a special file. Special files
          are physical devices, virtual devices, physical memory, etc.

□ TP
□ TAR     Manage file archives on magnetic tape or DECtape. TAR is newer.
          ○ Collect files into an archive.
          ○ Update DECtape archive by date.
          ○ Replace or delete DECtape files.
          ○ Print table of contents.
          ○ Retrieve from archive.
□ DUMP  Dump the file system stored on a specified device, selectively by date, or indiscriminately.
□ RESTOR  Restore a dumped file system, or selectively retrieve parts thereof.
□ SU  Temporarily become the super user with all the rights and privileges thereof. Requires a password.
□ DCHECK
□ ICHECK
□ NCHECK  Check consistency of file system.
  ○ Print gross statistics: number of files, number of directories, number of special files, space used, space free.
  ○ Report duplicate use of space.
  ○ Retrieve lost space.
  ○ Report inaccessible files.
  ○ Check consistency of directories.
  ○ List names of all files.
□ CLRI  Peremptorily expunge a file and its space from a file system. Used to repair damaged file systems.
□ SYNC  Force all outstanding I/O on the system to completion. Used to shut down gracefully.

1.9. Accounting

The timing information on which the reports are based can be manually cleared or shut off completely.

□ AC  Publish cumulative connect time report.
  ○ Connect time by user or by day.
  ○ For all users or for selected users.
□ SA  Publish Shell accounting report. Gives usage information on each command executed.
  ○ Number of times used.
  ○ Total system time, user time and elapsed time.
  ○ Optional averages and percentages.
  ○ Sorting on various fields.

1.10. Communication

□ MAIL  Mail a message to one or more users. Also used to read and dispose of incoming mail. The presence of mail is announced by LOGIN and optionally by SH.
  ○ Each message can be disposed of individually.
  ○ Messages can be saved in files or forwarded.
□ CALENDAR  Automatic reminder service for events of today and tomorrow.
□ WRITE  Establish direct terminal communication with another user.
□ WALL  Write to all users.
□ MESG  Inhibit receipt of messages from WRITE and WALL.
□ CU  Call up another time-sharing system.
  ○ Transparent interface to remote machine.
  ○ File transmission.
  ○ Take remote input from local file or put remote output into local file.
  ○ Remote system need not be UNIX/32V.
□ UUCP  UNIX to UNIX copy.
  ○ Automatic queuing until line becomes available and remote machine is up.
  ○ Copy between two remote machines.
  ○ Differences, mail, etc., between two machines.

1.11. Basic Program Development Tools

Some of these utilities are used as integral parts of the higher level languages described in section 2.

□ AR  Maintain archives and libraries. Combines several files into one for housekeeping efficiency.
  ○ Create new archive.
  ○ Update archive by date.
  ○ Replace or delete files.
  ○ Print table of contents.
  ○ Retrieve from archive.
□ AS  Assembler.
  ○ Creates object program consisting of
      code, normally read-only and sharable,
      initialized data or read-write code,
      uninitialized data.
  ○ Relocatable object code is directly executable without further transformation.
  ○ Object code normally includes a symbol table.
  ○ “Conditional jump” instructions become branches or branches plus jumps depending on distance.
□ Library  The basic run-time library. These routines are used freely by all software.
  ○ Buffered character-by-character I/O.
  ○ Formatted input and output conversion (SCANF and PRINTF) for standard input and output, files, in-memory conversion.
  ○ Storage allocator.
  ○ Time conversions.
  ○ Number conversions.
  ○ Password encryption.
  ○ Quicksort.
  ○ Random number generator.
  ○ Mathematical function library, including trigonometric functions and inverses, exponential, logarithm, square root, Bessel functions.
□ ADB  Interactive debugger.
  ○ Postmortem dumping.
  ○ Examination of arbitrary files, with no limit on size.
  ○ Interactive breakpoint debugging with the debugger as a separate process.
  ○ Symbolic reference to local and global variables.
  ○ Stack trace for C programs.
  ○ Output formats:
      1-, 2-, or 4-byte integers in octal, decimal, or hex
      single and double floating point
      character and string
      disassembled machine instructions
  ○ Patching.
  ○ Searching for integer, character, or floating patterns.
□ OD  Dump any file. Output options include any combination of octal or decimal or hex by words, octal by bytes, ASCII, opcodes, hexadecimal.
  ○ Range of dumping is controllable.
□ LD  Link edit. Combine relocatable object files. Insert required routines from specified libraries.
  ○ Resulting code is sharable by default.
□ LORDER  Places object file names in proper order for loading, so that files depending on others come after them.
□ NM  Print the namelist (symbol table) of an object program. Provides control over the style and order of names that are printed.
□ SIZE  Report the memory requirements of one or more object files.
□ STRIP  Remove the relocation and symbol table information from an object file to save space.
□ TIME  Run a command and report timing information on it.
□ PROF  Construct a profile of time spent per routine from statistics gathered by time-sampling the execution of a program.
  ○ Subroutine call frequency and average times for C programs.
□ MAKE  Controls creation of large programs. Uses a control file specifying source file dependencies to make new version; uses time last changed to deduce minimum amount of work necessary.
  ○ Knows about CC, YACC, LEX, etc.

1.12. UNIX/32V Programmer’s Manual

□ Manual  Machine-readable version of the UNIX/32V Programmer’s Manual.
  ○ System overview.
  ○ All commands.
  ○ All system calls.
  ○ All subroutines in C and assembler libraries.
  ○ All devices and other special files.
  ○ Formats of file system and kinds of files known to system software.
  ○ Boot and maintenance procedures.
□ MAN  Print specified manual section on your terminal.

1.13. Computer-Aided Instruction

□ LEARN  A program for interpreting CAI scripts, plus scripts for learning about UNIX/32V by using it.
  ○ Scripts for basic files and commands, editor, advanced files and commands, EQN, MS macros, C programming language.

2. Languages

2.1. The C Language

□ CC  Compile and/or link edit programs in the C language. The UNIX/32V operating system, most of the subsystems and C itself are written in C. For a full description of C, read The C Programming Language, Brian W. Kernighan and Dennis M. Ritchie, Prentice-Hall, 1978.
  ○ General purpose language designed for structured programming.
  ○ Data types include character, integer, float, double, pointers to all types, functions returning above types, arrays of all types, structures and unions of all types.
  ○ Operations intended to give machine-independent control of full machine facility, including to-memory operations and pointer arithmetic.
  ○ Macro preprocessor for parameterized code and inclusion of standard files.
  ○ All procedures recursive, with parameters by value.
  ○ Machine-independent pointer manipulation.
  ○ Object code uses full addressing capability of the VAX-11.
  ○ Runtime library gives access to all system facilities.
  ○ Definable data types.
  ○ Block structure.
□ LINT  Verifier for C programs. Reports questionable or nonportable usage such as:
      Mismatched data declarations and procedure interfaces.
      Nonportable type conversions.
      Unused variables, unreachable code, no-effect operations.
      Mistyped pointers.
      Obsolete syntax.
  ○ Full cross-module checking of separately compiled programs.
□ CB  A beautifier for C programs. Does proper indentation and placement of braces.

2.2. Fortran

□ F77  A full compiler for ANSI Standard Fortran 77.
  ○ Compatible with C and supporting tools at object level.
  ○ Optional source compatibility with Fortran 66.
  ○ Free format source.
  ○ Optional subscript-range checking, detection of uninitialized variables.
  ○ All widths of arithmetic: 2- and 4-byte integer; 4- and 8-byte real; 8- and 16-byte complex.
□ RATFOR  Ratfor adds rational control structure a la C to Fortran.
  ○ Compound statements.
  ○ If-else, do, for, while, repeat-until, break, next statements.
  ○ Symbolic constants.
  ○ File insertion.
  ○ Free format source.
  ○ Translation of relationals like >, >=.
  ○ Produces genuine Fortran to carry away.
  ○ May be used with F77.
□ STRUCT  Converts ordinary ugly Fortran into structured Fortran (i.e., Ratfor), using statement grouping, if-else, while, for, repeat-until.

2.3. Other Algorithmic Languages

□ DC  Interactive programmable desk calculator. Has named storage locations as well as conventional stack for holding integers or programs.
  ○ Unlimited precision decimal arithmetic.
  ○ Appropriate treatment of decimal fractions.
  ○ Arbitrary input and output radices, in particular binary, octal, decimal and hexadecimal.
  ○ Reverse Polish operators: + - * / remainder, power, square root, load, store, duplicate, clear, print, enter program text, execute.
□ BC  A C-like interactive interface to the desk calculator DC.
  ○ All the capabilities of DC with a high-level syntax.
  ○ Arrays and recursive functions.
  ○ Immediate evaluation of expressions and evaluation of functions upon call.
  ○ Arbitrary precision elementary functions: exp, sin, cos, atan.
  ○ Go-to-less programming.

2.4. Macroprocessing

□ M4  A general purpose macroprocessor.
  ○ Stream-oriented, recognizes macros anywhere in text.
  ○ Syntax fits with functional syntax of most higher-level languages.
  ○ Can evaluate integer arithmetic expressions.

2.5. Compiler-compilers

□ YACC  An LR(1)-based compiler writing system. During execution of resulting parsers, arbitrary C functions may be called to do code generation or semantic actions.
  ○ BNF syntax specifications.
  ○ Precedence relations.
  ○ Accepts formally ambiguous grammars with non-BNF resolution rules.
□ LEX  Generator of lexical analyzers. Arbitrary C functions may be called upon isolation of each lexical token.
  ○ Full regular expression, plus left and right context dependence.
  ○ Resulting lexical analysers interface cleanly with YACC parsers.

3. Text Processing

3.1. Document Preparation

□ ED  Interactive context editor.
Random access to all lines of a file.
  ○ Find lines by number or pattern. Patterns may include: specified characters, don’t care characters, choices among characters, repetitions of these constructs, beginning of line, end of line.
  ○ Add, delete, change, copy, move or join lines.
  ○ Permute or split contents of a line.
  ○ Replace one or all instances of a pattern within a line.
  ○ Combine or split files.
  ○ Escape to Shell (command language) during editing.
  ○ Do any of above operations on every pattern-selected line in a given range.
  ○ Optional encryption for extra security.
□ PTX  Make a permuted (key word in context) index.
□ SPELL  Look for spelling errors by comparing each word in a document against a word list.
  ○ 25,000-word list includes proper names.
  ○ Handles common prefixes and suffixes.
  ○ Collects words to help tailor local spelling lists.
□ LOOK  Search for words in dictionary that begin with specified prefix.
□ CRYPT  Encrypt and decrypt files for security.

3.2. Document Formatting

□ TROFF
□ NROFF  Advanced typesetting. TROFF drives a Graphic Systems phototypesetter; NROFF drives ASCII terminals of all types. This summary was typeset using TROFF. TROFF and NROFF are capable of elaborate feats of formatting, when appropriately programmed. TROFF and NROFF accept the same input language.
  ○ Completely definable page format keyed to dynamically planted “interrupts” at specified lines.
  ○ Maintains several separately definable typesetting environments (e.g., one for body text, one for footnotes, and one for unusually elaborate headings).
  ○ Arbitrary number of output pools can be combined at will.
  ○ Macros with substitutable arguments, and macros invocable in mid-line.
  ○ Computation and printing of numerical quantities.
  ○ Conditional execution of macros.
  ○ Tabular layout facility.
  ○ Positions expressible in inches, centimeters, ems, points, machine units or arithmetic combinations thereof.
  ○ Access to character-width computation for unusually difficult layout problems.
  ○ Overstrikes, built-up brackets, horizontal and vertical line drawing.
  ○ Dynamic relative or absolute positioning and size selection, globally or at the character level.
  ○ Can exploit the characteristics of the terminal being used, for approximating special characters, reverse motions, proportional spacing, etc.

The Graphic Systems typesetter has a vocabulary of several 102-character fonts (4 simultaneously) in 15 sizes. TROFF provides terminal output for rough sampling of the product.

NROFF will produce multicolumn output on terminals capable of reverse line feed, or through the postprocessor COL.

High programming skill is required to exploit the formatting capabilities of TROFF and NROFF, although unskilled personnel can easily be trained to enter documents according to canned formats such as those provided by MS, below. TROFF and EQN are essentially identical to NROFF and NEQN so it is usually possible to define interchangeable formats to produce approximate proof copy on terminals before actual typesetting. The preprocessors MS, TBL, and REFER are fully compatible with TROFF and NROFF.

□ MS  A standardized manuscript layout package for use with NROFF/TROFF. This document was formatted with MS.
  ○ Page numbers and draft dates.
  ○ Automatically numbered subheads.
  ○ Footnotes.
  ○ Single or double column.
  ○ Paragraphing, display and indentation.
  ○ Numbered equations.
□ EQN  A mathematical typesetting preprocessor for TROFF. Translates easily readable formulas, either in-line or displayed, into detailed typesetting instructions. Formulas are written in a style like this:

      sigma sup 2 ~=~ 1 over N sum from i=1 to N ( x sub i - x bar ) sup 2

  which produces:

      $$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2$$

  ○ Automatic calculation of size changes for subscripts, sub-subscripts, etc.
  ○ Full vocabulary of Greek letters and special symbols, such as ‘gamma’, ‘GAMMA’, ‘integral’.
  ○ Automatic calculation of large bracket sizes.
  ○ Vertical “piling” of formulae for matrices, conditional alternatives, etc.
  ○ Integrals, sums, etc., with arbitrarily complex limits.
  ○ Diacriticals: dots, double dots, hats, bars, etc.
  ○ Easily learned by nonprogrammers and mathematical typists.
□ NEQN  A version of EQN for NROFF; accepts the same input language. Prepares formulas for display on any terminal that NROFF knows about, for example, those based on Diablo printing mechanism.
  ○ Same facilities as EQN within graphical capability of terminal.
□ TBL  A preprocessor for NROFF/TROFF that translates simple descriptions of table layouts and contents into detailed typesetting instructions.
  ○ Computes column widths.
  ○ Handles left- and right-justified columns, centered columns and decimal-point alignment.
  ○ Places column titles.
  ○ Table entries can be text, which is adjusted to fit.
  ○ Can box all or parts of table.
□ REFER  Fills in bibliographic citations in a document from a data base (not supplied).
  ○ References may be printed in any style, as they occur or collected at the end.
  ○ May be numbered sequentially, by name of author, etc.
□ TC  Simulate Graphic Systems typesetter on Tektronix 4014 scope. Useful for checking TROFF page layout before typesetting.
□ COL  Canonicalize files with reverse line feeds for one-pass printing.
□ DEROFF  Remove all TROFF commands from input.
□ CHECKEQ  Check document for possible errors in EQN usage.

4. Information Handling

□ SORT  Sort or merge ASCII files line-by-line. No limit on input size.
  ○ Sort up or down.
  ○ Sort lexicographically or on numeric key.
  ○ Multiple keys located by delimiters or by character position.
  ○ May sort upper case together with lower into dictionary order.
  ○ Optionally suppress duplicate data.
□ TSORT  Topological sort: converts a partial order into a total order.
□ UNIQ  Collapse successive duplicate lines in a file into one line.
  ○ Publish lines that were originally unique, duplicated, or both.
  ○ May give redundancy count for each line.
□ TR  Do one-to-one character translation according to an arbitrary code.
  ○ May coalesce selected repeated characters.
  ○ May delete selected characters.
□ DIFF  Report line changes, additions and deletions necessary to bring two files into agreement.
  ○ May produce an editor script to convert one file into another.
  ○ A variant compares two new versions against one old one.
□ COMM  Identify common lines in two sorted files. Output in up to 3 columns shows lines present in first file only, present in both, and/or present in second only.
□ JOIN  Combine two files by joining records that have identical keys.
□ GREP  Print all lines in a file that satisfy a pattern as used in the editor ED.
  ○ May print all lines that fail to match.
  ○ May print count of hits.
  ○ May print first hit in each file.
□ LOOK  Binary search in sorted file for lines with specified prefix.
□ WC  Count the lines, “words” (blank-separated strings) and characters in a file.
□ SED  Stream-oriented version of ED. Can perform a sequence of editing operations on each line of an input stream of unbounded length.
  ○ Lines may be selected by address or range of addresses.
  ○ Control flow and conditional testing.
  ○ Multiple output streams.
  ○ Multi-line capability.
□ AWK  Pattern scanning and processing language. Searches input for patterns, and performs actions on each line of input that satisfies the pattern.
  ○ Patterns include regular expressions, arithmetic and lexicographic conditions, boolean combinations and ranges of these.
  ○ Data treated as string or numeric as appropriate.
  ○ Can break input into fields; fields are variables.
  ○ Variables and arrays (with non-numeric subscripts).
  ○ Full set of arithmetic operators and control flow.
  ○ Multiple output streams to files and pipes.
  ○ Output can be formatted as desired.
  ○ Multi-line capabilities.

5. Graphics

The programs in this section are predominantly intended for use with Tektronix 4014 storage scopes.

□ GRAPH  Prepares a graph of a set of input numbers.
  ○ Input scaled to fit standard plotting area.
  ○ Abscissae may be supplied automatically.
  ○ Graph may be labeled.
  ○ Control over grid style, line style, graph orientation, etc.
□ SPLINE  Provides a smooth curve through a set of points intended for GRAPH.
□ PLOT  A set of filters for printing graphs produced by GRAPH and other programs on various terminals. Filters provided for 4014, DASI terminals, Versatec printer/plotter.

6. Novelties, Games, and Things That Didn’t Fit Anywhere Else

□ BACKGAMMON  A player of modest accomplishment.
□ BCD  Converts ASCII to card-image form.
□ CAL  Print a calendar of specified month and year.
□ CHING  The I Ching. Place your own interpretation on the output.
□ FORTUNE  Presents a random fortune cookie on each invocation. Limited jar of cookies included.
□ UNITS  Convert amounts between different scales of measurement. Knows hundreds of units. For example, how many km/sec is a parsec/megayear?
□ ARITHMETIC  Speed and accuracy test for number facts.
□ QUIZ  Test your knowledge of Shakespeare, Presidents, capitals, etc.
□ WUMP  Hunt the wumpus, thrilling search in a dangerous cave.
□ HANGMAN  Word-guessing game. Uses a dictionary supplied with SPELL.
□ FISH  Children’s card-guessing game.

The UNIX Time-Sharing System*

D. M. Ritchie and K. Thompson

ABSTRACT

UNIX† is a general-purpose, multi-user, interactive operating system for the larger Digital Equipment Corporation PDP-11 and the Interdata 8/32 computers.
It offers a number of features seldom found even in larger operating systems, including
  (i) A hierarchical file system incorporating demountable volumes,
  (ii) Compatible file, device, and inter-process I/O,
  (iii) The ability to initiate asynchronous processes,
  (iv) System command language selectable on a per-user basis,
  (v) Over 100 subsystems including a dozen languages,
  (vi) High degree of portability.
This paper discusses the nature and implementation of the file system and of the user command interface.

I. INTRODUCTION

There have been four versions of the UNIX time-sharing system. The earliest (circa 1969-70) ran on the Digital Equipment Corporation PDP-7 and -9 computers. The second version ran on the unprotected PDP-11/20 computer. The third incorporated multiprogramming and ran on the PDP-11/34, /40, /45, /60, and /70 computers; it is the one described in the previously published version of this paper, and is also the most widely used today. This paper describes only the fourth, current system that runs on the PDP-11/70 and the Interdata 8/32 computers. In fact, the differences among the various systems are rather small; most of the revisions made to the originally published version of this paper, aside from those concerned with style, had to do with details of the implementation of the file system.

Since PDP-11 UNIX became operational in February, 1971, over 600 installations have been put into service. Most of them are engaged in applications such as computer science education, the preparation and formatting of documents and other textual material, the collection and processing of trouble data from various switching machines within the Bell System, and recording and checking telephone service orders. Our own installation is used mainly for research in operating systems, languages, computer networks, and other topics in computer science, and also for document preparation.
Perhaps the most important achievement of UNIX is to demonstrate that a powerful operating system for interactive use need not be expensive either in equipment or in human effort: it can run on hardware costing as little as $40,000, and less than two man-years were spent on the main system software. We hope, however, that users find that the most important characteristics of the system are its simplicity, elegance, and ease of use.

* Copyright 1974, Association for Computing Machinery, Inc., reprinted by permission. This is a revised version of an article that appeared in Communications of the ACM, 17, No. 7 (July 1974), pp. 365-375. That article was a revised version of a paper presented at the Fourth ACM Symposium on Operating Systems Principles, IBM Thomas J. Watson Research Center, Yorktown Heights, New York, October 15-17, 1973.
† UNIX is a trademark of Bell Laboratories.

Besides the operating system proper, some major programs available under UNIX are
      C compiler
      Text editor based on QED¹
      Assembler, linking loader, symbolic debugger
      Phototypesetting and equation setting programs²,³
      Dozens of languages including Fortran 77, Basic, Snobol, APL, Algol 68, M6, TMG, Pascal
There is a host of maintenance, utility, recreation and novelty programs, all written locally. The UNIX user community, which numbers in the thousands, has contributed many more programs and languages. It is worth noting that the system is totally self-supporting. All UNIX software is maintained on the system; likewise, this paper and all other documents in this issue were generated and formatted by the UNIX editor and text formatting programs.

II. HARDWARE AND SOFTWARE ENVIRONMENT

The PDP-11/70 on which the Research UNIX system is installed is a 16-bit word (8-bit byte) computer with 768K bytes of core memory; the system kernel occupies 90K bytes about equally divided between code and data tables.
This system, however, includes a very large number of device drivers and enjoys a generous allotment of space for I/O buffers and system tables; a minimal system capable of running the software mentioned above can require as little as 96K bytes of core altogether. There are even larger installations; see the description of the PWB/UNIX systems,⁴,⁵ for example. There are also much smaller, though somewhat restricted, versions of the system.⁶

Our own PDP-11 has two 200-Mb moving-head disks for file system storage and swapping. There are 20 variable-speed communications interfaces attached to 300- and 1200-baud data sets, and an additional 12 communication lines hard-wired to 9600-baud terminals and satellite computers. There are also several 2400- and 4800-baud synchronous communication interfaces used for machine-to-machine file transfer. Finally, there is a variety of miscellaneous devices including nine-track magnetic tape, a line printer, a voice synthesizer, a phototypesetter, a digital switching network, and a chess machine.

The preponderance of UNIX software is written in the abovementioned C language.⁷ Early versions of the operating system were written in assembly language, but during the summer of 1973, it was rewritten in C. The size of the new system was about one-third greater than that of the old. Since the new system not only became much easier to understand and to modify but also included many functional improvements, including multiprogramming and the ability to share reentrant code among several user programs, we consider this increase in size quite acceptable.

III. THE FILE SYSTEM

The most important role of the system is to provide a file system. From the point of view of the user, there are three kinds of files: ordinary disk files, directories, and special files.

3.1 Ordinary files

A file contains whatever information the user places on it, for example, symbolic or binary (object) programs. No particular structuring is expected by the system.
A file of text consists simply of a string of characters, with lines demarcated by the newline character. Binary programs are sequences of words as they will appear in core memory when the program starts executing. A few user programs manipulate files with more structure; for example, the assembler generates, and the loader expects, an object file in a particular format. However, the structure of files is controlled by the programs that use them, not by the system.

3.6 I/O calls

The system calls to do I/O are designed to eliminate the differences between the various devices and styles of access. There is no distinction between “random” and “sequential” I/O, nor is any logical record size imposed by the system. The size of an ordinary file is determined by the number of bytes written on it; no predetermination of the size of a file is necessary or possible.

To illustrate the essentials of I/O, some of the basic calls are summarized below in an anonymous language that will indicate the required parameters without getting into the underlying complexities. Each call to the system may potentially result in an error return, which for simplicity is not represented in the calling sequence.

To read or write a file assumed to exist already, it must be opened by the following call:

      filep = open (name, flag)

where name indicates the name of the file. An arbitrary path name may be given. The flag argument indicates whether the file is to be read, written, or “updated,” that is, read and written simultaneously.

The returned value filep is called a file descriptor. It is a small integer used to identify the file in subsequent calls to read, write, or otherwise manipulate the file.

To create a new file or completely rewrite an old one, there is a create system call that creates the given file if it does not exist, or truncates it to zero length if it does exist; create also opens the new file for writing and, like open, returns a file descriptor.
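The two calls just described survive in present-day POSIX systems, so they can be sketched in C. This is a minimal sketch under stated assumptions, not the paper's own code: the spelling `creat`, the `O_RDONLY` flag name, and the permission argument `0644` come from POSIX rather than from the text above, and the helper names `make_empty_file` and `open_for_reading` are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Create the named file if it does not exist, or truncate it to zero
   length if it does. Returns a descriptor open for writing, or -1 on
   error. creat() plays the role of the paper's "create" call; the mode
   0644 is an assumption of this sketch. */
int make_empty_file(const char *name)
{
    assert(name != NULL);
    return creat(name, 0644);
}

/* Open a file that is assumed to exist already, for reading only.
   Returns a small non-negative integer (the file descriptor), or -1 if
   the file cannot be opened. */
int open_for_reading(const char *name)
{
    assert(name != NULL);
    return open(name, O_RDONLY);
}
```

A caller would keep the returned descriptor and hand it to later I/O calls, then release it with `close`. Calling `make_empty_file` a second time on the same name discards the old contents, which matches the "completely rewrite an old one" behaviour described above.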
The file system maintains no locks visible to the user, nor is there any restriction on the number of users who may have a file open for reading or writing. Although it is possible for the contents of a file to become scrambled when two users write on it simultaneously, in practice difficulties do not arise. We take the view that locks are neither necessary nor sufficient, in our environment, to prevent interference between users of the same file. They are unnecessary because we are not faced with large, single-file data bases maintained by independent processes. They are insufficient because locks in the ordinary sense, whereby one user is prevented from writing on a file that another user is reading, cannot prevent confusion when, for example, both users are editing a file with an editor that makes a copy of the file being edited.

There are, however, sufficient internal interlocks to maintain the logical consistency of the file system when two users engage simultaneously in activities such as writing on the same file, creating files in the same directory, or deleting each other’s open files.

Except as indicated below, reading and writing are sequential. This means that if a particular byte in the file was the last byte written (or read), the next I/O call implicitly refers to the immediately following byte. For each open file there is a pointer, maintained inside the system, that indicates the next byte to be read or written. If n bytes are read or written, the pointer advances by n bytes.

Once a file is open, the following calls may be used:

      n = read (filep, buffer, count)
      n = write (filep, buffer, count)

Up to count bytes are transmitted between the file specified by filep and the byte array specified by buffer. The returned value n is the number of bytes actually transmitted.
In the write case, n is the same as count except under exceptional conditions, such as I/O errors or end of physical medium on special files; in a read, however, n may without error be less than count. If the read pointer is so near the end of the file that reading count characters would cause reading beyond the end, only sufficient bytes are transmitted to reach the end of the file; also, typewriter-like terminals never return more than one line of input. When a read call returns with n equal to zero, the end of the file has been reached. For disk files this occurs when the read pointer becomes equal to the current size of the file. It is possible to generate an end-of-file from a terminal by use of an escape sequence that depends on the device used.

Bytes written affect only those parts of a file implied by the position of the write pointer and the count; no other part of the file is changed. If the last byte lies beyond the end of the file, the file is made to grow as needed.

To do random (direct-access) I/O it is only necessary to move the read or write pointer to the appropriate location in the file.

    location = lseek (filep, offset, base)

The pointer associated with filep is moved to a position offset bytes from the beginning of the file, from the current position of the pointer, or from the end of the file, depending on base. offset may be negative. For some devices (e.g., paper tape and terminals) seek calls are ignored. The actual offset from the beginning of the file to which the pointer was moved is returned in location.

There are several additional system entries having to do with I/O and with the file system that will not be discussed. For example: close a file, get the status of a file, change the protection mode or the owner of a file, create a directory, make a link to an existing file, delete a file.

IV. IMPLEMENTATION OF THE FILE SYSTEM

As mentioned in Section 3.2 above, a directory entry contains only a name for the associated file and a pointer to the file itself. This pointer is an integer called the i-number (for index number) of the file. When the file is accessed, its i-number is used as an index into a system table (the i-list) stored in a known part of the device on which the directory resides. The entry found thereby (the file’s i-node) contains the description of the file:

    i    the user and group-ID of its owner
    ii   its protection bits
    iii  the physical disk or tape addresses for the file contents
    iv   its size
    v    time of creation, last use, and last modification
    vi   the number of links to the file, that is, the number of times it appears in a directory
    vii  a code indicating whether the file is a directory, an ordinary file, or a special file.

The purpose of an open or create system call is to turn the path name given by the user into an i-number by searching the explicitly or implicitly named directories. Once a file is open, its device, i-number, and read/write pointer are stored in a system table indexed by the file descriptor returned by the open or create. Thus, during a subsequent call to read or write the file, the descriptor may be easily related to the information necessary to access the file.

When a new file is created, an i-node is allocated for it and a directory entry is made that contains the name of the file and the i-node number. Making a link to an existing file involves creating a directory entry with the new name, copying the i-number from the original file entry, and incrementing the link-count field of the i-node. Removing (deleting) a file is done by decrementing the link-count of the i-node specified by its directory entry and erasing the directory entry. If the link-count drops to 0, any disk blocks in the file are freed and the i-node is de-allocated.
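The link-count bookkeeping described above can be observed from a user program. The sketch below assumes the POSIX stat call, whose st_ino and st_nlink fields expose the i-number and the link count; the helper names link_count and same_file are hypothetical.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the number of directory entries (links) that refer to the file
 * named by `path`, or -1 if its status cannot be obtained. */
long link_count(const char *path)
{
    struct stat st;
    if (stat(path, &st) < 0)
        return -1;
    return (long)st.st_nlink;
}

/* Return 1 if two path names lead to the same i-node on the same device,
 * 0 if they do not, and -1 on error. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;
    if (stat(a, &sa) < 0 || stat(b, &sb) < 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

After link("x", "y") succeeds on an ordinary file x with one name, link_count("x") should report 2 and same_file("x", "y") should report 1; removing either name with unlink brings the count back to 1.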
The space on all disks that contain a file system is divided into a number of 512-byte blocks logically addressed from 0 up to a limit that depends on the device. There is space in the i-node of each file for 13 device addresses. For nonspecial files, the first 10 device addresses point at the first 10 blocks of the file. If the file is larger than 10 blocks, the eleventh device address points to an indirect block containing up to 128 addresses of additional blocks in the file. Still larger files use the twelfth device address of the i-node to point to a double-

[...]

charged to the second user. The simplest reasonably fair algorithm seems to be to spread the charges equally among users who have links to a file. Many installations avoid the issue by not charging any fees at all.

An image is a computer execution environment. It includes a memory image, general register values, status of open files, current directory and the like. An image is the current state of a pseudo-computer.

A process is the execution of an image. While the processor is executing on behalf of a process, the image must reside in main memory; during the execution of other processes it remains in main memory unless the appearance of an active, higher-priority process forces it to be swapped out to the disk.

The user-memory part of an image is divided into three logical segments. The program text segment begins at location 0 in the virtual address space. During execution, this segment is write-protected and a single copy of it is shared among all processes executing the same program. At the first hardware protection byte boundary above the program text segment in the virtual address space begins a non-shared, writable data segment, the size of which may be extended by a system call. Starting at the highest address in the virtual address space is a stack segment, which automatically grows downward as the stack pointer fluctuates.
5.1 Processes

Except while the system is bootstrapping itself into operation, a new process can come into existence only by use of the fork system call:

    processid = fork ()

When fork is executed, the process splits into two independently executing processes. The two processes have independent copies of the original memory image, and share all open files. The new processes differ only in that one is considered the parent process: in the parent, the returned processid actually identifies the child process and is never 0, while in the child, the returned value is always 0.

Because the values returned by fork in the parent and child process are distinguishable, each process may determine whether it is the parent or child.

5.2 Pipes

Processes may communicate with related processes using the same system read and write calls that are used for file-system I/O. The call:

    filep = pipe ()

returns a file descriptor filep and creates an inter-process channel called a pipe. This channel, like other open files, is passed from parent to child process in the image by the fork call. A read using a pipe file descriptor waits until another process writes using the file descriptor for the same pipe. At this point, data are passed between the images of the two processes. Neither process need know that a pipe, rather than an ordinary file, is involved.

Although inter-process communication via pipes is a quite valuable tool (see Section 6.2), it is not a completely general mechanism, because the pipe must be set up by a common ancestor of the processes involved.

5.3 Execution of programs

Another major system primitive is invoked by

    execute (file, arg1, arg2, ..., argn)

which requests the system to read in and execute the program named by file, passing it string arguments arg1, arg2, ..., argn. All the code and data in the process invoking execute is replaced from the file, but open files, current directory, and inter-process relationships are unaltered.
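The fork and pipe calls of Sections 5.1 and 5.2 can be combined in a few lines of C. This is a sketch under assumptions: it uses the POSIX form of pipe, which fills in two descriptors (one for reading, one for writing) instead of returning the single filep shown above, and the function name receive_from_child is invented for the example.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes `msg` into a pipe; the parent reads up to
 * cap-1 bytes into `out` and NUL-terminates the result.
 * Returns the number of bytes received, or -1 on error. */
long receive_from_child(const char *msg, char *out, size_t cap)
{
    int fds[2];
    if (cap == 0 || pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                        /* child: fork returned 0 */
        close(fds[0]);
        if (write(fds[1], msg, strlen(msg)) < 0)
            _exit(1);
        _exit(0);
    }

    /* parent: fork returned the child's process ID, never 0 */
    close(fds[1]);                         /* so read() can see end of file */
    size_t got = 0;
    ssize_t n;
    while (got < cap - 1 &&
           (n = read(fds[0], out + got, cap - 1 - got)) > 0)
        got += (size_t)n;
    out[got] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);                 /* collect the finished child */
    return (long)got;
}
```

The pipe is created before the fork, which is what the text means when it says a common ancestor must set it up: both processes inherit the descriptors through the copied image.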
Only if the call fails, for example because file could not be found or because its execute-permission bit was not set, does a return take place from the execute primitive; it resembles a “jump” machine instruction rather than a subroutine call.

5.4 Process synchronization

Another process control system call:

    processid = wait (status)

causes its caller to suspend execution until one of its children has completed execution. Then wait returns the processid of the terminated process. An error return is taken if the calling process has no descendants. Certain status from the child process is also available.

5.5 Termination

Lastly:

    exit (status)

terminates a process, destroys its image, closes its open files, and generally obliterates it. The parent is notified through the wait primitive, and status is made available to it. Processes may also terminate as a result of various illegal actions or user-generated signals (Section VII below).

VI. THE SHELL

For most users, communication with the system is carried on with the aid of a program called the shell. The shell is a command-line interpreter: it reads lines typed by the user and interprets them as requests to execute other programs. (The shell is described fully elsewhere, so this section will discuss only the theory of its operation.) In simplest form, a command line consists of the command name followed by arguments to the command, all separated by spaces:

    command arg1 arg2 ... argn

The shell splits up the command name and the arguments into separate strings. Then a file with name command is sought; command may be a path name including the “/” character to specify any file in the system. If command is found, it is brought into memory and executed. The arguments collected by the shell are accessible to the command.
When the command is finished, the shell resumes its own execution, and indicates its readiness to accept another command by typing a prompt character.

If file command cannot be found, the shell generally prefixes a string such as /bin/ to command and attempts again to find the file. Directory /bin contains commands intended to be generally used. (The sequence of directories to be searched may be changed by user request.)

6.1 Standard I/O

The discussion of I/O in Section III seems to imply that every file used by a program must be opened or created by the program in order to get a file descriptor for the file. Programs executed by the shell, however, start off with three open files with file descriptors 0, 1, and 2. As such a program begins execution, file 1 is open for writing, and is best understood as the standard output file. Except under circumstances indicated below, this file is the user’s terminal. Thus programs that wish to write informative information ordinarily use file descriptor 1. Conversely, file 0 starts off open for reading, and programs that wish to read messages typed by the user read this file.

The shell is able to change the standard assignments of these file descriptors from the user’s terminal printer and keyboard. If one of the arguments to a command is prefixed by “>”, file descriptor 1 will, for the duration of the command, refer to the file named after the “>”. For example:

    ls

ordinarily lists, on the typewriter, the names of the files in the current directory. The command:

    ls >there

creates a file called there and places the listing there. Thus the argument >there means “place output on there.” On the other hand:

    ed

ordinarily enters the editor, which takes requests from the user via his keyboard.
The command

    ed <script

interprets script as a file of editor commands; thus <script means “take input from script.”

Although the file name following “<” or “>” appears to be an argument to the command, in fact it is interpreted completely by the shell and is not passed to the command at all. Thus no special coding to handle I/O redirection is needed within each command; the command need merely use the standard file descriptors 0 and 1 where appropriate.

File descriptor 2 is, like file 1, ordinarily associated with the terminal output stream. When an output-diversion request with “>” is specified, file 2 remains attached to the terminal, so that commands may produce diagnostic messages that do not silently end up in the output file.

6.2 Filters

An extension of the standard I/O notion is used to direct output from one command to the input of another. A sequence of commands separated by vertical bars causes the shell to execute all the commands simultaneously and to arrange that the standard output of each command be delivered to the standard input of the next command in the sequence. Thus in the command line:

    ls | pr -2 | opr

ls lists the names of the files in the current directory; its output is passed to pr, which paginates its input with dated headings. (The argument “-2” requests double-column output.) Likewise, the output from pr is input to opr; this command spools its input onto a file for off-line printing.

This procedure could have been carried out more clumsily by:

    ls >temp1
    pr -2 <temp1 >temp2
    opr <temp2

followed by removal of the temporary files. In the absence of the ability to redirect output and input, a still clumsier method would have been to require the ls command to accept user requests to paginate its output, to print in multi-column format, and to arrange that its output be delivered off-line.
Actually it would be surprising, and in fact unwise for efficiency reasons, to expect authors of commands such as ls to provide such a wide variety of output options.

A program such as pr which copies its standard input to its standard output (with processing) is called a filter. Some filters that we have found useful perform character transliteration, selection of lines according to a pattern, sorting of the input, and encryption and decryption.

6.3 Command separators; multitasking

Another feature provided by the shell is relatively straightforward. Commands need not be on different lines; instead they may be separated by semicolons:

    ls; ed

will first list the contents of the current directory, then enter the editor.

A related feature is more interesting. If a command is followed by “&,” the shell will not wait for the command to finish before prompting again; instead, it is ready immediately to accept a new command. For example:

    as source >output &

causes source to be assembled, with diagnostic output going to output; no matter how long the assembly takes, the shell returns immediately. When the shell does not wait for the completion of a command, the identification number of the process running that command is printed. This identification may be used to wait for the completion of the command or to terminate it. The “&” may be used several times in a line:

    as source >output & ls >files &

does both the assembly and the listing in the background. In these examples, an output file other than the terminal was provided; if this had not been done, the outputs of the various commands would have been intermingled.

The shell also allows parentheses in the above operations. For example:

    (date; ls) >x &

writes the current date and time followed by a list of the current directory onto the file x. The shell also returns immediately for another request.
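As a concrete instance of the filters described in Section 6.2, the following C sketch performs a simple character transliteration, turning lower-case letters into upper case. The function name upcase_filter and the buffer size are assumptions made for illustration; the text does not describe any particular filter's code.

```c
#include <ctype.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy descriptor `in` to descriptor `out`, converting lower-case
 * letters to upper case on the way. A program built around the call
 * upcase_filter(0, 1) is a filter: it never opens a file by name.
 * Returns 0 when end of file is reached, -1 on error. */
int upcase_filter(int in, int out)
{
    char buf[512];
    ssize_t n;

    while ((n = read(in, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++)
            buf[i] = (char)toupper((unsigned char)buf[i]);

        ssize_t done = 0;
        while (done < n) {                 /* retry if write() is short */
            ssize_t w = write(out, buf + done, (size_t)(n - done));
            if (w < 0)
                return -1;
            done += w;
        }
    }
    return n < 0 ? -1 : 0;
}
```

Because the function touches only the two descriptors it is given, the same code works whether they refer to a terminal, a file named with “<” or “>”, or a pipe.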
6.4 The shell as a command; command files

The shell is itself a command, and may be called recursively. Suppose file tryout contains the lines:

    as source
    mv a.out testprog
    testprog

The mv command causes the file a.out to be renamed testprog. a.out is the (binary) output of the assembler, ready to be executed. Thus if the three lines above were typed on the keyboard, source would be assembled, the resulting program renamed testprog, and testprog executed. When the lines are in tryout, the command:

    sh <tryout

would cause the shell sh to execute the commands sequentially.

The shell has further capabilities, including the ability to substitute parameters and to construct argument lists from a specified subset of the file names in a directory. It also provides general conditional and looping constructions.

6.5 Implementation of the shell

The outline of the operation of the shell can now be understood. Most of the time, the shell is waiting for the user to type a command. When the newline character ending the line is typed, the shell’s read call returns. The shell analyzes the command line, putting the arguments in a form appropriate for execute. Then fork is called. The child process, whose code of course is still that of the shell, attempts to perform an execute with the appropriate arguments. If successful, this will bring in and start execution of the program whose name was given. Meanwhile, the other process resulting from the fork, which is the parent process, waits for the child process to die. When this happens, the shell knows the command is finished, so it types its prompt and reads the keyboard to obtain another command.

Given this framework, the implementation of background processes is trivial; whenever a command line contains “&,” the shell merely refrains from waiting for the process that it created to execute the command.
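The fork, execute, and wait sequence just outlined can be sketched in C as follows. This is not the shell's actual source: it assumes the POSIX calls fork, execvp, and waitpid, lets execvp do the directory search instead of prefixing /bin/ by hand, and uses an invented function name, run_command, with an invented convention of exit status 127 for a failed execute.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the command argv[0] with the argument strings in argv (terminated
 * by a NULL pointer). If `background` is nonzero, return the child's
 * process ID at once, as for a command followed by "&"; otherwise wait
 * for the child and return its exit status. Returns -1 on error. */
int run_command(char *const argv[], int background)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                        /* child: still the same code */
        execvp(argv[0], argv);             /* returns only if it failed */
        _exit(127);
    }

    if (background)                        /* parent: refrain from waiting */
        return (int)pid;

    int status;
    if (waitpid(pid, &status, 0) < 0)      /* parent: wait for child to die */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The only difference between the foreground and background cases is whether the parent calls waitpid, which is the sense in which the text calls background processes trivial. A caller that uses the background form should eventually wait for the returned process ID.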
Happily, all of this mechanism meshes very nicely with the notion of standard input and output files. When a process is created by the fork primitive, it inherits not only the memory image of its parent but also all the files currently open in its parent, including those with file descriptors 0, 1, and 2. The shell, of course, uses these files to read command lines and to write its prompts and diagnostics, and in the ordinary case its children (the command programs) inherit them automatically. When an argument with “<” or “>” is given, however, the offspring process, just before it performs execute, makes the standard I/O file descriptor (0 or 1, respectively) refer to the named file. This is easy because, by agreement, the smallest unused file descriptor is assigned when a new file is opened (or created); it is only necessary to close file 0 (or 1) and open the named file. Because the process in which the command program runs simply terminates when it is through, the association between a file specified after “<” or “>” and file descriptor 0 or 1 is ended automatically when the process dies. Therefore the shell need not know the actual names of the files that are its own standard input and output, because it need never reopen them.

Filters are straightforward extensions of standard I/O redirection with pipes used instead of files.

In ordinary circumstances, the main loop of the shell never terminates. (The main loop includes the branch of the return from fork belonging to the parent process; that is, the branch that does a wait, then reads another command line.) The one thing that causes the shell to terminate is discovering an end-of-file condition on its input file. Thus, when the shell is executed as a command with a given input file, as in:

    sh <comfile

the commands in comfile will be executed until the end of comfile is reached; then the instance of the shell invoked by sh will terminate.
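The close-then-open trick used for “>” can be written out in C. The sketch below assumes POSIX fork, creat, execvp, and waitpid; the name run_with_output, the mode 0644, and the exit codes 126 and 127 are invented for the example. It also assumes descriptor 0 is open in the calling process, since otherwise the smallest unused descriptor would be 0 rather than 1.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv with its standard output going to the file `path`, in the
 * manner of "command >path". Returns the command's exit status, or -1
 * on error. */
int run_with_output(char *const argv[], const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                        /* child, just before execute */
        close(1);                          /* make descriptor 1 unused */
        if (creat(path, 0644) != 1)        /* smallest unused is now 1 */
            _exit(126);
        execvp(argv[0], argv);             /* the command inherits file 1 */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent's own descriptor 1 is never touched, so nothing has to be restored when the command finishes. Present-day code would more often open the file first and use dup2 to place it on descriptor 1, which avoids depending on which descriptors happen to be free.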
Because this shell process is the child of another instance of the shell, the wait executed in the latter will return, and another command may then be processed.

6.6 Initialization

The instances of the shell to which users type commands are themselves children of another process. The last step in the initialization of the system is the creation of a single process and the invocation (via execute) of a program called init. The role of init is to create one process for each terminal channel. The various subinstances of init open the appropriate terminals for input and output on files 0, 1, and 2, waiting, if necessary, for carrier to be established on dial-up lines. Then a message is typed out requesting that the user log in. When the user types a name or other identification, the appropriate instance of init wakes up, receives the log-in line, and reads a password file. If the user’s name is found, and if he is able to supply the correct password, init changes to the user’s default current directory, sets the process’s user ID to that of the person logging in, and performs an execute of the shell. At this point, the shell is ready to receive commands and the logging-in protocol is complete.

Meanwhile, the mainstream path of init (the parent of all the subinstances of itself that will later become shells) does a wait. If one of the child processes terminates, either because a shell found an end of file or because a user typed an incorrect name or password, this path of init simply recreates the defunct process, which in turn reopens the appropriate input and output files and types another log-in message. Thus a user may log out simply by typing the end-of-file sequence to the shell.

6.7 Other programs as shell

The shell as described above is designed to allow users full access to the facilities of the system, because it will invoke the execution of any program with appropriate protection mode.
Sometimes, however, a different interface to the system is desirable, and this feature is easily arranged for.

Recall that after a user has successfully logged in by supplying a name and password, init ordinarily invokes the shell to interpret command lines. The user’s entry in the password file may contain the name of a program to be invoked after log-in instead of the shell. This program is free to interpret the user’s messages in any way it wishes.

For example, the password file entries for users of a secretarial editing system might specify that the editor ed is to be used instead of the shell. Thus when users of the editing system log in, they are inside the editor and can begin work immediately; also, they can be prevented from invoking programs not intended for their use. In practice, it has proved desirable to allow a temporary escape from the editor to execute the formatting program and other utilities.

Several of the games (e.g., chess, blackjack, 3D tic-tac-toe) available on the system illustrate a much more severely restricted environment. For each of these, an entry exists in the password file specifying that the appropriate game-playing program is to be invoked instead of the shell. People who log in as a player of one of these games find themselves limited to the game and unable to investigate the (presumably more interesting) offerings of the UNIX system as a whole.

VII. TRAPS

The PDP-11 hardware detects a number of program faults, such as references to nonexistent memory, unimplemented instructions, and odd addresses used where an even address is required. Such faults cause the processor to trap to a system routine. Unless other arrangements have been made, an illegal action causes the system to terminate the process and to write its image on file core in the current directory. A debugger can be used to determine the state of the program at the time of the fault.
Programs that are looping, that produce unwanted output, or about which the user has second thoughts may be halted by the use of the interrupt signal, which is generated by typing the “delete” character. Unless special action has been taken, this signal simply causes the program to cease execution without producing a core file. There is also a quit signal used to force an image file to be produced. Thus programs that loop unexpectedly may be halted and the remains inspected without prearrangement.

The hardware-generated faults and the interrupt and quit signals can, by request, be either ignored or caught by a process. For example, the shell ignores quits to prevent a quit from logging the user out. The editor catches interrupts and returns to its command level. This is useful for stopping long printouts without losing work in progress (the editor manipulates a copy of the file it is editing). In systems without floating-point hardware, unimplemented instructions are caught and floating-point instructions are interpreted.

VIII. PERSPECTIVE

Perhaps paradoxically, the success of the UNIX system is largely due to the fact that it was not designed to meet any predefined objectives. The first version was written when one of us (Thompson), dissatisfied with the available computer facilities, discovered a little-used PDP-7 and set out to create a more hospitable environment. This (essentially personal) effort was sufficiently successful to gain the interest of the other author and several colleagues, and later to justify the acquisition of the PDP-11/20, specifically to support a text editing and formatting system. When in turn the 11/20 was outgrown, the system had proved useful enough to persuade management to invest in the PDP-11/45, and later in the PDP-11/70 and Interdata 8/32 machines, upon which it developed to its present form.
Our goals throughout the effort, when articulated at all, have always been to build a comfortable relationship with the machine and to explore ideas and inventions in operating systems and other software. We have not been faced with the need to satisfy someone else’s requirements, and for this freedom we are grateful.

Three considerations that influenced the design of UNIX are visible in retrospect.

First: because we are programmers, we naturally designed the system to make it easy to write, test, and run programs. The most important expression of our desire for programming convenience was that the system was arranged for interactive use, even though the original version only supported one user. We believe that a properly designed interactive system is much more productive and satisfying to use than a “batch” system. Moreover, such a system is rather easily adaptable to noninteractive use, while the converse is not true.

Second: there have always been fairly severe size constraints on the system and its software. Given the partially antagonistic desires for reasonable efficiency and expressive power, the size constraint has encouraged not only economy, but also a certain elegance of design. This may be a thinly disguised version of the “salvation through suffering” philosophy, but in our case it worked.

Third: nearly from the start, the system was able to, and did, maintain itself. This fact is more important than it might seem. If designers of a system are forced to use that system, they quickly become aware of its functional and superficial deficiencies and are strongly motivated to correct them before it is too late. Because all source programs were always available and easily modified on-line, we were willing to revise and rewrite the system and its software when new ideas were invented, discovered, or suggested by others.

The aspects of UNIX discussed in this paper exhibit clearly at least the first two of these design considerations.
The interface to the file system, for example, is extremely convenient from a programming standpoint. The lowest possible interface level is designed to eliminate distinctions between the various devices and files and between direct and sequential access. No large “access method” routines are required to insulate the programmer from the system calls; in fact, all user programs either call the system directly or use a small library program, less than a page long, that buffers a number of characters and reads or writes them all at once.

Another important aspect of programming convenience is that there are no “control blocks” with a complicated structure partially maintained by and depended on by the file system or other system calls. Generally speaking, the contents of a program’s address space are the property of the program, and we have tried to avoid placing restrictions on the data structures within that address space.

Given the requirement that all programs should be usable with any file or device as input or output, it is also desirable to push device-dependent considerations into the operating system itself. The only alternatives seem to be to load, with all programs, routines for dealing with each device, which is expensive in space, or to depend on some means of dynamically linking to the routine appropriate to each device when it is actually needed, which is expensive either in overhead or in hardware.

Likewise, the process-control scheme and the command interface have proved both convenient and efficient. Because the shell operates as an ordinary, swappable user program, it consumes no “wired-down” space in the system proper, and it may be made as powerful as desired at little cost.
In particular, given the framework in which the shell executes as a process that spawns other processes to perform commands, the notions of I/O redirection, background processes, command files, and user-selectable system interfaces all become essentially trivial to implement.

Influences

The success of UNIX lies not so much in new inventions but rather in the full exploitation of a carefully selected set of fertile ideas, and especially in showing that they can be keys to the implementation of a small yet powerful operating system.

The fork operation, essentially as we implemented it, was present in the GENIE time-sharing system.⁷ On a number of points we were influenced by Multics, which suggested the particular form of the I/O system calls⁸ and both the name of the shell and its general functions. The notion that the shell should create a process for each command was also suggested to us by the early design of Multics, although in that system it was later dropped for efficiency reasons. A similar scheme is used by TENEX.⁹

IX. STATISTICS

The following numbers are presented to suggest the scale of the Research UNIX operation. Those of our users not involved in document preparation tend to use the system for program development, especially language work. There are few important “applications” programs.

Overall, we have today:

    125       user population
    33        maximum simultaneous users
    1,630     directories
    28,300    files
    301,700   512-byte secondary storage blocks used

There is a “background” process that runs at the lowest possible priority; it is used to soak up any idle CPU time. It has been used to produce a million-digit approximation to the constant e, and other semi-infinite problems. Not counting this background work, we average daily:

    13,500    commands
    9.6       CPU hours
    230       connect hours
    62        different users
    240       log-ins

X. ACKNOWLEDGMENTS

The contributors to UNIX are, in the traditional but here especially apposite phrase, too numerous to mention.
Certainly, collective salutes are due to our colleagues in the Computing Science Research Center. R. H. Canaday contributed much to the basic design of the file system. We are particularly appreciative of the inventiveness, thoughtful criticism, and constant support of R. Morris, M. D. McIlroy, and J. F. Ossanna.

References

1. L. P. Deutsch and B. W. Lampson, “An online editor,” Comm. Assoc. Comp. Mach., vol. 10, no. 12, pp. 793-799, 803, December 1967.
2. B. W. Kernighan and L. L. Cherry, “A System for Typesetting Mathematics,” Comm. Assoc. Comp. Mach., vol. 18, pp. 151-157, Bell Laboratories, Murray Hill, New Jersey, March 1975.
3. This issue, B. W. Kernighan, M. E. Lesk, and J. F. Ossanna, “UNIX Time-Sharing System: Document Preparation,” Bell Sys. Tech. J., vol. 57, no. 6, pp. 2115-2135, 1978.
4. T. A. Dolotta and J. R. Mashey, “An Introduction to the Programmer’s Workbench,” Proc. 2nd Int. Conf. on Software Engineering, pp. 164-168, October 13-15, 1976.
5. B. W. Kernighan and D. M. Ritchie, The C Programming Language, Prentice-Hall, Englewood Cliffs, New Jersey, 1978.
6. Aleph-null, “Computer Recreations,” Software Practice and Experience, vol. 1, no. 2, pp. 201-204, April-June 1971.
7. L. P. Deutsch and B. W. Lampson, “SDS 930 time-sharing system preliminary reference manual,” Doc. 30.10.10, Project GENIE, Univ. Cal. at Berkeley, April 1965.
8. R. J. Feiertag and E. I. Organick, “The Multics input-output system,” Proc. Third Symposium on Operating Systems Principles, pp. 35-41, October 18-20, 1971.
9. D. G. Bobrow, J. D. Burchfiel, D. L. Murphy, and R. S. Tomlinson, “TENEX, a Paged Time Sharing System for the PDP-10,” Comm. Assoc. Comp. Mach., vol. 15, no. 3, pp. 135-143, March 1972.

Introduction

PART 2: GETTING STARTED

The following four articles will help you begin using the ULTRIX-32 system quickly and productively. “UNIX for Beginners,” by Kernighan, is for all beginners; it’s essential.
Be sure to read this article before going on to anything else in the ULTRIX-32 system. The article on mail comes next in importance, since the mail utility lets you exchange messages with other people using the system. And the articles on the bc and dc desk calculator utilities will get you started using some of the interactive math capabilities of the ULTRIX-32 system.

UNIX for Beginners

This article explains ULTRIX-32 system concepts and tells how to use the major features of the software system. If you want to get going fast, log in to an ULTRIX-32 system, and experiment with the commands shown in the examples as you read along. The article introduces:

• Using dial-up and hard-wired terminals to communicate with ULTRIX-32 (UNIX)
• Logging in
• Using simple commands and command options
• Creating, printing, and displaying files
• Listing directory contents
• Finding your way through directory hierarchies
• Using scripts to automate command sequences
• Redirecting process output to files instead of to a terminal
• Using pipes to coordinate and combine tasks
• Using the text formatting packages
• Preparing a bibliography
• Searching files for a character string
• Programming in C and other languages: guidelines

While not up-to-date, the UNIX reading list supplied at the end of the article is useful; many of the items referenced are included in this document set.

NOTE

ULTRIX-32 implements some commands differently from the ways explained in the article. Specifically:

    CTRL/C                The default interrupt command.
    CTRL/U                The default delete line command.
    <delete character>    The default delete command.

The “Mail Reference Manual,” by Shoens, offers a tutorial format, like “UNIX for Beginners.” It tells you how to use each feature of the mail utility, including:

• Sending and receiving messages
• Saving or disposing of old messages
• Maintaining message folders
• Leaving and reentering the mail utility in the middle of a job
• Sending mail across a network
• Using aliases to simplify message distribution

In addition, the article on mail is a complete reference manual. It defines all mail commands, custom options, command-line options, and the standard message format. Mail is the default mailer for C Shell users.

Desk Calculator Utilities

ULTRIX-32 offers two desk calculator utilities: bc and dc. Both utilities can take input from the keyboard and from program files, and both perform mathematical functions. Bc is easier to use than dc, however, because it operates at a higher programming level than dc. Bc allows you to enter data and commands in a conventional format similar to the formats of BASIC and C.

The article entitled “BC - An Arbitrary Precision Desk-Calculator Language,” by Cherry and Morris, gives rules for using bc and some good examples. It explains bc:

• Math capabilities
• Precision capabilities
• Function definition and use
• One dimensional arrays
• Flow control
• Operator symbols consistent with C
• Library functions for trigonometry, logarithms, exponentiation, and Bessel functions

“DC - An Interactive Desk Calculator,” also by Cherry and Morris, lists the rules and functions of the dc utility, but examples are few. The article explains the use of a push-down stack for calculations and data manipulation. Only data stored on the stack is available for operations. The authors list commands and programming features and explain the internal representation and manipulation of numbers.

The bc utility is layered on dc: dc interprets the output of the bc compiler.
This relationship is transparent to users, but significant if you are choosing between the two utilities. Bc is the practical choice for most users, because it really does resemble a desk calculator; dc is closer to an assembly language than a calculator, and as such it is a tool for sophisticated users.

UNIX For Beginners - Second Edition

Brian W. Kernighan
Bell Laboratories
Murray Hill, New Jersey 07974

INTRODUCTION

From the user’s point of view, the UNIX operating system is easy to learn and use, and presents few of the usual impediments to getting the job done. It is hard, however, for the beginner to know where to start, and how to make the best use of the facilities available. The purpose of this introduction is to help new users get used to the main ideas of the UNIX system and start making effective use of it quickly.

You should have a couple of other documents with you for easy reference as you read this one. The most important is The UNIX Programmer’s Manual; it’s often easier to tell you to read about something in the manual than to repeat its contents here. The other useful document is A Tutorial Introduction to the UNIX Text Editor, which will tell you how to use the editor to get text (programs, data, documents) into the computer.

A word of warning: the UNIX system has become quite popular, and there are several major variants in widespread use. Of course details also change with time. So although the basic structure of UNIX and how to use it is common to all versions, there will certainly be a few things which are different on your system from what is described here. We have tried to minimize the problem, but be aware of it. In cases of doubt, this paper describes Version 7 UNIX.

This paper has five sections:

1. Getting Started: How to log in, how to type, what to do about mistakes in typing, how to log out. Some of this is dependent on which system you log into (phone numbers, for example) and what terminal you use, so this section must necessarily be supplemented by local information.

2. Day-to-day Use: Things you need every day to use the system effectively: generally useful commands; the file system.

3. Document Preparation: Preparing manuscripts is one of the most common uses for UNIX systems. This section contains advice, but not extensive instructions on any of the formatting tools.

4. Writing Programs: UNIX is an excellent system for developing programs. This section talks about some of the tools, but again is not a tutorial in any of the programming languages provided by the system.

5. A UNIX Reading List. An annotated bibliography of documents that new users should be aware of.

I. GETTING STARTED

Logging In

You must have a UNIX login name, which you can get from whoever administers your system. You also need to know the phone number, unless your system uses permanently connected terminals. The UNIX system is capable of dealing with a wide variety of terminals: Terminet 300’s; Execuport, TI and similar portables; video (CRT) terminals like the HP2640, etc.; high-priced graphics terminals like the Tektronix 4014; plotting terminals like those from GSI and DASI; and even the venerable Teletype in its various forms. But note: UNIX is strongly oriented towards devices with lower case. If your terminal produces only upper case (e.g., model 33 Teletype, some video and portable terminals), life will be so difficult that you should look for another terminal.

Be sure to set the switches appropriately on your device. Switches that might need to be adjusted include the speed, upper/lower case mode, full duplex, even parity, and any others that local wisdom advises. Establish a connection using whatever magic is needed for your terminal; this may involve dialing a telephone call or merely flipping a switch. In either case, UNIX should type “login:” at you. If it types garbage, you may be at the wrong speed; check the switches. If that fails, push the “break” or “interrupt” key a few times, slowly. If that fails to produce a login message, consult a guru.

When you get a login: message, type your login name in lower case. Follow it by a RETURN; the system will not do anything until you type a RETURN. If a password is required, you will be asked for it, and (if possible) printing will be turned off while you type it. Don’t forget RETURN.

The culmination of your login efforts is a “prompt character,” a single character that indicates that the system is ready to accept commands from you. The prompt character is usually a dollar sign $ or a percent sign %. (You may also get a message of the day just before the prompt character, or a notification that you have mail.)

Typing Commands

Once you’ve seen the prompt character, you can type commands, which are requests that the system do something. Try typing

    date

followed by RETURN. You should get back something like

    Mon Jan 16 14:17:10 EST 1978

Don’t forget the RETURN after the command, or nothing will happen. If you think you’re being ignored, type a RETURN; something should happen. RETURN won’t be mentioned again, but don’t forget it: it has to be there at the end of each line.

Another command you might try is who, which tells you everyone who is currently logged in:

    who

gives something like

    mb     tty01   Jan 16  09:11
    ski    tty05   Jan 16  09:33
    gam    tty11   Jan 16  13:07

The time is when the user logged in; “ttyxx” is the system’s idea of what terminal the user is on.

If you make a mistake typing the command name, and refer to a non-existent command, you will be told. For example, if you type

    whom

you will be told

    whom: not found

Of course, if you inadvertently type the name of some other command, it will run, with more or less mysterious results.

Strange Terminal Behavior

Sometimes you can get into a state where your terminal acts strangely. For example, each letter may be typed twice, or the RETURN may not cause a line feed or a return to the left margin. You can often fix this by logging out and logging back in. Or you can read the description of the command stty in section 1 of the manual. To get intelligent treatment of tab characters (which are much used in UNIX) if your terminal doesn’t have tabs, type the command

    stty -tabs

and the system will convert each tab into the right number of blanks for you. If your terminal does have computer-settable tabs, the command

    tabs

will set the stops correctly for you.

Mistakes In Typing

If you make a typing mistake, and see it before RETURN has been typed, there are two ways to recover. The sharp-character # erases the last character typed; in fact successive uses of # erase characters back to the beginning of the line (but not beyond). So if you type badly, you can correct as you go:

    dd#atte##e

is the same as date.

The at-sign @ erases all of the characters typed so far on the current input line, so if the line is irretrievably fouled up, type an @ and start the line over.

What if you must enter a sharp or at-sign as part of the text? If you precede either # or @ by a backslash \, it loses its erase meaning. So to enter a sharp or at-sign in something, type \# or \@. The system will always echo a newline at you after your at-sign, even if preceded by a backslash. Don’t worry: the at-sign has been recorded.

To erase a backslash, you have to type two sharps or two at-signs, as in \##. The backslash is used extensively in UNIX to indicate that the following character is in some way special.

Read-ahead

UNIX has full read-ahead, which means that you can type as fast as you want, whenever you want, even when some command is typing at you. If you type during output, your input characters will appear intermixed with the output characters, but they will be stored away and interpreted in the correct order. So you can type several commands one after another without waiting for the first to finish or even begin.

Stopping a Program

You can stop most programs by typing the character “DEL” (perhaps called “delete” or “rubout” on your terminal). The “interrupt” or “break” key found on most terminals can also be used. In a few programs, like the text editor, DEL stops whatever the program is doing but leaves you in that program. Hanging up the phone will stop most programs.

Logging Out

The easiest way to log out is to hang up the phone. You can also type

    login

and let someone else use the terminal you were on. It is usually not sufficient just to turn off the terminal. Most UNIX systems do not use a time-out mechanism, so you’ll be there forever unless you hang up.

Mail

When you log in, you may sometimes get the message

    You have mail.

UNIX provides a postal system so you can communicate with other users of the system. To read your mail, type the command

    mail

Your mail will be printed, one message at a time, most recent message first. After each message, mail waits for you to say what to do with it. The two basic responses are d, which deletes the message, and RETURN, which does not (so it will still be there the next time you read your mailbox). Other responses are described in the manual. (Earlier versions of mail do not process one message at a time, but are otherwise similar.)

How do you send mail to someone else? Suppose it is to go to “joe” (assuming “joe” is someone’s login name). The easiest way is this:

    mail joe
    now type in the text of the letter
    on as many lines as you like ...
    After the last line of the letter
    type the character “control-d”,
    that is, hold down “control” and type
    a letter “d”.

And that’s it. The “control-d” sequence, often called “EOF” for end-of-file, is used throughout the system to mark the end of input from a terminal, so you might as well get used to it.

For practice, send mail to yourself. (This isn’t as strange as it might sound; mail to oneself is a handy reminder mechanism.)

There are other ways to send mail: you can send a previously prepared letter, and you can mail to a number of people all at once. For more details see mail(1). (The notation mail(1) means the command mail in section 1 of the UNIX Programmer’s Manual.)

Writing to other users

At some point, out of the blue will come a message like

    Message from joe tty07...

accompanied by a startling beep. It means that Joe wants to talk to you, but unless you take explicit action you won’t be able to talk back. To respond, type the command

    write joe

This establishes a two-way communication path. Now whatever Joe types on his terminal will appear on yours and vice versa. The path is slow, rather like talking to the moon. (If you are in the middle of something, you have to get to a state where you can type a command. Normally, whatever program you are running has to terminate or be terminated. If you’re editing, you can escape temporarily from the editor; read the editor tutorial.)

A protocol is needed to keep what you type from getting garbled up with what Joe types. Typically it’s like this:

    Joe types write smith and waits.
    Smith types write joe and waits.
    Joe now types his message (as many lines as he likes). When he’s ready for a reply, he signals it by typing (o), which stands for “over”.
    Now Smith types a reply, also terminated by (o).
    This cycle repeats until someone gets tired; he then signals his intent to quit with (oo), for “over and out”.
    To terminate the conversation, each side must type a “control-d” character alone on a line. (“Delete” also works.) When the other person types his “control-d”, you will get the message EOF on your terminal.

If you write to someone who isn’t logged in, or who doesn’t want to be disturbed, you’ll be told. If the target is logged in but doesn’t answer after a decent interval, simply type “control-d”.
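
The end-of-input convention described under Mail (the “control-d” that ends a letter) applies to any command that reads from the terminal, and it can be tried without sending anything to anyone. The following is only a sketch under stated assumptions: it uses a here-document, a feature of present-day POSIX shells, to stand in for lines typed at a terminal, and the marker name END_OF_LETTER and the variable line_count are made up for illustration. It is not how mail itself is implemented.

```shell
#!/bin/sh
# Sketch only: the lines between the two END_OF_LETTER markers play the
# part of text typed at a terminal. Reaching the closing marker ends the
# input, much as typing control-d does when you enter a letter by hand.
# END_OF_LETTER and line_count are illustrative names, not part of UNIX.

line_count=0

# read returns a failure status when its input is exhausted (end-of-file),
# which is what ends the loop.
while IFS= read -r line; do
    line_count=$((line_count + 1))
done <<'END_OF_LETTER'
Joe,
Lunch at noon?
END_OF_LETTER

echo "lines read before end of input: $line_count"
```

The loop never looks for a special character in the text; it stops only because there is no more input, which is the sense in which “control-d” marks the end of input rather than being part of the letter.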
On-line Manual

The UNIX Programmer’s Manual is typically kept on-line. If you get stuck on something, and can’t find an expert to assist you, you can print on your terminal some manual section that might help. This is also useful for getting the most up-to-date information on a command. To print a manual section, type “man command-name”. Thus to read up on the who command, type

    man who

and, of course,

    man man

tells all about the man command.

Computer Aided Instruction

Your UNIX system may have available a program called learn, which provides computer aided instruction on the file system and basic commands, the editor, document preparation, and even C programming. Try typing the command

    learn

If learn exists on your system, it will tell you what to do from there.

II. DAY-TO-DAY USE

Creating Files - The Editor

If you have to type a paper or a letter or a program, how do you get the information stored in the machine? Most of these tasks are done with the UNIX “text editor” ed. Since ed is thoroughly documented in ed(1) and explained in A Tutorial Introduction to the UNIX Text Editor, we won’t spend any time here describing how to use it. All we want it for right now is to make some files. (A file is just a collection of information stored in the machine, a simplistic but adequate definition.)

To create a file called junk with some text in it, do the following:

    ed junk      (invokes the text editor)
    a            (command to “ed”, to add text)
    now type in
    whatever text you want ...
    .            (signals the end of adding text)

The “.” that signals the end of adding text must be at the beginning of a line by itself. Don’t forget it, for until it is typed, no other ed commands will be recognized; everything you type will be treated as text to be added.

At this point you can do various editing operations on the text you typed in, such as correcting spelling mistakes, rearranging paragraphs and the like. Finally, you must write the information you have typed into a file with the editor command w:

    w

ed will respond with the number of characters it wrote into the file junk.

Until the w command, nothing is stored permanently, so if you hang up and go home the information is lost.† But after w the information is there permanently; you can re-access it any time by typing

    ed junk

Type a q command to quit the editor. (If you try to quit without writing, ed will print a ? to remind you. A second q gets you out regardless.)

† This is not strictly true: if you hang up while editing, the data you were working on is saved in a file called ed.hup, which you can continue with at your next session.

Now create a second file called temp in the same manner. You should now have two files, junk and temp.

What files are out there?

The ls (for “list”) command lists the names (not contents) of any of the files that UNIX knows about. If you type

    ls

the response will be

    junk
    temp

which are indeed the two files just created. The names are sorted into alphabetical order automatically, but other variations are possible. For example, the command

    ls -t

causes the files to be listed in the order in which they were last changed, most recent first. The -l option gives a “long” listing:

    ls -l

will produce something like

    -rw-rw-rw- 1 bwk 41 Jul 22 2:56 junk
    -rw-rw-rw- 1 bwk 78 Jul 22 2:57 temp

The date and time are of the last change to the file. The 41 and 78 are the number of characters (which should agree with the numbers you got from ed). bwk is the owner of the file, that is, the person who created it. The -rw-rw-rw- tells who has permission to read and write the file, in this case everyone.

Options can be combined: ls -lt gives the same thing as ls -l, but sorted into time order. You can also name the files you’re interested in, and ls will list the information about them only. More details can be found in ls(1).

The use of optional arguments that begin with a minus sign, like -t and -lt, is a common convention for UNIX programs. In general, if a program accepts such optional arguments, they precede any filename arguments. It is also vital that you separate the various arguments with spaces: ls-l is not the same as ls -l.

Printing Files

Now that you’ve got a file of text, how do you print it so people can look at it? There are a host of programs that do that, probably more than are needed.

One simple thing is to use the editor, since printing is often done just before making changes anyway. You can say

    ed junk
    1,$p

ed will reply with the count of the characters in junk and then print all the lines in the file. After you learn how to use the editor, you can be selective about the parts you print.

There are times when it’s not feasible to use the editor for printing. For example, there is a limit on how big a file ed can handle (several thousand lines). Secondly, it will only print one file at a time, and sometimes you want to print several, one after another. So here are a couple of alternatives.

First is cat, the simplest of all the printing programs. cat simply prints on the terminal the contents of all the files named in a list. Thus

    cat junk

prints one file, and

    cat junk temp

prints two. The files are simply concatenated (hence the name “cat”) onto the terminal.

pr produces formatted printouts of files. As with cat, pr prints all the files named in a list. The difference is that it produces headings with date, time, page number and file name at the top of each page, and extra lines to skip over the fold in the paper. Thus,

    pr junk temp

will print junk neatly, then skip to the top of a new page and print temp neatly.

pr can also produce multi-column output:

    pr -3 junk

prints junk in 3-column format. You can use any reasonable number in place of “3” and pr will do its best. pr has other capabilities as well; see pr(1).

It should be noted that pr is not a formatting program in the sense of shuffling lines around and justifying margins. The true formatters are nroff and troff, which we will get to in the section on document preparation.

There are also programs that print files on a high-speed printer. Look in your manual under opr and lpr. Which to use depends on what equipment is attached to your machine.

Shuffling Files About

Now that you have some files in the file system and some experience in printing them, you can try bigger things. For example, you can move a file from one place to another (which amounts to giving it a new name), like this:

    mv junk precious

This means that what used to be “junk” is now “precious”. If you do an ls command now, you will get

    precious
    temp

Beware that if you move a file to another one that already exists, the already existing contents are lost forever.

If you want to make a copy of a file (that is, to have two versions of something), you can use the cp command:

    cp precious temp1

makes a duplicate copy of precious in temp1.

Finally, when you get tired of creating and moving files, there is a command to remove files from the file system, called rm.

    rm temp temp1

will remove both of the files named.

You will get a warning message if one of the named files wasn’t there, but otherwise rm, like most UNIX commands, does its work silently. There is no prompting or chatter, and error messages are occasionally curt. This terseness is sometimes disconcerting to newcomers, but experienced users find it desirable.

What’s in a Filename

So far we have used filenames without ever saying what’s a legal name, so it’s time for a couple of rules. First, filenames are limited to 14 characters, which is enough to be descriptive. Second, although you can use almost any character in a filename, common sense says you should stick to ones that are visible, and that you should probably avoid characters that might be used with other meanings.
We have already seen, for example, that in the ls command, ls -t means to list in time order. So if you had a file whose name was -t, you would have a tough time listing it by name. Besides the minus sign, there are other characters which have special meaning. To avoid pitfalls, you would do well to use only letters, numbers and the period until you’re familiar with the situation.

On to some more positive suggestions. Suppose you’re typing a large document like a book. Logically this divides into many small pieces, like chapters and perhaps sections. Physically it must be divided too, for ed will not handle really big files. Thus you should type the document as a number of files. You might have a separate file for each chapter, called

    chap1
    chap2
    etc...

Or, if each chapter were broken into several files, you might have

    chap1.1
    chap1.2
    chap1.3
    ...
    chap2.1
    chap2.2
    ...

You can now tell at a glance where a particular file fits into the whole.

There are advantages to a systematic naming convention which are not obvious to the novice UNIX user. What if you wanted to print the whole book? You could say

    pr chap1.1 chap1.2 chap1.3 ......

but you would get tired pretty fast, and would probably even make mistakes. Fortunately, there is a shortcut. You can say

    pr chap*

The * means “anything at all,” so this translates into “print all files whose names begin with chap”, listed in alphabetical order.

This shorthand notation is not a property of the pr command, by the way. It is system-wide, a service of the program that interprets commands (the “shell,” sh(1)). Using that fact, you can see how to list the names of the files in the book:

    ls chap*

produces

    chap1.1
    chap1.2
    chap1.3
    ...

The * is not limited to the last position in a filename; it can be anywhere and can occur several times. Thus

    rm *junk* *temp*

removes all files that contain junk or temp as any part of their name. As a special case, * by itself matches every filename, so

    pr *

prints all your files (alphabetical order), and

    rm *

removes all files. (You had better be very sure that’s what you wanted to say!)

The * is not the only pattern-matching feature available. Suppose you want to print only chapters 1 through 4 and 9. Then you can say

    pr chap[12349]*

The [...] means to match any of the characters inside the brackets. A range of consecutive letters or digits can be abbreviated, so you can also do this with

    pr chap[1-49]*

Letters can also be used within brackets: [a-z] matches any character in the range a through z.

The ? pattern matches any single character, so

    ls ?

lists all files which have single-character names, and

    ls -l chap?.1

lists information about the first file of each chapter (chap1.1, chap2.1, etc.).

Of these niceties, * is certainly the most useful, and you should get used to it. The others are frills, but worth knowing.

If you should ever have to turn off the special meaning of *, ?, etc., enclose the entire argument in single quotes, as in

    ls '?'

We’ll see some more examples of this shortly.

What’s in a Filename, Continued

When you first made that file called junk, how did the system know that there wasn’t another junk somewhere else, especially since the person in the next office is also reading this tutorial? The answer is that generally each user has a private directory, which contains only the files that belong to him. When you log in, you are “in” your directory. Unless you take special action, when you create a new file, it is made in the directory that you are currently in; this is most often your own directory, and thus the file is unrelated to any other file of the same name that might exist in someone else’s directory.

The set of all files is organized into a (usually big) tree, with your files located several branches into the tree. It is possible for you to “walk” around this tree, and to find any file in the system, by starting at the root of the tree and walking along the proper set of branches. Conversely, you can start where you are and walk toward the root.

Let’s try the latter first. The basic tool is the command pwd (“print working directory”), which prints the name of the directory you are currently in.

Although the details will vary according to the system you are on, if you give the command pwd, it will print something like

    /usr/your-name

This says that you are currently in the directory your-name, which is in turn in the directory /usr, which is in turn in the root directory called by convention just /. (Even if it’s not called /usr on your system, you will get something analogous. Make the corresponding changes and read on.)

If you now type

    ls /usr/your-name

you should get exactly the same list of file names as you get from a plain ls: with no arguments, ls lists the contents of the current directory; given the name of a directory, it lists the contents of that directory.

Next, try

    ls /usr

This should print a long series of names, among which is your own login name your-name. On many systems, usr is a directory that contains the directories of all the normal users of the system, like you.

The next step is to try

    ls /

You should get a response something like this (although again the details may be different):

    bin
    dev
    etc
    lib
    tmp
    usr

This is a collection of the basic directories of files that the system knows about; we are at the root of the tree.

Now try

    cat /usr/your-name/junk

(if junk is still around in your directory). The name

    /usr/your-name/junk

is called the pathname of the file that you normally think of as “junk”. “Pathname” has an obvious meaning: it represents the full name of the path you have to follow from the root through the tree of directories to get to a particular file. It is a universal rule in the UNIX system that anywhere you can use an ordinary filename, you can use a pathname.

Here is a picture which may make this clearer:

                     (root)
            /     /    |    \     \
          bin   etc   usr   dev   tmp
                    /  |  \
                adam  eve  mary
                     /   \     \
                  junk   temp   junk

Notice that Mary’s junk is unrelated to Eve’s.

This isn’t too exciting if all the files of interest are in your own directory, but if you work with someone else or on several projects concurrently, it becomes handy indeed. For example, your friends can print your book by saying

    pr /usr/your-name/chap*

Similarly, you can find out what files your neighbor has by saying

    ls /usr/neighbor-name

or make your own copy of one of his files by

    cp /usr/your-neighbor/his-file yourfile

If your neighbor doesn’t want you poking around in his files, or vice versa, privacy can be arranged. Each file and directory has read-write-execute permissions for the owner, a group, and everyone else, which can be set to control access. See ls(1) and chmod(1) for details. As a matter of observed fact, most users most of the time find openness of more benefit than privacy.

As a final experiment with pathnames, try

    ls /bin /usr/bin

Do some of the names look familiar? When you run a program, by typing its name after the prompt character, the system simply looks for a file of that name. It normally looks first in your directory (where it typically doesn’t find it), then in /bin and finally in /usr/bin. There is nothing magic about commands like cat or ls, except that they have been collected into a couple of places to be easy to find and administer.

What if you work regularly with someone else on common information in his directory? You could just log in as your friend each time you want to, but you can also say “I want to work on his files instead of my own”. This is done by changing the directory that you are currently in:

    cd /usr/your-friend

(On some systems, cd is spelled chdir.) Now when you use a filename in something like cat or pr, it refers to the file in your friend’s directory. Changing directories doesn’t affect any permissions associated with a file: if you couldn’t access a file from your own directory, changing to another directory won’t alter that fact. Of course, if you forget what directory you’re in, type

    pwd

to find out.

It is usually convenient to arrange your own files so that all the files related to one thing are in a directory separate from other projects. For example, when you write your book, you might want to keep all the text in a directory called book. So make one with

    mkdir book

then go to it with

    cd book

then start typing chapters. The book is now found in (presumably)

    /usr/your-name/book

To remove the directory book, type

    rm book/*
    rmdir book

The first command removes all files from the directory; the second removes the empty directory.

You can go up one level in the tree of files by saying

    cd ..

“..” is the name of the parent of whatever directory you are currently in. For completeness, “.” is an alternate name for the directory you are in.

Using Files instead of the Terminal

Most of the commands we have seen so far produce output on the terminal; some, like the editor, also take their input from the terminal. It is universal in UNIX systems that the terminal can be replaced by a file for either or both of input and output. As one example,

    ls

makes a list of files on your terminal. But if you say

    ls >filelist

a list of your files will be placed in the file filelist (which will be created if it doesn’t already exist, or overwritten if it does). The symbol > means “put the output on the following file, rather than on the terminal.” Nothing is produced on the terminal. As another example, you could combine several files into one by capturing the output of cat in a file:

    cat f1 f2 f3 >temp

The symbol >> operates very much like > does, except that it means “add to the end of.” That is,

    cat f1 f2 f3 >>temp

means to concatenate f1, f2 and f3 to the end of whatever is already in temp, instead of overwriting the existing contents. As with >, if temp doesn’t exist, it will be created for you.

In a similar way, the symbol < means to take the input for a program from the following file, instead of from the terminal. Thus, you could make up a script of commonly used editing commands and put them into a file called script. Then you can run the script on a file by saying

    ed file <script

As another example, you can use ed to prepare a letter in file let, then send it to several people with

    mail adam eve mary joe <let

Pipes

One of the novel contributions of the UNIX system is the idea of a pipe. A pipe is simply a way to connect the output of one program to the input of another program, so the two run as a sequence of processes: a pipeline.

For example,

    pr f g h

will print the files f, g, and h, beginning each on a new page. Suppose you want them run together instead. You could say

    cat f g h >temp
    pr <temp
    rm temp

but this is more work than necessary. Clearly what we want is to take the output of cat and connect it to the input of pr. So let us use a pipe:

    cat f g h | pr

The vertical bar | means to take the output from cat, which would normally have gone to the terminal, and put it into pr to be neatly formatted.

There are many other examples of pipes. For example,

    ls | pr -3

prints a list of your files in three columns. The program wc counts the number of lines, words and characters in its input, and as we saw earlier, who prints a list of currently-logged on people, one per line. Thus

    who | wc

tells how many people are logged on. And of course

    ls | wc

counts your files.

Any program that reads from the terminal can read from a pipe instead; any program that writes on the terminal can drive a pipe. You can have as many elements in a pipeline as you wish.

The Shell

We have already mentioned once or twice the mysterious “shell,” which is in fact sh(1). The shell is the program that interprets what you type as commands and arguments. It also looks after translating *, etc., into lists of filenames, and <, >, and | into changes of input and output streams.

The shell has other capabilities too. For example, you can run two programs with one command line by separating the commands with a semicolon; the shell recognizes the semicolon and breaks the line into two commands. Thus

    date; who

does both commands before returning with a prompt character.

You can also have more than one program running simultaneously if you wish. For example, if you are doing something time-consuming, like the editor script of an earlier section, and you don’t want to wait around for the results before starting something else, you can say

    ed file <script &

The ampersand at the end of a command line says “start this command running, then take further commands from the terminal immediately,” that is, don’t wait for it to complete. Thus the script will begin, but you can do something else at the same time. Of course, to keep the output from interfering with what you’re doing on the terminal, it would be better to say

    ed file <script >script.out &

which saves the output lines in a file called script.out.

When you initiate a command with &, the system replies with a number called the process number, which identifies the command in case you later want to stop it. If you do, you can say

If you forget the process number, the command
ps will tell you about everything vou nave run- Many UNIX programs are written so that they will take their input from one or more files if file arguments are given:; if no arguments are given they will read {rom the termiinal, and thus can be used in pipelines. pr is one example: pr =3 abe prints files a, b and ¢ in order in three columns. But in [ ] catabecipr =3 pr prints the information coming down the pipeline, still in three columns. kill process-number ning. (If you are desperate, kill0 will kiil all your processes.) And if you're curious about other people, ps a will tell you about a#f pro- grams that are currently running. You can say (command-1: command-2; command-3) & 1o start thres commands in the background. or you can start a background pipeline with command-1 | command-2 % Just as you can tell the editor or some simi- 2-12 UNIX For Beginners lar program to take its input from a file instead of from the terminal, you can tell the shell to read a file to get commands. (Why not? The shell, after all, is just a program, albeit a clever one.) For instance, suppose you want to set tabs on your terminal, and find out the date and who's on the system every time you log in. Because nroff and troff are relatively hard to learn to use effectively, several *‘packages’’ of canned formatting requests are available to let you specily paragraphs, running titles, footnotes, multi-column output, and so on, with little effort and without having to learn nroff and troff. These packages take a modest effort to learn, but Then you can put the three necessary commands the rewards for using them are so great that it is (tabs, date, who) into a file, let’s call it startup, time well spent. and then run it with In this section, we will provide a hasty look at the **manuscript’” package known as sh startup This says to run the shell with the file startup as input. The effect is as if you had typed the contents of startup on the terminal. 
and two upper-case letters, such as .TL, which is used to introduce a title, or .PP to begin a new paragraph. If this is to be a regular thing, you can eliminate the need to type sh: simply type, once only, the command A document is typed so it looks something like this: .TL chmod +x startup title of document AU and thereafter you need only say author name startup to run the SH sequence of commands. section heading The PP chmod(l) command marks the file executable; the shell recognizes this and runs it as paragraph ... a PP sequence of commands. If you want to the shell in the section on programming. [1I. DOCUMENT PREPARATION UNIX systems are used extensively for docu- ment preparation. another paragraph ... startup to run automatically every time you log in, create a file in your login directory called .profile, and place in it the line startup. When the shell first gains control when you log in, it looks for the .profile file and does whatever commands it finds in it. We'll get back There are two major format- ting programs, that is, programs that produce a text with justified right margins, automatic page numbering and titling, automatic and the like. —ms. Formatting requests typically consist of a period hyphenation, nroff is designed to produce output SH another section heading PP etc. The lines that begin with a period are the formatting requests. For example, .PP calls for starting a new paragraph. The precise meaning of .PP depends on what output device is being used (typesetter or terminal, for instance), and on what publication the document will appear in. For example, —ms normally assumes that a paragraph is preceded by a space (one line in nroff, "2 line in troff), and the first word is indented. These rules can be changed if you like, but they are changed by changing the on terminals and line-printers. 
troff (pronounced ‘‘tee-roff’’) instead drives a photo- interpretation of .PP, not by re-typing the docu- typesetter, which produces very high quality out- To actually produce a document in standard format using —ms, use the command put on photographic paper. This paper was for- matted with troff. troff —ms files ... Formatting Packages for the typesetter, and The basic idea of nroff and troff is that the text to be formatted contains within it ‘‘format- (ing commands’’ that indicate in detail how the formatted text is to look. ment. For example, there might be commands that specify how long lines are, wnetwner to use singie or Yl doupie spacing, and o o S Rl Nl | oA IQQ what running titles 1o use on each page. nroff —ms files ... for a terminal. The —ms argument tells troff and aroff to use the manuscript package of formatting requests. T There are cavusEa ) several similar packages; chec with a local expert to determine which ones are in common use on your machine. UNIX For Beginners Supporting T ools we counts the words, lines and characters in [n addition to the basic formatters, thers is a host of supporting programs that help with docu- ment preparation. 1ne list in the next few paragraphs is far {rom complete, so browse through a set of files. tr translates characters into other characters; for example it will convert upper (o lower case and vice versa. This translates upper into lower: the manual and check with people around you tr A=Z a—z <input >output for other possibilities. eqn and neqn le¢ you integrate mathernatics into the text of a document, in an easy-io-learn language that closely resembles the way you would speak it aloud. For example, the egn input | sum from (=0 tou x sub i " =" pi over 2 por 2 necessary to align complicated ¢columns with elements of varying widths. refer prepares hibliographic citations from a data base, in whatever style is defined by the formatting package. 
It looks after all the details of aumbering refecences in sequence, flling in page and volume numbers, getting the author’s initials ang the journal aame right, and so on. spell and typo detect possible spelling misspell works by comparing in your document to a dictionary, printing those that are not in the dictionary. It knows enougi about English spelling to detect plurals and the like, so it does a very good ob. typo looks {or words which are ‘‘unusual’, and Spelling mistakes tend to be more unusual, and thus show up early when the maost unusual words are printed first. grep looks through a set of files for lines that contain a particular text pattern (rather like the editor’s context search does, but on a bunch of files). For example, grep 'ingd chap® will find all lines that end with the letters ing in the files chap®. ice (0 put (It is almost always a good prac- single -'.-‘.l them to arbitrarcily long inputs. quotes awk provides the ability to do both pattern matching and numesic computations, and to conveniently procass feids lines. These users. and programs are for they not limited are more (o Put them on your list of things to learn about. comiputations takes in 2 document. sed provides many of the editing facilities of ed, dut can apply document preparation. The program tbl provides an analogous service for preparing tabular material; it does all the prints those. index (keyword-in-context listing). advanced ” words sort sorts flles in a variety of ways: cref makes cross-references: pex makes a permuted within sroduces the output the 2-13 around 0 B oam om0 am wd a2 s ood the Bl patiern mnem 0D o emam you're searching for, in case it contains characters like ® or 3 that have a special meaning to the sheil.) grep is oftea useful for finding out in whichi of a set of files the misspeiled words detected by spell are actually located. 
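As a rough illustration of how these tools fit together, here is a minimal sketch. The scratch directory and the two sample chapter files are invented for the demonstration; only grep, tr, sort and uniq themselves come from the descriptions above, and the -c and -s options to tr are assumed to be available on your system.

```shell
# Hypothetical scratch directory and sample chapters, created only for this demo.
demo=$(mktemp -d)
printf 'Typing is easy.\nEditing is ongoing\n' > "$demo/chap1"
printf 'Nothing ends here\nStill formatting\n' > "$demo/chap2"

# grep: count the lines that end with the letters "ing".
# The single quotes keep the shell from interpreting the $.
ing_lines=$(cat "$demo"/chap* | grep -c 'ing$')

# tr: translate upper case into lower case.
lowered=$(tr A-Z a-z < "$demo/chap1")

# tr + sort + uniq: put each word on its own line, fold case,
# sort into dictionary order, and discard duplicates.
words=$(cat "$demo"/chap* | tr -cs 'A-Za-z' '\n' | tr A-Z a-z | sort | uniq)

echo "lines ending in ing: $ing_lines"
echo "$words"

# Clean up the scratch files.
rm -r "$demo"
```

Replacing the sample files with your own chapters gives a quick word list for a document.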
diff prints a list of the differences between two files, so you can compare two versions of something automatically (which certainly beats proofreading by hand).

Most of these programs are either independently documented (like eqn and tbl), or are sufficiently simple that the description in the UNIX Programmer's Manual is adequate explanation.

Hints for Preparing Documents

Most documents go through several versions (always more than you expected) before they are finally finished. Accordingly, you should do whatever possible to make the job of changing them easy.

First, when you do the purely mechanical operations of typing, type so that subsequent editing will be easy. Start each sentence on a new line. Make lines short, and break lines at natural places, such as after commas and semicolons, rather than randomly. Since most people change documents by rewriting phrases and adding, deleting and rearranging sentences, these precautions simplify any editing you have to do later.

Keep the individual files of a document down to modest size, perhaps ten to fifteen thousand characters. Larger files edit more slowly, and of course if you make a dumb mistake it's better to have clobbered a small file than a big one. Split into files at natural boundaries in the document, for the same reasons that you start each sentence on a new line.

The second aspect of making change easy is to not commit yourself to formatting details too early. One of the advantages of formatting packages like -ms is that they permit you to delay decisions to the last possible moment. Indeed, until a document is printed, it is not even decided whether it will be typeset or put on a line printer.

As a rule of thumb, for all but the most trivial jobs, you should type a document in terms of a set of requests like .PP, and then define them appropriately, either by using one of the canned packages (the better way) or by defining your own nroff and troff commands. As long as you have entered the text in some systematic way, it can always be cleaned up and re-formatted by a judicious combination of editing commands and request definitions.

IV. PROGRAMMING

There will be no attempt made to teach any of the programming languages available but a few words of advice are in order. One of the reasons why the UNIX system is a productive programming environment is that there is already a rich set of tools available, and facilities like pipes, I/O redirection, and the capabilities of the shell often make it possible to do a job by pasting together programs that already exist instead of writing from scratch.

The Shell

The pipe mechanism lets you fabricate quite complicated operations out of spare parts that already exist. For example, the first draft of the spell program was (roughly)

    cat ...      collect the files
    | tr ...     put each word on a new line
    | tr ...     delete punctuation, etc.
    | sort       into dictionary order
    | uniq       discard duplicates
    | comm       print words in text but not in dictionary

More pieces have been added subsequently, but this goes a long way for such a small effort.

The editor can be made to do things that would normally require special programs on other systems. For example, to list the first and last lines of each of a set of files, such as a book, you could laboriously type

    ed
    e chap1.1
    1p
    $p
    e chap1.2
    1p
    $p
    etc.

But you can do the job much more easily. One way is to type

    ls chap* >temp

to get the list of filenames into a file. Then edit this file to make the necessary series of editing commands (using the global commands of ed), and write it into script. Now the command

    ed <script

will produce the same output as the laborious hand typing. Alternately (and more easily), you can use the fact that the shell will perform loops, repeating a set of commands over and over again for a set of arguments:

    for i in chap*
    do
        ed $i <script
    done

This sets the shell variable i to each file name in turn, then does the command. You can type this command at the terminal, or put it in a file for later execution.

Programming the Shell

An option often overlooked by newcomers is that the shell is itself a programming language, with variables, control flow (if-else, while, for, case), subroutines, and interrupt handling. Since there are many building-block programs, you can sometimes avoid writing a new program merely by piecing together some of the building blocks with shell command files.

We will not go into any details here; examples and rules can be found in An Introduction to the UNIX Shell, by S. R. Bourne.

Programming in C

If you are undertaking anything substantial, C is the only reasonable choice of programming language: everything in the UNIX system is tuned to it. The system itself is written in C, as are most of the programs that run on it. It is also an easy language to use once you get started. C is introduced and fully described in The C Programming Language by B. W. Kernighan and D. M. Ritchie (Prentice-Hall, 1978). Several sections of the manual describe the system interfaces, that is, how you do I/O and similar functions. Read UNIX Programming for more complicated things.

Most input and output in C is best handled with the standard I/O library, which provides a set of I/O functions that exist in compatible form on most machines that have C compilers. In general, it's wisest to confine the system interactions in a program to the facilities provided by this library.

C programs that don't depend too much on special features of UNIX (such as pipes) can be moved to other computers that have C compilers. The list of such machines grows daily; in addition to the original PDP-11, it currently ...

... most formatting situations. If this specific package isn't available on your system, something similar probably is.
The most likely alternative is the PWB/UNIX macro package -mm; see your local guru if you use PWB/UNIX.

B. W. Kernighan and L. L. Cherry, “A System for Typesetting Mathematics,” Bell Laboratories Computing Science Tech. Rep. 17.

M. E. Lesk, “Tbl — A Program to Format Tables,” Bell Laboratories CSTR 49, 1976.

J. F. Ossanna, Jr., “NROFF/TROFF User's Manual,” Bell Laboratories CSTR 54, 1976. troff is the basic formatter used by -ms, eqn and tbl. The reference manual is indispensable if you are going to write or maintain these or similar programs. But start with:

B. W. Kernighan, “A TROFF Tutorial,” Bell Laboratories, 1976. An attempt to unravel the intricacies of troff.

Programming:

B. W. Kernighan and D. M. Ritchie, The C Programming Language, Prentice-Hall, 1978. Contains a tutorial introduction, complete discussions of all language features, and the reference manual.

B. W. Kernighan and D. M. Ritchie, “UNIX Programming,” Bell Laboratories, 1978. Describes how to interface with the system from C programs: I/O calls, signals, processes.

S. R. Bourne, “An Introduction to the UNIX Shell,” Bell Laboratories, 1978. An introduction and reference manual for the Version 7 shell. Mandatory reading if you intend to make effective use of the programming power of this shell.

S. C. Johnson, “Yacc — Yet Another Compiler-Compiler,” Bell Laboratories CSTR 32, 1978.

M. E. Lesk, “Lex — A Lexical Analyzer Generator,” Bell Laboratories CSTR 39, 1975.

S. C. Johnson, “Lint, a C Program Checker,” Bell Laboratories CSTR 65, 1977.

S. I. Feldman, “MAKE — A Program for Maintaining Computer Programs,” Bell Laboratories CSTR 57, 1977.

J. F. Maranzano and S. R. Bourne, “A Tutorial Introduction to ADB,” Bell Laboratories CSTR 62, 1977. An introduction to a powerful but complex debugging tool.

S. I. Feldman and P. J. Weinberger, “A Portable Fortran 77 Compiler,” Bell Laboratories, 1978. A full Fortran 77 for UNIX systems.
MAIL REFERENCE MANUAL

Kurt Shoens
Revised by Craig Leres
Version 2.18

1. Introduction

Mail provides a simple and friendly environment for sending and receiving mail. It divides incoming mail into its constituent messages and allows the user to deal with them in any order. In addition, it provides a set of ed-like commands for manipulating messages and sending mail. Mail offers the user simple editing capabilities to ease the composition of outgoing messages, as well as providing the ability to define and send to names which address groups of users. Finally, Mail is able to send and receive messages across such networks as the ARPANET, UUCP, and Berkeley network.

This document describes how to use the Mail program to send and receive messages. The reader is not assumed to be familiar with other message handling systems, but should be familiar with the UNIX shell, the text editor, and some of the common UNIX commands. “The UNIX Programmer's Manual,” “An Introduction to Csh,” and “Text Editing with Ex and Vi” can be consulted for more information on these topics.

Here is how messages are handled: the mail system accepts incoming messages for you from other people and collects them in a file, called your system mailbox. When you log in, the system notifies you if there are any messages waiting in your system mailbox. If you are a csh user, you will be notified when new mail arrives if you inform the shell of the location of your mailbox. On version 7 systems, your system mailbox is located in the directory /usr/spool/mail in a file with your login name. If your login name is “sam,” then you can make csh notify you of new mail by including the following line in your .cshrc file:

    set mail=/usr/spool/mail/sam

When you read your mail using Mail, it reads your system mailbox and separates that file into the individual messages that have been sent to you. You can then read, reply to, delete, or save these messages.
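To make the idea of one file holding many messages concrete, here is a minimal sketch in the Bourne shell. It rests on an assumption not stated above: that each message in the mailbox file begins with a line starting with the word From followed by a space, the sender, and the date. The sample mailbox and the function names count_messages and list_headers are invented for the illustration and are not part of Mail.

```shell
# Hypothetical sample mailbox in a scratch file; a real system mailbox
# would be the file named after your login in /usr/spool/mail.
mbox_demo=$(mktemp)
cat > "$mbox_demo" <<'EOF'
From root Wed Sep 21 09:21:45 1978
Subject: Tuition fees

Tuition fees are due next Wednesday.

From sam Tue Sep 20 22:55:10 1978

See you at lunch.
EOF

# Assumption: a line beginning "From " (with the space) starts a new message.
count_messages() {
    grep -c '^From ' "$1"
}

# Print a number, the sender, and the date for each message, numbering from 1.
list_headers() {
    grep '^From ' "$1" |
        awk '{ printf "%d %s %s %s %s %s\n", NR, $2, $3, $4, $5, $6 }'
}

n=$(count_messages "$mbox_demo")
headers=$(list_headers "$mbox_demo")
echo "$n messages"
echo "$headers"

# Clean up the scratch file.
rm "$mbox_demo"
```

A line inside the body of a letter that happens to begin with From and a space would be miscounted by this sketch, so treat it as an illustration only.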
Each message is marked with its author and the date they sent it.

2. Common usage

The Mail command has two distinct usages, according to whether one wants to send or receive mail. Sending mail is simple: to send a message to a user whose login name is, say, “root,” use the shell command:

    % Mail root

then type your message. When you reach the end of the message, type an EOT (control-d) at the beginning of a line, which will cause Mail to echo “EOT” and return you to the Shell. When the user you sent mail to next logs in, he will receive the message:

    You have mail.

to alert him to the existence of your message.

If, while you are composing the message, you decide that you do not wish to send it after all, you can abort the letter with a RUBOUT. Typing a single RUBOUT causes Mail to print

    (Interrupt -- one more to kill letter)

Typing a second RUBOUT causes Mail to save your partial letter on the file “dead.letter” in your home directory and abort the letter. Once you have sent mail to someone, there is no way to undo the act, so be careful.

The message your recipient reads will consist of the message you typed, preceded by a line telling who sent the message (your login name) and the date and time it was sent.

If you want to send the same message to several other people, you can list their login names on the command line. Thus,

    % Mail sam bob john
    Tuition fees are due next Friday. Don't forget!!
    <Control-d>
    EOT
    %

will send the reminder to sam, bob, and john.

If, when you log in, you see the message,

    You have mail.

you can read the mail by typing simply:

    % Mail

Mail will respond by typing its version number and date and then listing the messages you have waiting. Then it will type a prompt and await your command. The messages are assigned numbers starting with 1; you refer to the messages with these numbers. Mail keeps track of which messages are new (have been sent since you last read your mail) and read (have been read by you).
New messages have an N next to them in the header listing and old, but unread messages have a U next to them. Mail keeps track of new/old and read/unread messages by putting a header field called “Status” into your messages.

To look at a specific message, use the type command, which may be abbreviated to simply t. For example, if you had the following messages:

    N 1 root Wed Sep 21 09:21 "Tuition fees"
    N 2 sam Tue Sep 20 22:55

you could examine the first message by giving the command:

    type 1

which might cause Mail to respond with, for example:

    Message 1:
    From root Wed Sep 21 09:21:45 1978
    Subject: Tuition fees
    Status: R

    Tuition fees are due next Wednesday. Don't forget!!

Many Mail commands that operate on messages take a message number as an argument like the type command. For these commands, there is a notion of a current message. When you enter the Mail program, the current message is initially the first one. Thus, you can often omit the message number and use, for example,

    t

to type the current message. As a further shorthand, you can type a message by simply giving its message number. Hence,

    1

would type the first message.

Frequently, it is useful to read the messages in your mailbox in order, one after another. You can read the next message in Mail by simply typing a newline. As a special case, you can type a newline as your first command to Mail to type the first message.

If, after typing a message, you wish to immediately send a reply, you can do so with the reply command. Reply, like type, takes a message number as an argument. Mail then begins a message addressed to the user who sent you the message. You may then type in your letter in reply, followed by a <control-d> at the beginning of a line, as before. Mail will type EOT, then type the ampersand prompt to indicate its readiness to accept another command.
In our example, if, after typing the first message, you wished to reply to it, you might give the command:

    reply

Mail responds by typing:

    To: root
    Subject: Re: Tuition fees

and waiting for you to enter your letter. You are now in the message collection mode described at the beginning of this section and Mail will gather up your message up to a control-d. Note that it copies the subject header from the original message. This is useful in that correspondence about a particular matter will tend to retain the same subject heading, making it easy to recognize. If there are other header fields in the message, the information found will also be used. For example, if the letter had a “To:” header listing several recipients, Mail would arrange to send your reply to the same people as well. Similarly, if the original message contained a “Cc:” (carbon copies to) field, Mail would send your reply to those users, too. Mail is careful, though, not to send the message to you, even if you appear in the “To:” or “Cc:” field, unless you ask to be included explicitly. See section 4 for more details.

After typing in your letter, the dialog with Mail might look like the following:

    reply
    To: root
    Subject: Tuition fees

    Thanks for the reminder
    EOT
    &

The reply command is especially useful for sustaining extended conversations over the message system, with other “listening” users receiving copies of the conversation. The reply command can be abbreviated to r.

Sometimes you will receive a message that has been sent to several people and wish to reply only to the person who sent it. Reply with a capital R replies to a message, but sends a copy to the sender only.

If you wish, while reading your mail, to send a message to someone, but not as a reply to one of your messages, you can send the message directly with the mail command, which takes as arguments the names of the recipients you wish to send to.
For example, to send a message to “frank,” you would do:

    mail frank
    This is to confirm our meeting next Friday at 4.
    EOT
    &

The mail command can be abbreviated to m.

Normally, each message you receive is saved in the file mbox in your login directory at the time you leave Mail. Often, however, you will not want to save a particular message you have received because it is only of passing interest. To avoid saving a message in mbox you can delete it using the delete command. In our example,

    delete 1

will prevent Mail from saving message 1 (from root) in mbox. In addition to not saving deleted messages, Mail will not let you type them, either. The effect is to make the message disappear altogether, along with its number. The delete command can be abbreviated to simply d.

Many features of Mail can be tailored to your liking with the set command. The set command has two forms, depending on whether you are setting a binary option or a valued option. Binary options are either on or off. For example, the “ask” option informs Mail that each time you send a message, you want it to prompt you for a subject header, to be included in the message. To set the “ask” option, you would type

    set ask

Another useful Mail option is “hold.” Unless told otherwise, Mail moves the messages from your system mailbox to the file mbox in your home directory when you leave Mail. If you want Mail to keep your letters in the system mailbox instead, you can set the “hold” option.

Valued options are values which Mail uses to adapt to your tastes. For example, the “SHELL” option tells Mail which shell you like to use, and is specified by

    set SHELL=/bin/csh

for example. Note that no spaces are allowed in “SHELL=/bin/csh.” A complete list of the Mail options appears in section 5.

Another important valued option is “crt.” If you use a fast video terminal, you will find that when you print long messages, they fly by too quickly for you to read them.
With the “crt” option, you can make Mail print any message larger than a given number of lines by sending it through the paging program more. For example, most CRT users should do:

    set crt=24

to paginate messages that will not fit on their screens. More prints a screenful of information, then types --MORE--. Type a space to see the next screenful.

Another adaptation to user needs that Mail provides is that of aliases. An alias is simply a name which stands for one or more real user names. Mail sent to an alias is really sent to the list of real users associated with it. For example, an alias can be defined for the members of a project, so that you can send mail to the whole project by sending mail to just a single name. The alias command in Mail defines an alias. Suppose that the users in a project are named Sam, Sally, Steve, and Susan. To define an alias called “project” for them, you would use the Mail command:

    alias project sam sally steve susan

The alias command can also be used to provide a convenient name for someone whose user name is inconvenient. For example, if a user named “Bob Anderson” had the login name “anderson,” you might want to use:

    alias bob anderson

so that you could send mail to the shorter name, “bob.”

While the alias and set commands allow you to customize Mail, they have the drawback that they must be retyped each time you enter Mail. To make them more convenient to use, Mail always looks for two files when it is invoked. It first reads a system wide file “/usr/lib/Mail.rc,” then a user specific file, “.mailrc,” which is found in the user's home directory. The system wide file is maintained by the system administrator and contains set commands that are applicable to all users of the system. The “.mailrc” file is usually used by each user to set options the way he likes and define individual aliases.
For example, my .mailrc file looks like this:

    set ask nosave SHELL=/bin/csh

As you can see, it is possible to set many options in the same set command. The “nosave” option is described in section 5.

Mail aliasing is implemented at the system-wide level by the mail delivery system sendmail. These aliases are stored in the file /usr/lib/aliases and are accessible to all users of the system. The lines in /usr/lib/aliases are of the form:

    alias: name1, name2, name3

where alias is the mailing list name and name1, name2, and so on are the members of the list. Long lists can be continued onto the next line by starting the next line with a space or tab. Remember that you must execute the shell command newaliases after editing /usr/lib/aliases since the delivery system uses an indexed file created by newaliases.

We have seen that Mail can be invoked with command line arguments which are people to send the message to, or with no arguments to read mail. Specifying the -f flag on the command line causes Mail to read messages from a file other than your system mailbox. For example, if you have a collection of messages in the file “letters” you can use Mail to read them with:

    % Mail -f letters

You can use all the Mail commands described in this document to examine, modify, or delete messages from your “letters” file, which will be rewritten when you leave Mail with the quit command described below.

Since mail that you read is saved in the file mbox in your home directory by default, you can read mbox in your home directory by using simply

    % Mail -f

Normally, messages that you examine using the type command are saved in the file “mbox” in your home directory if you leave Mail with the quit command described below. If you wish to retain a message in your system mailbox you can use the preserve command to tell Mail to leave it there. The preserve command accepts a list of message numbers, just like type, and may be abbreviated to pre.
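Returning briefly to the /usr/lib/aliases format described above, here is a minimal sketch of how such a file can be read, including continuation lines. It is only an illustration under stated assumptions: the sample file contents and the function name expand_alias are invented here, the awk used must support user-defined functions and the -v option, and the sketch says nothing about how sendmail itself processes the file.

```shell
# Hypothetical sample aliases file in a scratch location (not /usr/lib/aliases).
aliases_demo=$(mktemp)
cat > "$aliases_demo" <<'EOF'
project: sam, sally,
    steve, susan
bob: anderson
EOF

# expand_alias NAME FILE: print the members of alias NAME, separated by spaces.
# A line that starts with a space or tab continues the previous entry.
expand_alias() {
    awk -v want="$1" '
        function emit(entry,   name, members) {
            name = entry
            sub(/:.*/, "", name)               # text before the colon
            if (name != want) return
            members = entry
            sub(/^[^:]*:/, "", members)        # text after the colon
            gsub(/[ \t]+/, "", members)        # drop blanks around the names
            gsub(/,/, " ", members)            # commas become single spaces
            print members
        }
        /^[ \t]/ { line = line $0; next }      # continuation line: append it
        { if (line != "") emit(line); line = $0 }
        END { if (line != "") emit(line) }
    ' "$2"
}

project_members=$(expand_alias project "$aliases_demo")
bob_target=$(expand_alias bob "$aliases_demo")
echo "project -> $project_members"
echo "bob -> $bob_target"

# Clean up the scratch file.
rm "$aliases_demo"
```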
Messages in your system mailbox that you do not examine are normally retained in your system mailbox automatically. If you wish to have such a message saved in mbox without reading it, you may use the mbox command. For example,

    mbox 2

in our example would cause the second message (from sam) to be saved in mbox when the quit command is executed. Mbox is also the way to direct messages to your mbox file if you have set the “hold” option described above. Mbox can be abbreviated to mb.

When you have perused all the messages of interest, you can leave Mail with the quit command, which saves the messages you have typed but not deleted in the file mbox in your login directory. Deleted messages are discarded irretrievably, and messages left untouched are preserved in your system mailbox so that you will see them the next time you type:

    % Mail

The quit command can be abbreviated to simply q.

If you wish for some reason to leave Mail quickly without altering either your system mailbox or mbox, you can type the x command (short for exit), which will immediately return you to the Shell without changing anything.

If, instead, you want to execute a Shell command without leaving Mail, you can type the command preceded by an exclamation point, just as in the text editor. Thus, for instance:

    !date

will print the current date without leaving Mail.

Finally, the help command is available to print out a brief summary of the Mail commands, using only the single character command abbreviations.

3. Maintaining folders

Mail includes a simple facility for maintaining groups of messages together in folders. This section describes this facility.

To use the folder facility, you must tell Mail where you wish to keep your folders. Each folder of messages will be a single file. For convenience, all of your folders are kept in a single directory of your choosing.
To tell Mail where your folder directory is, put a line of the form

    set folder=letters

in your .mailrc file. If, as in the example above, your folder directory does not begin with a ‘/,’ Mail will assume that your folder directory is to be found starting from your home directory. Thus, if your home directory is /usr/person the above example told Mail to find your folder directory in /usr/person/letters.

Anywhere a file name is expected, you can use a folder name, preceded with ‘+.’ For example, to put a message into a folder with the save command, you can use:

    save +classwork

to save the current message in the classwork folder. If the classwork folder does not yet exist, it will be created. Note that messages which are saved with the save command are automatically removed from your system mailbox.

In order to make a copy of a message in a folder without causing that message to be removed from your system mailbox, use the copy command, which is identical in all other respects to the save command. For example,

    copy +classwork

copies the current message into the classwork folder and leaves a copy in your system mailbox.

The folder command can be used to direct Mail to the contents of a different folder. For example,

    folder +classwork

directs Mail to read the contents of the classwork folder. All of the commands that you can use on your system mailbox are also applicable to folders, including type, delete, and reply. To inquire which folder you are currently editing, use simply:

    folder

To list your current set of folders, use the folders command.

To start Mail reading one of your folders, you can use the -f option described in section 2. For example:

    % Mail -f +classwork

will cause Mail to read your classwork folder without looking at your system mailbox.

4. More about sending mail

4.1.
Tilde escapes

While typing in a message to be sent to others, it is often useful to be able to invoke the text editor on the partial message, print the message, execute a shell command, or do some other auxiliary function. Mail provides these capabilities through tilde escapes, which consist of a tilde (~) at the beginning of a line, followed by a single character which indicates the function to be performed. For example, to print the text of the message so far, use:

    ~p

which will print a line of dashes, the recipients of your message, and the text of the message so far. Since Mail requires two consecutive RUBOUTs to abort a letter, you can use a single RUBOUT to abort the output of ~p or any other ~ escape without killing your letter.

If you are dissatisfied with the message as it stands, you can invoke the text editor on it using the escape

    ~e

which causes the message to be copied into a temporary file and an instance of the editor to be spawned. After modifying the message to your satisfaction, write it out and quit the editor. Mail will respond by typing

    (continue)

after which you may continue typing text which will be appended to your message, or type <control-d> to end the message. A standard text editor is provided by Mail. You can override this default by setting the valued option “EDITOR” to something else. For example, you might prefer:

    set EDITOR=/usr/ucb/ex

Many systems offer a screen editor as an alternative to the standard text editor, such as the vi editor from UC Berkeley. To use the screen, or visual editor, on your current message, you can use the escape

    ~v

~v works like ~e, except that the screen editor is invoked instead. A default screen editor is defined by Mail. If it does not suit you, you can set the valued option “VISUAL” to the path name of a different editor.
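The general shape of a tilde escape, as described at the start of this section, can be illustrated with a small Python sketch. It is only an illustration and is not taken from Mail itself: the function name parse_escape and its return values are hypothetical, and treating a line that holds nothing but the escape character as ordinary text is an assumption. The doubled escape character and the “escape” option that it handles are covered later in this section.

```python
def parse_escape(line, escape="~"):
    """Classify one line typed while composing a message.

    Returns ("text", line) for ordinary text, or
    ("escape", char, argument) for an escape such as ~p or ~r filename.
    """
    if not line.startswith(escape):
        return ("text", line)
    rest = line[len(escape):]
    if rest.startswith(escape):
        # A doubled escape character sends a literal line beginning with it.
        return ("text", rest)
    if not rest:
        # Assumption: a lone escape character is kept as ordinary text.
        return ("text", line)
    # The single character after the escape selects the function;
    # anything after it is that function's argument.
    return ("escape", rest[0], rest[1:].strip())


print(parse_escape("~p"))
print(parse_escape("~r notes"))
print(parse_escape("~~This line begins with a tilde."))
print(parse_escape("]p", escape="]"))
```

Passing a different escape argument mirrors the effect of the “escape” option: once the escape character is changed, a leading tilde is plain text again.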
It is often useful to be able to include the contents of some file in your message; the escape

    ~r filename

is provided for this purpose, and causes the named file to be appended to your current message. Mail complains if the file doesn’t exist or can’t be read. If the read is successful, the number of lines and characters appended to your message is printed, after which you may continue appending text. The filename may contain shell metacharacters like * and ? which are expanded according to the conventions of your shell.

As a special case of ~r, the escape

    ~d

reads in the file “dead.letter” in your home directory. This is often useful since Mail copies the text of your message there when you abort a message with RUBOUT.

To save the current text of your message on a file you may use the

    ~w filename

escape. Mail will print out the number of lines and characters written to the file, after which you may continue appending text to your message. Shell metacharacters may be used in the filename, as in ~r, and are expanded with the conventions of your shell.

If you are sending mail from within Mail’s command mode you can read a message sent to you into the message you are constructing with the escape:

    ~m 4

which will read message 4 into the current message, shifted right by one tab stop. You can name any non-deleted message, or list of messages. Messages can also be forwarded without shifting by a tab stop with ~f. This is the usual way to forward a message.

If, in the process of composing a message, you decide to add additional people to the list of message recipients, you can do so with the escape

    ~t name1 name2 ...

You may name as few or many additional recipients as you wish. Note that the users originally on the recipient list will still receive the message; you cannot remove someone from the recipient list with ~t.
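The difference between the ~m and ~f escapes described above comes down to whether each line of the included message is shifted right by one tab stop. A minimal Python sketch of that difference follows; the function name include_message and the sample text are hypothetical, and the sketch treats a message as plain text without considering its header lines.

```python
def include_message(draft, message, shift=True):
    """Append message to draft.

    shift=True mimics ~m (each line shifted right by one tab stop);
    shift=False mimics ~f (lines included unchanged).
    """
    lines = message.splitlines()
    if shift:
        lines = ["\t" + line for line in lines]
    return draft + "".join(line + "\n" for line in lines)


draft = "See below.\n"
original = "Lunch at noon?\nSam\n"
print(include_message(draft, original), end="")
print(include_message(draft, original, shift=False), end="")
```

The shifted form makes quoted text stand out from your own reply, while the unshifted form leaves a forwarded message looking as it did when it arrived.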
If you wish, you can associate a subject with your message by using the escape

    ~s Arbitrary string of text

which replaces any previous subject with “Arbitrary string of text.” The subject, if given, is sent near the top of the message prefixed with “Subject:” You can see what the message will look like by using ~p.

For political reasons, one occasionally prefers to list certain people as recipients of carbon copies of a message rather than direct recipients. The escape

    ~c name1 name2 ...

adds the named people to the “Cc:” list, similar to ~t. Again, you can execute ~p to see what the message will look like.

The recipients of the message together constitute the “To:” field, the subject the “Subject:” field, and the carbon copies the “Cc:” field. If you wish to edit these in ways impossible with the ~t, ~s, and ~c escapes, you can use the escape

    ~h

which prints “To:” followed by the current list of recipients and leaves the cursor (or printhead) at the end of the line. If you type in ordinary characters, they are appended to the end of the current list of recipients. You can also use your erase character to erase back into the list of recipients, or your kill character to erase them altogether. Thus, for example, if your erase and kill characters are the standard # and @ symbols,

    ~h
    To: root kurt####bill

would change the initial recipients “root kurt” to “root bill.” When you type a newline, Mail advances to the “Subject:” field, where the same rules apply. Another newline brings you to the “Cc:” field, which may be edited in the same fashion. Another newline leaves you appending text to the end of your message. You can use ~p to print the current text of the header fields and the body of the message.

To effect a temporary escape to the shell, the escape

    ~!command

is used, which executes command and returns you to mailing mode without altering the text of your message.
If you wish, instead, to filter the body of your message through a shell command, then you can use

    ~|command

which pipes your message through the command and uses the output as the new text of your message. If the command produces no output, Mail assumes that something is amiss and retains the old version of your message. A frequently-used filter is the command fmt, designed to format outgoing mail.

To effect a temporary escape to Mail command mode instead, you can use the

    ~:Mail command

escape. This is especially useful for retyping the message you are replying to, using, for example:

    ~:t

It is also useful for setting options and modifying aliases.

If you wish (for some reason) to send a message that contains a line beginning with a tilde, you must double it. Thus, for example,

    ~~This line begins with a tilde.

sends the line

    ~This line begins with a tilde.

Finally, the escape

    ~?

prints out a brief summary of the available tilde escapes.

On some terminals (particularly ones with no lower case) tildes are difficult to type. Mail allows you to change the escape character with the “escape” option. For example, I set

    set escape=]

and use a right bracket instead of a tilde. If I ever need to send a line beginning with right bracket, I double it, just as for ~. Changing the escape character removes the special meaning of ~.

4.2. Network access

This section describes how to send mail to people on other machines. Recall that sending to a plain login name sends mail to that person on your machine. If your machine is directly (or sometimes, even, indirectly) connected to the Arpanet, you can send messages to people on the Arpanet using a name of the form

    name@host

where name is the login name of the person you’re trying to reach and host is the name of the machine where he logs in on the Arpanet.
If your recipient logs in on a machine connected to yours by UUCP (the Bell Laboratories supplied network that communicates over telephone lines), sending mail to him is a bit more complicated. You must know the list of machines through which your message must travel to arrive at his site. So, if his machine is directly connected to yours, you can send mail to him using the syntax:

    host!name

where, again, host is the name of his machine and name is his login name. If your message must go through an intermediate machine first, you must use the syntax:

    intermediate!host!name

and so on. It is actually a feature of UUCP that the map of all the systems in the network is not known anywhere (except where people decide to write it down for convenience). Talk to your system administrator about the machines connected to your site.

If you want to send a message to a recipient on the Berkeley network (Berknet), you use the syntax:

    host:name

where host is his machine name and name is his login name. Unlike UUCP, you need not know the names of the intermediate machines.

When you use the reply command to respond to a letter, there is a problem of figuring out the names of the users in the “To:” and “Cc:” lists relative to the current machine. If the original letter was sent to you by someone on the local machine, then this problem does not exist, but if the message came from a remote machine, the problem must be dealt with. Mail uses a heuristic to build the correct name for each user relative to the local machine. So, when you reply to remote mail, the names in the “To:” and “Cc:” lists may change somewhat.

4.3. Special recipients

As described previously, you can send mail to either user names or alias names. It is also possible to send messages directly to files or to programs, using special conventions.
If a recipient name has a ‘/’ in it or begins with a ‘+’, it is assumed to be the path name of a file into which to send the message. If the file already exists, the message is appended to the end of the file. If you want to name a file in your current directory (i.e., one for which a ‘/’ would not usually be needed) you can precede the name with ‘./’ So, to send mail to the file “memo” in the current directory, you can give the command:

    % Mail ./memo

If the name begins with a ‘+,’ it is expanded into the full path name of the folder name in your folder directory. This ability to send mail to files can be used for a variety of purposes, such as maintaining a journal and keeping a record of mail sent to a certain group of users. The second example can be done automatically by including the full pathname of the record file in the alias command for the group. Using our previous alias example, you might give the command:

    alias project sam sally steve susan /usr/project/mail_record

Then, all mail sent to “project” would be saved on the file “/usr/project/mail_record” as well as being sent to the members of the project. This file can be examined using Mail -f.

It is sometimes useful to send mail directly to a program, for example one might write a project billboard program and want to access it using Mail. To send messages to the billboard program, one can send mail to the special name ‘|billboard’ for example. Mail treats recipient names that begin with a ‘|’ as a program to send the mail to. An alias can be set up to reference a ‘|’ prefaced name if desired. Caveats: the shell treats ‘|’ specially, so it must be quoted on the command line. Also, the ‘|program’ must be presented as a single argument to mail. The safest course is to surround the entire name with double quotes. This also applies to usage in the alias command. For example, if we wanted to alias ‘rmsgs’ to ‘rmsgs -s’ we would need to say:

    alias rmsgs "| rmsgs -s"

5.
Additional features

This section describes some additional commands of use for reading your mail, setting options, and handling lists of messages.

5.1. Message lists

Several Mail commands accept a list of messages as an argument. Along with type and delete, described in section 2, there is the from command, which prints the message headers associated with the message list passed to it. The from command is particularly useful in conjunction with some of the message list features described below.

A message list consists of a list of message numbers, ranges, and names, separated by spaces or tabs. Message numbers may be either decimal numbers, which directly specify messages, or one of the special characters “^” “.” or “$” to specify the first relevant, current, or last relevant message, respectively. Relevant here means, for most commands, “not deleted” and “deleted” for the undelete command.

A range of messages consists of two message numbers (of the form described in the previous paragraph) separated by a dash. Thus, to print the first four messages, use

    type 1-4

and to print all the messages from the current message to the last message, use

    type .-$

A name is a user name. The user names given in the message list are collected together and each message selected by other means is checked to make sure it was sent by one of the named users. If the message list consists entirely of user names, then every message sent by one of those users that is relevant (in the sense described earlier) is selected. Thus, to print every message sent to you by “root,” do

    type root

As a shorthand notation, you can specify simply “*” to get every relevant (same sense) message. Thus,

    type *

prints all undeleted messages,

    delete *

deletes all undeleted messages, and

    undelete *

undeletes all deleted messages.

You can search for the presence of a word in subject lines with /.
For example, to print the headers of all messages that contain the word “PASCAL,” do:

    from /pascal

Note that subject searching ignores upper/lower case differences.

5.2. List of commands

This section describes all the Mail commands available when receiving mail.

!
    Used to preface a command to be executed by the shell.

-
    The - command goes to the previous message and prints it. The - command may be given a decimal number n as an argument, in which case the nth previous message is gone to and printed.

Print
    Like print, but also print out ignored header fields. See also print and ignore.

Reply
    Note the capital R in the name. Frame a reply to one or more messages. The reply (or replies if you are using this on multiple messages) will be sent ONLY to the person who sent you the message (respectively, the set of people who sent the messages you are replying to). You can add people using the ~t and ~c tilde escapes. The subject in your reply is formed by prefacing the subject in the original message with “Re:” unless it already began thus. If the original message included a “reply-to” header field, the reply will go only to the recipient named by “reply-to.” You type in your message using the same conventions available to you through the mail command. The Reply command is especially useful for replying to messages that were sent to enormous distribution groups when you really just want to send a message to the originator. Use it often.

Type
    Identical to the Print command.

alias
    Define a name to stand for a set of other names. This is used when you want to send messages to a certain group of people and want to avoid retyping their names. For example

        alias project john sue willie kathryn

    creates an alias project which expands to the four people John, Sue, Willie, and Kathryn.
alternates
    If you have accounts on several machines, you may find it convenient to use the /usr/lib/aliases on all the machines except one to direct your mail to a single account. The alternates command is used to inform Mail that each of these other addresses is really you. Alternates takes a list of user names and remembers that they are all actually you. When you reply to messages that were sent to one of these alternate names, Mail will not bother to send a copy of the message to this other address (which would simply be directed back to you by the alias mechanism). If alternates is given no argument, it lists the current set of alternate names. Alternates is usually used in the .mailrc file.

chdir
    The chdir command allows you to change your current directory. Chdir takes a single argument, which is taken to be the pathname of the directory to change to. If no argument is given, chdir changes to your home directory.

copy
    The copy command does the same thing that save does, except that it does not mark the messages it is used on for deletion when you quit.

delete
    Deletes a list of messages. Deleted messages can be reclaimed with the undelete command.

dt
    The dt command deletes the current message and prints the next message. It is useful for quickly reading and disposing of mail.

edit
    To edit individual messages using the text editor, the edit command is provided. The edit command takes a list of messages as described under the type command and processes each by writing it into the file Messagex where x is the message number being edited and executing the text editor on it. When you have edited the message to your satisfaction, write it out and quit the editor.

else
    Marks the end of the then-part of an if statement and the beginning of the part to take effect if the condition of the if statement is false.

endif
    Marks the end of an if statement.

exit
    Leave Mail without updating the system mailbox or the file you were reading.
    Thus, if you accidentally delete several messages, you can use exit to avoid scrambling your mailbox.

file
    The same as folder.

folders
    List the names of the folders in your folder directory.

folder
    The folder command switches to a new mail file or folder. With no arguments, it tells you which file you are currently reading. If you give it an argument, it will write out changes (such as deletions) you have made in the current file and read the new file. Some special conventions are recognized for the name:

        Name       Meaning
        #          Previous file read
        %          Your system mailbox
        %name      Name’s system mailbox
        &          Your ~/mbox file
        +folder    A file in your folder directory

from
    The from command takes a list of messages and prints out the header lines for each one; hence

        from joe

    is the easy way to display all the message headers from “joe.”

headers
    When you start up Mail to read your mail, it lists the message headers that you have. These headers tell you who each message is from, when they were sent, how many lines and characters each message is, and the “Subject:” header field of each message, if present. In addition, Mail tags the message header of each message that has been the object of the preserve command with a “P.” Messages that have been saved or written are flagged with a “*.” Finally, deleted messages are not printed at all. If you wish to reprint the current list of message headers, you can do so with the headers command. The headers command (and thus the initial header listing) only lists the first so many message headers. The number of headers listed depends on the speed of your terminal. This can be overridden by specifying the number of headers you want with the window option. Mail maintains a notion of the current “window” into your messages for the purposes of printing headers. Use the z command to move forward and back a window.
    You can move Mail’s notion of the current window directly to a particular message by using, for example,

        headers 40

    to move Mail’s attention to the messages around message 40. The headers command can be abbreviated to h.

help
    Print a brief and usually out of date help message about the commands in Mail. Refer to this manual instead.

hold
    Arrange to hold a list of messages in the system mailbox, instead of moving them to the file mbox in your home directory. If you set the binary option hold, this will happen by default.

if
    Commands in your “.mailrc” file can be executed conditionally depending on whether you are sending or receiving mail with the if command. For example, you can do:

        if receive
                commands...
        endif

    An else form is also available:

        if send
                commands...
        else
                commands...
        endif

    Note that the only allowed conditions are receive and send.

ignore
    Add the list of header fields named to the ignore list. Header fields in the ignore list are not printed on your terminal when you print a message. This allows you to suppress printing of certain machine-generated header fields, such as Via, which are not usually of interest. The Type and Print commands can be used to print a message in its entirety, including ignored fields. If ignore is executed with no arguments, it lists the current set of ignored fields.

list
    List the valid Mail commands.

mail
    Send mail to one or more people. If you have the ask option set, Mail will prompt you for a subject to your message. Then you can type in your message, using tilde escapes as described in section 4 to edit, print, or modify your message. To signal your satisfaction with the message and send it, type control-d at the beginning of a line, or a . alone on a line if you set the option dot. To abort the message, type two interrupt characters (RUBOUT by default) in a row or use the ~q escape.

mbox
    Indicate that a list of messages be sent to mbox in your home directory when you quit.
    This is the default action for messages if you do not have the hold option set.

next
    The next command goes to the next message and types it. If given a message list, next goes to the first such message and types it. Thus,

        next root

    goes to the next message sent by “root” and types it. The next command can be abbreviated to simply a newline, which means that one can go to and type a message by simply giving its message number or one of the magic characters “^” “.” or “$”. Thus,

        .

    prints the current message and

        4

    prints message 4, as described previously.

preserve
    Same as hold. Cause a list of messages to be held in your system mailbox when you quit.

quit
    Leave Mail and update the file, folder, or system mailbox you were reading. Messages that you have examined are marked as “read” and messages that existed when you started are marked as “old.” If you were editing your system mailbox and if you have set the binary option hold, all messages which have not been deleted, saved, or mboxed will be retained in your system mailbox. If you were editing your system mailbox and you did not have hold set, all messages which have not been deleted, saved, or preserved will be moved to the file mbox in your home directory.

reply
    Frame a reply to a single message. The reply will be sent to the person who sent you the message to which you are replying, plus all the people who received the original message, except you. You can add people using the ~t and ~c tilde escapes. The subject in your reply is formed by prefacing the subject in the original message with “Re:” unless it already began thus. If the original message included a “reply-to” header field, the reply will go only to the recipient named by “reply-to.” You type in your message using the same conventions available to you through the mail command.

save
    It is often useful to be able to save messages on related topics in a file. The save command gives you the ability to do this.
    The save command takes as argument a list of message numbers, followed by the name of the file on which to save the messages. The messages are appended to the named file, thus allowing one to keep several messages in the file, stored in the order they were put there. The save command can be abbreviated to s. An example of the save command relative to our running example is:

        s 1 2 tuitionmail

    Saved messages are not automatically saved in mbox at quit time, nor are they selected by the next command described above, unless explicitly specified.

set
    Set an option or give an option a value. Used to customize Mail. Section 5.3 contains a list of the options. Options can be binary, in which case they are on or off, or valued. To set a binary option option on, do

        set option

    To give the valued option option the value value, do

        set option=value

    Several options can be specified in a single set command.

shell
    The shell command allows you to escape to the shell. Shell invokes an interactive shell and allows you to type commands to it. When you leave the shell, you will return to Mail. The shell used is a default assumed by Mail; you can override this default by setting the valued option “SHELL,” e.g.:

        set SHELL=/bin/csh

source
    The source command reads Mail commands from a file. It is useful when you are trying to fix your “.mailrc” file and you need to re-read it.

top
    The top command takes a message list and prints the first five lines of each addressed message. It may be abbreviated to to. If you wish, you can change the number of lines that top prints out by setting the valued option “toplines.” On a CRT terminal,

        set toplines=10

    might be preferred.

type
    Print a list of messages on your terminal. If you have set the option crt to a number and the total number of lines in the messages you are printing exceeds that specified by crt, the messages will be printed by a terminal paging program such as more.
undelete
    The undelete command causes a message that had been deleted previously to regain its initial status. Only messages that have been deleted may be undeleted. This command may be abbreviated to u.

unset
    Reverse the action of setting a binary or valued option.

visual
    It is often useful to be able to invoke one of two editors, based on the type of terminal one is using. To invoke a display oriented editor, you can use the visual command. The operation of the visual command is otherwise identical to that of the edit command. Both the edit and visual commands assume some default text editors. These default editors can be overridden by the valued options “EDITOR” and “VISUAL” for the standard and screen editors. You might want to do:

        set EDITOR=/usr/ucb/ex VISUAL=/usr/ucb/vi

write
    The save command always writes the entire message, including the headers, into the file. If you want to write just the message itself, you can use the write command. The write command has the same syntax as the save command, and can be abbreviated to simply w. Thus, we could write the second message by doing:

        w 2 file.c

    As suggested by this example, the write command is useful for such tasks as sending and receiving source program text over the message system.

z
    Mail presents message headers in windowfuls as described under the headers command. You can move Mail’s attention forward to the next window by giving the

        z+

    command. Analogously, you can move to the previous window with:

        z-

5.3. Custom options

Throughout this manual, we have seen examples of binary and valued options. This section describes each of the options in alphabetical order, including some that you have not seen yet. To avoid confusion, please note that the options are either all lower case letters or all upper case letters. When I start a sentence such as: “Ask” causes Mail to prompt you for a subject header, I am only capitalizing “ask” as a courtesy to English.
EDITOR
    The valued option “EDITOR” defines the pathname of the text editor to be used in the edit command and ~e. If not defined, a standard editor is used.

SHELL
    The valued option “SHELL” gives the path name of your shell. This shell is used for the ! command and ~! escape. In addition, this shell expands file names with shell metacharacters like * and ? in them.

VISUAL
    The valued option “VISUAL” defines the pathname of your screen editor for use in the visual command and ~v escape. A standard screen editor is used if you do not define one.

append
    The “append” option is binary and causes messages saved in mbox to be appended to the end rather than prepended. Normally, Mail will save messages to mbox in the same order that the system puts messages in your system mailbox. By setting “append,” you are requesting that mbox be appended to regardless. It is in any event quicker to append.

ask
    “Ask” is a binary option which causes Mail to prompt you for the subject of each message you send. If you respond with simply a newline, no subject field will be sent.

askcc
    “Askcc” is a binary option which causes you to be prompted for additional carbon copy recipients at the end of each message. Responding with a newline shows your satisfaction with the current list.

autoprint
    “Autoprint” is a binary option which causes the delete command to behave like dp; thus, after deleting a message, the next one will be typed automatically. This is useful for quickly scanning and deleting messages in your mailbox.

debug
    The binary option “debug” causes debugging information to be displayed. Use of this option is the same as using the -d command line flag.

dot
    “Dot” is a binary option which, if set, causes Mail to interpret a period alone on a line as the terminator of a message you are sending.
escape
To allow you to change the escape character used when sending mail, you can set the valued option “escape.” Only the first character of the “escape” option is used, and it must be doubled if it is to appear as the first character of a line of your message. If you change your escape character, then ~ loses all its special meaning, and need no longer be doubled at the beginning of a line.

folder
The name of the directory to use for storing folders of messages. If this name begins with a ‘/’, Mail considers it to be an absolute pathname; otherwise, the folder directory is found relative to your home directory.

hold
The binary option “hold” causes messages that have been read but not manually dealt with to be held in the system mailbox. This prevents such messages from being automatically swept into your mbox.

ignore
The binary option “ignore” causes RUBOUT characters from your terminal to be ignored and echoed as @’s while you are sending mail. RUBOUT characters retain their original meaning in Mail command mode. Setting the “ignore” option is equivalent to supplying the -i flag on the command line as described in section 6.

ignoreeof
An option related to “dot” is “ignoreeof” which makes Mail refuse to accept a control-d as the end of a message. “Ignoreeof” also applies to Mail command mode.

keep
The “keep” option causes Mail to truncate your system mailbox instead of deleting it when it is empty. This is useful if you elect to protect your mailbox, which you would do with the shell command:

    chmod 600 /usr/spool/mail/yourname

where yourname is your login name. If you do not do this, anyone can probably read your mail, although people usually don’t.

keepsave
When you save a message, Mail usually discards it when you quit. To retain all saved messages, set the “keepsave” option.

metoo
When sending mail to an alias, Mail makes sure that if you are included in the alias, that mail will not be sent to you. This is useful if a single alias is being used by all members of the group. If, however, you wish to receive a copy of all the messages you send to the alias, you can set the binary option “metoo.”

noheader
The binary option “noheader” suppresses the printing of the version and headers when Mail is first invoked. Setting this option is the same as using -N on the command line.

nosave
Normally, when you abort a message with two RUBOUTs, Mail copies the partial letter to the file “dead.letter” in your home directory. Setting the binary option “nosave” prevents this.

quiet
The binary option “quiet” suppresses the printing of the version when Mail is first invoked, as well as the message-number line (for example, “Message 4.”) printed by the type command.

record
If you love to keep records, then the valued option “record” can be set to the name of a file to save your outgoing mail. Each new message you send is appended to the end of the file.

screen
When Mail initially prints the message headers, it determines the number to print by looking at the speed of your terminal. The faster your terminal, the more it prints. The valued option “screen” overrides this calculation and specifies how many message headers you want printed. This number is also used for scrolling with the z command.

sendmail
To use an alternate delivery system, set the “sendmail” option to the full pathname of the program to use. Note: this is not for everyone! Most people should use the default delivery system.

toplines
The valued option “toplines” defines the number of lines that the “top” command will print out instead of the default five lines.

verbose
The binary option “verbose” causes Mail to invoke sendmail with the -v flag, which causes it to go into verbose mode and announce expansion of aliases, etc. Setting the “verbose” option is equivalent to invoking Mail with the -v flag as described in section 6.

6. Command line options

This section describes command line options for Mail and what they are used for.
-N
Suppress the initial printing of headers.

-d
Turn on debugging information. Not of general interest.

-f file
Show the messages in file instead of your system mailbox. If file is omitted, Mail reads mbox in your home directory.

-i
Ignore tty interrupt signals. Useful on noisy phone lines, which generate spurious RUBOUT or DELETE characters. It’s usually more effective to change your interrupt character to control-c, for which see the stty shell command.

-n
Inhibit reading of /usr/lib/Mail.rc. Not generally useful, since /usr/lib/Mail.rc is usually empty.

-s string
Used for sending mail. String is used as the subject of the message being composed. If string contains blanks, you must surround it with quote marks.

-u name
Read name’s mail instead of your own. Unwitting others often neglect to protect their mailboxes, but discretion is advised. Essentially, -u user is a shorthand way of doing -f /usr/spool/mail/user.

-v
Use the -v flag when invoking sendmail. This feature may also be enabled by setting the option “verbose”.

The following command line flags are also recognized, but are intended for use by programs invoking Mail and not for people.

-T file
Arrange to print on file the contents of the article-id fields of all messages that were either read or deleted. -T is for the readnews program and should NOT be used for reading your mail.

-h number
Pass on hop count information. Mail will take the number, increment it, and pass it with -h to the mail delivery system. -h only has effect when sending mail and is used for network mail forwarding.

-r name
Used for network mail forwarding: interpret name as the sender of the message. The name and -r are simply sent along to the mail delivery system. Also, Mail will wait for the message to be sent and return the exit status. Also restricts formatting of message.

Note that -h and -r, which are for network mail forwarding, are not used in practice since mail forwarding is now handled separately. They may disappear soon.
7. Format of messages

This section describes the format of messages. Messages begin with a from line, which consists of the word “From” followed by a user name, followed by anything, followed by a date in the format returned by the ctime library routine described in section 3 of the Unix Programmer’s Manual. A possible ctime format date is:

    Tue Dec 1 10:58:23 1981

The ctime date may be optionally followed by a single space and a time zone indication, which should be three capital letters, such as PDT.

Following the from line are zero or more header field lines. Each header field line is of the form:

    name: information

Name can be anything, but only certain header fields are recognized as having any meaning. The recognized header fields are: article-id, bcc, cc, from, reply-to, sender, subject, and to. Other header fields are also significant to other systems; see, for example, the current Arpanet message standard for much more on this topic. A header field can be continued onto following lines by making the first character on the following line a space or tab character.

If any headers are present, they must be followed by a blank line. The part that follows is called the body of the message, and must be ASCII text, not containing null characters. Each line in the message body must be terminated with an ASCII newline character and no line may be longer than 512 characters. If binary data must be passed through the mail system, it is suggested that this data be encoded in a system which encodes six bits into a printable character. For example, one could use the upper and lower case letters, the digits, and the characters comma and period to make up the 64 characters. Then, one can send a 16-bit binary number as three characters. These characters should be packed into lines, preferably lines about 70 characters long as long lines are transmitted more efficiently.
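The six-bit encoding suggested above can be sketched in a few lines of Python. This is only an illustration, not part of Mail: the function names `encode16` and `decode16` and the ordering of the 64-character alphabet are assumptions of this sketch, since the manual names the characters but not their order.

```python
import string

# 64 printable characters as suggested in the text: upper and lower case
# letters, digits, comma and period. The ordering is an assumption.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + ",."


def encode16(value):
    """Encode a 16-bit unsigned integer as three printable characters."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value must fit in 16 bits")
    # Three 6-bit groups carry 18 bits, so the leading group uses only 4.
    return "".join(ALPHABET[(value >> shift) & 0x3F] for shift in (12, 6, 0))


def decode16(text):
    """Inverse of encode16 (hypothetical helper for this sketch)."""
    if len(text) != 3:
        raise ValueError("expected exactly three characters")
    value = 0
    for ch in text:
        value = (value << 6) | ALPHABET.index(ch)
    return value
```

Three characters carry 18 bits, which is why a 16-bit number fits in three characters with two bits to spare.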
The message delivery system always adds a blank line to the end of each message. This blank line must not be deleted.

The UUCP message delivery system sometimes adds a blank line to the end of a message each time it is forwarded through a machine.

It should be noted that some network transport protocols enforce limits to the lengths of messages.

8. Glossary

This section contains the definitions of a few phrases peculiar to Mail.

alias
An alternative name for a person or list of people.

flag
An option, given on the command line of Mail, prefaced with a -. For example, -f is a flag.

header field
At the beginning of a message, a line which contains information that is part of the structure of the message. Popular header fields include to, cc, and subject.

mail
A collection of messages. Often used in the phrase, “Have you read your mail?”

mailbox
The place where your mail is stored, typically in the directory /usr/spool/mail.

message
A single letter from someone, initially stored in your mailbox.

message list
A string used in Mail command mode to describe a sequence of messages.

option
A piece of special purpose information used to tailor Mail to your taste. Options are specified with the set command.

9. Summary of commands, options, and escapes

This section gives a quick summary of the Mail commands, binary and valued options, and tilde escapes.

The following table describes the commands:

    Command     Description
    !           Single command escape to shell
    -           Back up to previous message
    Print       Type message with ignored fields
    Reply       Reply to author of message only
    Type        Type message with ignored fields
    alias       Define an alias as a set of user names
    alternates  List other names you are known by
    chdir       Change working directory, home by default
    copy        Copy a message to a file or folder
    delete      Delete a list of messages
    dt          Delete current message, type next message
    endif       End of conditional statement; see if
    edit        Edit a list of messages
    else        Start of else part of conditional; see if
    exit        Leave mail without changing anything
    file        Interrogate/change current mail file
    folder      Same as file
    folders     List the folders in your folder directory
    from        List headers of a list of messages
    headers     List current window of messages
    help        Print brief summary of Mail commands
    hold        Same as preserve
    if          Conditional execution of Mail commands
    ignore      Set/examine list of ignored header fields
    list        List valid Mail commands
    local       List other names for the local host
    mail        Send mail to specified names
    mbox        Arrange to save a list of messages in mbox
    next        Go to next message and type it
    preserve    Arrange to leave list of messages in system mailbox
    quit        Leave Mail; update system mailbox, mbox as appropriate
    reply       Compose a reply to a message
    save        Append messages, headers included, on a file
    set         Set binary or valued options
    shell       Invoke an interactive shell
    top         Print first so many (5 by default) lines of list of messages
    type        Print messages
    undelete    Undelete list of messages
    unset       Undo the operation of a set
    visual      Invoke visual editor on a list of messages
    write       Append messages to a file, don’t include headers
    z           Scroll to next/previous screenful of headers

The following table describes the options. Each option is shown as being either a binary or valued option.

    Option      Type    Description
    EDITOR      valued  Pathname of editor for ~e and edit
    SHELL       valued  Pathname of shell for shell, ~! and !
    VISUAL      valued  Pathname of screen editor for ~v, visual
    append      binary  Always append messages to end of mbox
    ask         binary  Prompt user for Subject: field when sending
    askcc       binary  Prompt user for additional Cc’s at end of message
    autoprint   binary  Print next message after delete
    crt         valued  Minimum number of lines before using more
    debug       binary  Print out debugging information
    dot         binary  Accept . alone on line to terminate message input
    escape      valued  Escape character to be used instead of ~
    folder      valued  Directory to store folders in
    hold        binary  Hold messages in system mailbox by default
    ignore      binary  Ignore RUBOUT while sending mail
    ignoreeof   binary  Don’t terminate letters/command input with ^D
    keep        binary  Don’t unlink system mailbox when empty
    keepsave    binary  Don’t delete saved messages by default
    metoo       binary  Include sending user in aliases
    noheader    binary  Suppress initial printing of version and headers
    nosave      binary  Don’t save partial letter in dead.letter
    quiet       binary  Suppress printing of Mail version and message numbers
    record      valued  File to save all outgoing mail in
    screen      valued  Size of window of message headers for z, etc.
    sendmail    valued  Choose alternate mail delivery system
    toplines    valued  Number of lines to print in top
    verbose     binary  Invoke sendmail with the -v flag

The following table summarizes the tilde escapes available while sending mail.

    Escape  Arguments  Description
    ~!      command    Execute shell command
    ~c      name ...   Add names to Cc: field
    ~d                 Read dead.letter into message
    ~e                 Invoke text editor on partial message
    ~f      messages   Read named messages
    ~h                 Edit the header fields
    ~m      messages   Read named messages, right shift by tab
    ~p                 Print message entered so far
    ~q                 Abort entry of letter; like RUBOUT
    ~r      filename   Read file into message
    ~s      string     Set Subject: field to string
    ~t      name ...   Add names to To: field
    ~v                 Invoke screen editor on message
    ~w      filename   Write message on file
    ~|      command    Pipe message through command
    ~~      string     Quote a ~ in front of string

The following table shows the command line flags that Mail accepts:

    Flag        Description
    -N          Suppress the initial printing of headers
    -T file     Article-id’s of read/deleted messages to file
    -d          Turn on debugging
    -f file     Show messages in file or ~/mbox
    -h number   Pass on hop count for mail forwarding
    -i          Ignore tty interrupt signals
    -n          Inhibit reading of /usr/lib/Mail.rc
    -r name     Pass on name for mail forwarding
    -s string   Use string as subject in outgoing mail
    -u name     Read name’s mail instead of your own
    -v          Invoke sendmail with the -v flag

Notes: -T, -d, -h, and -r are not for human use.

10. Conclusion

Mail is an attempt to provide a simple user interface to a variety of underlying message systems. Thanks are due to the many users who contributed ideas and testing to Mail.

BC 2-43

BC - An Arbitrary Precision Desk-Calculator Language

Lorinda Cherry
Robert Morris
Bell Laboratories
Murray Hill, New Jersey 07974

Introduction

BC is a language and a compiler for doing arbitrary precision arithmetic on the UNIX† time-sharing system [1]. The compiler was written to make conveniently available a collection of routines (called DC [5]) which are capable of doing arithmetic on integers of arbitrary size. The compiler is by no means intended to provide a complete programming language. It is a minimal language facility.

There is a scaling provision that permits the use of decimal point notation. Provision is made for input and output in bases other than decimal. Numbers can be converted from decimal to octal by simply setting the output base to equal 8.

The actual limit on the number of digits that can be handled depends on the amount of storage available on the machine. Manipulation of numbers with many hundreds of digits is possible even on the smallest versions of UNIX.
The syntax of BC has been deliberately selected to agree substantially with the C language [2]. Those who are familiar with C will find few surprises in this language.

† UNIX is a trademark of Bell Laboratories.

Simple Computations with Integers

The simplest kind of statement is an arithmetic expression on a line by itself. For instance, if you type in the line:

    142857 + 285714

the program responds immediately with the line

    428571

The operators -, *, /, %, and ^ can also be used; they indicate subtraction, multiplication, division, remaindering, and exponentiation, respectively. Division of integers produces an integer result truncated toward zero. Division by zero produces an error comment.

Any term in an expression may be prefixed by a minus sign to indicate that it is to be negated (the ‘unary’ minus sign). The expression

    7+-3

is interpreted to mean that -3 is to be added to 7.

More complex expressions with several operators and with parentheses are interpreted just as in Fortran, with ^ having the greatest binding power, then * and % and /, and finally + and -. Contents of parentheses are evaluated before material outside the parentheses. Exponentiations are performed from right to left and the other operators from left to right. The two expressions

    a^b^c and a^(b^c)

are equivalent, as are the two expressions

    a*b*c and (a*b)*c

BC shares with Fortran and C the undesirable convention that

    a/b*c is equivalent to (a/b)*c

Internal storage registers to hold numbers have single lower-case letter names. The value of an expression can be assigned to a register in the usual way. The statement

    x = x + 3

has the effect of increasing by three the value of the contents of the register named x. When, as in this case, the outermost operator is an =, the assignment is performed but the result is not printed. Only 26 of these named storage registers are available.
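The integer rules above (quotients truncated toward zero, exponentiation grouping right to left) can be checked with a small Python model. This is a sketch under stated assumptions: `bc_div` and `bc_rem` are hypothetical helper names, the model covers integers only (scale 0), and the remainder follows the identity a%b = a-a/b*b given in the appendix of this paper.

```python
def bc_div(a, b):
    """Integer quotient truncated toward zero, as described for the / operator."""
    if b == 0:
        raise ZeroDivisionError("division by zero")
    q = abs(a) // abs(b)
    # Python's // rounds toward negative infinity, so restore the sign by hand.
    return q if (a < 0) == (b < 0) else -q


def bc_rem(a, b):
    """Remainder consistent with bc_div: a - (a/b)*b."""
    return a - bc_div(a, b) * b


# Exponentiation groups right to left: a^b^c means a^(b^c).
# Python's ** operator happens to group the same way.
right_to_left = 2 ** 3 ** 2      # 2 ** (3 ** 2)
left_to_right = (2 ** 3) ** 2    # what left-to-right grouping would give
```

Note that Python's own `-7 // 2` gives -4, whereas truncation toward zero gives -3; that difference is the reason for the explicit sign handling.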
There is a built-in square root function whose result is truncated to an integer (but see scaling below). The lines

    x = sqrt(191)
    x

produce the printed result

    13

Bases

There are special internal quantities, called ‘ibase’ and ‘obase’. The contents of ‘ibase’, initially set to 10, determines the base used for interpreting numbers read in. For example, the lines

    ibase = 8
    11

will produce the output line

    9

and you are all set up to do octal to decimal conversions. Beware, however, of trying to change the input base back to decimal by typing

    ibase = 10

Because the number 10 is interpreted as octal, this statement will have no effect. For those who deal in hexadecimal notation, the characters A-F are permitted in numbers (no matter what base is in effect) and are interpreted as digits having values 10-15 respectively. The statement

    ibase = A

will change you back to decimal input base no matter what the current input base is. Negative and large positive input bases are permitted but useless. No mechanism has been provided for the input of arbitrary numbers in bases less than 1 and greater than 16.

The contents of ‘obase’, initially set to 10, are used as the base for output numbers. The lines

    obase = 16
    1000

will produce the output line

    3E8

which is to be interpreted as a 3-digit hexadecimal number. Very large output bases are permitted, and they are sometimes useful. For example, large numbers can be output in groups of five digits by setting ‘obase’ to 100000. Strange (i.e. 1, 0, or negative) output bases are handled appropriately.

Very large numbers are split across lines with 70 characters per line. Lines which are continued end with \. Decimal output conversion is practically instantaneous, but output of very large numbers (i.e., more than 100 digits) with other bases is rather slow. Non-decimal output conversion of a one hundred digit number takes about three seconds.
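The input and output base behaviour described above, including the ‘ibase = 10’ trap, can be modelled with a short Python sketch. The names `read_number` and `write_number` are hypothetical, and the model is deliberately narrow: it assumes non-negative integers and bases from 2 to 16, and says nothing about how BC treats the stranger bases mentioned in the text.

```python
DIGITS = "0123456789ABCDEF"


def read_number(text, ibase):
    """Interpret a digit string in base ibase; A-F always stand for 10-15."""
    value = 0
    for ch in text:
        value = value * ibase + DIGITS.index(ch)
    return value


def write_number(value, obase):
    """Render a non-negative integer in base obase (2 to 16 in this sketch)."""
    if value == 0:
        return "0"
    out = []
    while value:
        value, digit = divmod(value, obase)
        out.append(DIGITS[digit])
    return "".join(reversed(out))


# With ibase already 8, typing "ibase = 10" reads "10" as octal, i.e. 8,
# so the input base is unchanged. Typing "ibase = A" yields 10.
new_ibase_from_10 = read_number("10", 8)
new_ibase_from_A = read_number("A", 8)
```

In this model `read_number("11", 8)` is 9 and `write_number(1000, 16)` is `"3E8"`, matching the two examples in the text.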
It is best to remember that ‘ibase’ and ‘obase’ have no effect whatever on the course of internal computation or on the evaluation of expressions, but only affect input and output conversion, respectively.

Scaling

A third special internal quantity called ‘scale’ is used to determine the scale of calculated quantities. Numbers may have up to 99 decimal digits after the decimal point. This fractional part is retained in further computations. We refer to the number of digits after the decimal point of a number as its scale.

When two scaled numbers are combined by means of one of the arithmetic operations, the result has a scale determined by the following rules. For addition and subtraction, the scale of the result is the larger of the scales of the two operands. In this case, there is never any truncation of the result. For multiplications, the scale of the result is never less than the maximum of the two scales of the operands, never more than the sum of the scales of the operands and, subject to those two restrictions, the scale of the result is set equal to the contents of the internal quantity ‘scale’. The scale of a quotient is the contents of the internal quantity ‘scale’. The scale of a remainder is the sum of the scales of the quotient and the divisor. The result of an exponentiation is scaled as if the implied multiplications were performed. An exponent must be an integer. The scale of a square root is set to the maximum of the scale of the argument and the contents of ‘scale’.

All of the internal operations are actually carried out in terms of integers, with digits being discarded when necessary. In every case where digits are discarded, truncation and not rounding is performed.

The contents of ‘scale’ must be no greater than 99 and no less than 0. It is initially set to 0. In case you need more than 99 fraction digits, you may arrange your own scaling.
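The scale rules for addition, multiplication and square root can be written down directly. The following Python sketch is only a model of the rules as stated above, not BC's implementation; the function names are hypothetical, and `bc_mul` uses Python's `decimal` module as a stand-in for BC's internal integer arithmetic.

```python
from decimal import Decimal, ROUND_DOWN


def add_scale(a, b):
    """Scale of a sum or difference: the larger operand scale."""
    return max(a, b)


def mul_scale(a, b, scale):
    """Scale of a product: 'scale', clamped between max(a, b) and a + b."""
    return min(a + b, max(scale, a, b))


def sqrt_scale(a, scale):
    """Scale of a square root: the larger of the argument scale and 'scale'."""
    return max(a, scale)


def scale_of(x):
    """Number of digits after the decimal point of a Decimal."""
    return max(0, -x.as_tuple().exponent)


def bc_mul(x, y, scale):
    """Multiply two Decimals, truncating (not rounding) to the product scale."""
    s = mul_scale(scale_of(x), scale_of(y), scale)
    return (x * y).quantize(Decimal(1).scaleb(-s), rounding=ROUND_DOWN)
```

For example, with ‘scale’ at 0 the product of 7 (scale 0) and 3.14 (scale 2) keeps two digits, and the product of 1.23 and 4.567, exactly 5.61741, is truncated to scale 3.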
The internal quantities ‘scale’, ‘ibase’, and ‘obase’ can be used in expressions just like other variables. The line

    scale = scale + 1

increases the value of ‘scale’ by one, and the line

    scale

causes the current value of ‘scale’ to be printed.

The value of ‘scale’ retains its meaning as a number of decimal digits to be retained in internal computation even when ‘ibase’ or ‘obase’ are not equal to 10. The internal computations (which are still conducted in decimal, regardless of the bases) are performed to the specified number of decimal digits, never hexadecimal or octal or any other kind of digits.

Functions

The name of a function is a single lower-case letter. Function names are permitted to collide with simple variable names. Twenty-six different defined functions are permitted in addition to the twenty-six variable names. The line

    define a(x){

begins the definition of a function with one argument. This line must be followed by one or more statements, which make up the body of the function, ending with a right brace }. Return of control from a function occurs when a return statement is executed or when the end of the function is reached. The return statement can take either of the two forms

    return
    return(x)

In the first case, the value of the function is 0, and in the second, the value of the expression in parentheses.

Variables used in the function can be declared as automatic by a statement of the form

    auto x,y,z

There can be only one ‘auto’ statement in a function and it must be the first statement in the definition. These automatic variables are allocated space and initialized to zero on entry to the function and thrown away on return. The values of any variables with the same names outside the function are not disturbed. Functions may be called recursively and the automatic variables at each level of call are protected.
The parameters named in a function definition are treated in the same way as the automatic variables of that function with the single exception that they are given a value on entry to the function. An example of a function definition is

    define a(x,y){
        auto z
        z = x*y
        return(z)
    }

The value of this function, when called, will be the product of its two arguments.

A function is called by the appearance of its name followed by a string of arguments enclosed in parentheses and separated by commas. The result is unpredictable if the wrong number of arguments is used.

Functions with no arguments are defined and called using parentheses with nothing between them: b().

If the function a above has been defined, then the line

    a(7,3.14)

would cause the result 21.98 to be printed and the line

    x = a(a(3,4),5)

would cause the value of x to become 60.

Subscripted Variables

A single lower-case letter variable name followed by an expression in brackets is called a subscripted variable (an array element). The variable name is called the array name and the expression in brackets is called the subscript. Only one-dimensional arrays are permitted. The names of arrays are permitted to collide with the names of simple variables and function names. Any fractional part of a subscript is discarded before use. Subscripts must be greater than or equal to zero and less than or equal to 2047.

Subscripted variables may be freely used in expressions, in function calls, and in return statements.

An array name may be used as an argument to a function, or may be declared as automatic in a function definition by the use of empty brackets:

    f(a[])
    define f(a[])
    auto a[]

When an array name is so used, the whole contents of the array are copied for the use of the function, and thrown away on exit from the function. Array names which refer to whole arrays cannot be used in any other contexts.
Control Statements

The ‘if’, the ‘while’, and the ‘for’ statements may be used to alter the flow within programs or to cause iteration. The range of each of them is a statement or a compound statement consisting of a collection of statements enclosed in braces. They are written in the following way

    if(relation) statement
    while(relation) statement
    for(expression1; relation; expression2) statement

or

    if(relation) {statements}
    while(relation) {statements}
    for(expression1; relation; expression2) {statements}

A relation in one of the control statements is an expression of the form

    x>y

where two expressions are related by one of the six relational operators <, >, <=, >=, ==, or !=. The relation == stands for ‘equal to’ and != stands for ‘not equal to’. The meaning of the remaining relational operators is clear.

BEWARE of using = instead of == in a relational. Unfortunately, both of them are legal, so you will not get a diagnostic message, but = really will not do a comparison.

The ‘if’ statement causes execution of its range if and only if the relation is true. Then control passes to the next statement in sequence.

The ‘while’ statement causes execution of its range repeatedly as long as the relation is true. The relation is tested before each execution of its range and if the relation is false, control passes to the next statement beyond the range of the while.

The ‘for’ statement begins by executing ‘expression1’. Then the relation is tested and, if true, the statements in the range of the ‘for’ are executed. Then ‘expression2’ is executed. The relation is tested, and so on. The typical use of the ‘for’ statement is for a controlled iteration, as in the statement

    for(i=1; i<=10; i=i+1) i

which will print the integers from 1 to 10. Here are some examples of the use of the control statements.

    define f(n){
        auto i, x
        x=1
        for(i=1; i<=n; i=i+1) x=x*i
        return(x)
    }

The line

    f(a)

will print a factorial if a is a positive integer.
Here is the definition of a function which will compute values of the binomial coefficient (m and n are assumed to be positive integers).

    define b(n,m){
        auto x, j
        x=1
        for(j=1; j<=m; j=j+1) x=x*(n-j+1)/j
        return(x)
    }

The following function computes values of the exponential function by summing the appropriate series without regard for possible truncation errors:

    scale = 20
    define e(x){
        auto a, b, c, d, n
        a = 1
        b = 1
        c = 1
        d = 0
        n = 1
        while(1==1){
            a = a*x
            b = b*n
            c = c + a/b
            n = n + 1
            if(c==d) return(c)
            d = c
        }
    }

Some Details

There are some language features that every user should know about even if he will not use them.

Normally statements are typed one to a line. It is also permissible to type several statements on a line separated by semicolons.

If an assignment statement is parenthesized, it then has a value and it can be used anywhere that an expression can. For example, the line

    (x=y+17)

not only makes the indicated assignment, but also prints the resulting value.

Here is an example of a use of the value of an assignment statement even when it is not parenthesized.

    x = a[i=i+1]

causes a value to be assigned to x and also increments i before it is used as a subscript.

The following constructs work in BC in exactly the same manner as they do in the C language. Consult the appendix or the C manuals [2] for their exact workings.

    Construct   Is the same as
    x=y=z       x=(y=z)
    x =+ y      x = x+y
    x =- y      x = x-y
    x =* y      x = x*y
    x =/ y      x = x/y
    x =% y      x = x%y
    x =^ y      x = x^y
    x++         (x=x+1)-1
    x--         (x=x-1)+1
    ++x         x = x+1
    --x         x = x-1

Even if you don’t intend to use the constructs, if you type one inadvertently, something correct but unexpected may happen.

WARNING! In some of these constructions, spaces are significant. There is a real difference between x=-y and x= -y. The first replaces x by x-y and the second by -y.

Three Important Things

1. To exit a BC program, type ‘quit’.

2. There is a comment convention identical to that of C and of PL/I. Comments begin with ‘/*’ and end with ‘*/’.

3. There is a library of math functions which may be obtained by typing at command level

    bc -l

This command will load a set of library functions which, at the time of writing, consists of sine (named ‘s’), cosine (‘c’), arctangent (‘a’), natural logarithm (‘l’), exponential (‘e’) and Bessel functions of integer order (‘j(n,x)’). Doubtless more functions will be added in time. The library sets the scale to 20. You can reset it to something else if you like. The design of these mathematical library routines is discussed elsewhere [3].

If you type

    bc file ...

BC will read and execute the named file or files before accepting commands from the keyboard. In this way, you may load your favorite programs and function definitions.

Acknowledgement

The compiler is written in YACC [4]; its original version was written by S. C. Johnson.

References

[1] K. Thompson and D. M. Ritchie, UNIX Programmer’s Manual, Bell Laboratories, 1978.
[2] B. W. Kernighan and D. M. Ritchie, The C Programming Language, Prentice-Hall, 1978.
[3] R. Morris, A Library of Reference Standard Mathematical Subroutines, Bell Laboratories internal memorandum, 1975.
[4] S. C. Johnson, YACC - Yet Another Compiler-Compiler. Bell Laboratories Computing Science Technical Report #32, 1978.
[5] R. Morris and L. L. Cherry, DC - An Interactive Desk Calculator.

Appendix

1. Notation

In the following pages syntactic categories are in italics; literals are in bold; material in brackets [] is optional.

2. Tokens

Tokens consist of keywords, identifiers, constants, operators, and separators. Token separators may be blanks, tabs or comments. Newline characters or semicolons separate statements.

2.1. Comments

Comments are introduced by the characters /* and terminated by */.

2.2. Identifiers

There are three kinds of identifiers: ordinary identifiers, array identifiers and function identifiers. All three types consist of single lower-case letters.
Array identifiers are followed by square brackets, possibly enclosing an expression describing a subscript. Arrays are singly dimensioned and may contain up to 2048 elements. be indexed from 0 to 2047. Indexing begins at zero so an array may Subscripts are truncated to integers. Function identifiers are fol- lowed by parentheses, possibly enclosing arguments. The three types of identifiers do not conflict; a program can have a variable named x, an array named x and a function named x, all of which are separate and distinct. 2.3. Keywords The following are reserved keywords: ibase if obase break scale define sqrt auto length return while quit for 2.4. Constants Constants consist of arbitrarily long numbers with an optional decimal point. The hexa- decimal digits A—F are also recognized as digits with values 10—15, respectively. 3. Expressions The value of an expression is printed unless the main operator is an assignment. cedence is the same as the order of presentation here, with highest appearing first. right associativity, where applicable, is discussed with each operator. Pre- Left or BC 2-51 3.1. Primitive expressions 3.1.1. Named expressions Named expressions are places where values are stored. Simply stated, named expressions are legal on the left side of an assignment. The value of a named expression is the value stored in the place named. 3.1.1.1. identifiers Simple identifiers are named expressions. They have an initial value of zero. 3.1.1.2. array-namelexpression] Array elements are named expressions. They have an initial value of zero. 3.1.1.3. scale, ibase and obase The internal registers scale, ibase and obase are all named expressions. scale is the number of digits after the decimal point to be retained in arithmetic operations. scale has an initial value of zero. ibase and obase are the input and output number radix respectively. Both ibase and obase have initial values of 10. 3.1.2. Funétion calls 3.1.2.1. 
function-name([expression[,expression...]])

A function call consists of a function name followed by parentheses containing a comma-separated list of expressions, which are the function arguments. A whole array passed as an argument is specified by the array name followed by empty square brackets. All function arguments are passed by value. As a result, changes made to the formal parameters have no effect on the actual arguments. If the function terminates by executing a return statement, the value of the function is the value of the expression in the parentheses of the return statement or is zero if no expression is provided or if there is no return statement.

3.1.2.2. sqrt(expression)

The result is the square root of the expression. The result is truncated in the least significant decimal place. The scale of the result is the scale of the expression or the value of scale, whichever is larger.

3.1.2.3. length(expression)

The result is the total number of significant decimal digits in the expression. The scale of the result is zero.

3.1.2.4. scale(expression)

The result is the scale of the expression. The scale of the result is zero.

3.1.3. Constants

Constants are primitive expressions.

3.1.4. Parentheses

An expression surrounded by parentheses is a primitive expression. The parentheses are used to alter the normal precedence.

3.2. Unary operators

The unary operators bind right to left.

3.2.1. -expression

The result is the negative of the expression.

3.2.2. ++named-expression

The named expression is incremented by one. The result is the value of the named expression after incrementing.

3.2.3. --named-expression

The named expression is decremented by one. The result is the value of the named expression after decrementing.

3.2.4. named-expression++

The named expression is incremented by one. The result is the value of the named expression before incrementing.

3.2.5. named-expression--

The named expression is decremented by one.
The result is the value of the named expression before decrementing.

3.3. Exponentiation operator

The exponentiation operator binds right to left.

3.3.1. expression ^ expression

The result is the first expression raised to the power of the second expression. The second expression must be an integer. If a is the scale of the left expression and b is the absolute value of the right expression, then the scale of the result is:

    min(a×b, max(scale, a))

3.4. Multiplicative operators

The operators *, /, % bind left to right.

3.4.1. expression * expression

The result is the product of the two expressions. If a and b are the scales of the two expressions, then the scale of the result is:

    min(a+b, max(scale, a, b))

3.4.2. expression / expression

The result is the quotient of the two expressions. The scale of the result is the value of scale.

3.4.3. expression % expression

The % operator produces the remainder of the division of the two expressions. More precisely, a % b is a-a/b*b. The scale of the result is the sum of the scale of the divisor and the value of scale.

3.5. Additive operators

The additive operators bind left to right.

3.5.1. expression + expression

The result is the sum of the two expressions. The scale of the result is the maximum of the scales of the expressions.

3.5.2. expression - expression

The result is the difference of the two expressions. The scale of the result is the maximum of the scales of the expressions.

3.6. Assignment operators

The assignment operators bind right to left.

3.6.1. named-expression = expression

This expression results in assigning the value of the expression on the right to the named expression on the left.

3.6.2. named-expression =+ expression
3.6.3. named-expression =- expression
3.6.4. named-expression =* expression
3.6.5. named-expression =/ expression
3.6.6. named-expression =% expression
3.6.7.
named-expression =^ expression

The result of the above expressions is equivalent to “named expression = named expression OP expression”, where OP is the operator after the = sign.

4. Relations

Unlike all other operators, the relational operators are only valid as the object of an if, while, or inside a for statement.

4.1. expression < expression
4.2. expression > expression
4.3. expression <= expression
4.4. expression >= expression
4.5. expression == expression
4.6. expression != expression

5. Storage classes

There are only two storage classes in BC, global and automatic (local). Only identifiers that are to be local to a function need be declared with the auto command. The arguments to a function are local to the function. All other identifiers are assumed to be global and available to all functions. All identifiers, global and local, have initial values of zero. Identifiers declared as auto are allocated on entry to the function and released on returning from the function. They therefore do not retain values between function calls. auto arrays are specified by the array name followed by empty square brackets.

Automatic variables in BC do not work in exactly the same way as in either C or PL/I. On entry to a function, the old values of the names that appear as parameters and as automatic variables are pushed onto a stack. Until return is made from the function, reference to these names refers only to the new values.

6. Statements

Statements must be separated by semicolon or newline. Except where altered by control statements, execution is sequential.

6.1. Expression statements

When a statement is an expression, unless the main operator is an assignment, the value of the expression is printed, followed by a newline character.

6.2. Compound statements

Statements may be grouped together and used when one statement is expected by surrounding them with { }.

6.3. Quoted string statements

“any string”

This statement prints the string inside the quotes.

6.4.
If statements

if (relation) statement

The substatement is executed if the relation is true.

6.5. While statements

while (relation) statement

The statement is executed while the relation is true. The test occurs before each execution of the statement.

6.6. For statements

for (expression; relation; expression) statement

The for statement is the same as

    first-expression
    while (relation) {
        statement
        last-expression
    }

All three expressions must be present.

6.7. Break statements

break

break causes termination of a for or while statement.

6.8. Auto statements

auto identifier[,identifier]

The auto statement causes the values of the identifiers to be pushed down. The identifiers can be ordinary identifiers or array identifiers. Array identifiers are specified by following the array name by empty square brackets. The auto statement must be the first statement in a function definition.

6.9. Define statements

define([parameter[,parameter...]]) { statements }

The define statement defines a function. The parameters may be ordinary identifiers or array names. Array names must be followed by empty square brackets.

6.10. Return statements

return
return(expression)

The return statement causes termination of a function, popping of its auto variables, and specifies the result of the function. The first form is equivalent to return(0). The result of the function is the result of the expression in parentheses.

6.11. Quit

The quit statement stops execution of a BC program and returns control to UNIX when it is first encountered. Because it is not treated as an executable statement, it cannot be used in a function definition or in an if, for, or while statement.

DC - An Interactive Desk Calculator

Robert Morris
Lorinda Cherry
Bell Laboratories
Murray Hill, New Jersey 07974

DC is an arbitrary precision arithmetic package implemented on the UNIX† time-sharing system in the form of an interactive desk calculator.
It works like a stacking calculator using reverse Polish notation. Ordinarily DC operates on decimal integers, but one may specify an input base, output base, and a number of fractional digits to be maintained.

A language called BC [1] has been developed which accepts programs written in the familiar style of higher-level programming languages and compiles output which is interpreted by DC. Some of the commands described below were designed for the compiler interface and are not easy for a human user to manipulate.

Numbers that are typed into DC are put on a push-down stack. DC commands work by taking the top number or two off the stack, performing the desired operation, and pushing the result on the stack. If an argument is given, input is taken from that file until its end, then from the standard input.

SYNOPTIC DESCRIPTION

Here we describe the DC commands that are intended for use by people. The additional commands that are intended to be invoked by compiled output are described in the detailed description.

Any number of commands are permitted on a line. Blanks and new-line characters are ignored except within numbers and in places where a register name is expected.

The following constructions are recognized:

number
    The value of the number is pushed onto the main stack. A number is an unbroken string of the digits 0-9 and the capital letters A-F which are treated as digits with values 10-15 respectively. The number may be preceded by an underscore to input a negative number. Numbers may contain decimal points.

+ - * / % ^
    The top two values on the stack are added (+), subtracted (-), multiplied (*), divided (/), remaindered (%), or exponentiated (^). The two entries are popped off the stack; the result is pushed on the stack in their place. The result of a division is an integer truncated toward zero. See the detailed description below for the treatment of numbers with decimal points. An exponent must not have any digits after the decimal point.
† UNIX is a trademark of Bell Laboratories.

sx
    The top of the main stack is popped and stored into a register named x, where x may be any character. If the s is capitalized, x is treated as a stack and the value is pushed onto it. Any character, even blank or new-line, is a valid register name.

lx
    The value in register x is pushed onto the stack. The register x is not altered. If the l is capitalized, register x is treated as a stack and its top value is popped onto the main stack. All registers start with empty value which is treated as a zero by the command l and is treated as an error by the command L.

d
    The top value on the stack is duplicated.

p
    The top value on the stack is printed. The top value remains unchanged.

f
    All values on the stack and in registers are printed.

x
    treats the top element of the stack as a character string, removes it from the stack, and executes it as a string of DC commands.

[ ... ]
    puts the bracketed character string onto the top of the stack.

q
    exits the program. If executing a string, the recursion level is popped by two. If q is capitalized, the top value on the stack is popped and the string execution level is popped by that value.

<x >x =x !<x !>x !=x
    The top two elements of the stack are popped and compared. Register x is executed if they obey the stated relation. Exclamation point is negation.

v
    replaces the top element on the stack by its square root. The square root of an integer is truncated to an integer. For the treatment of numbers with decimal points, see the detailed description below.

!
    interprets the rest of the line as a UNIX command. Control returns to DC when the UNIX command terminates.

c
    All values on the stack are popped; the stack becomes empty.

i
    The top value on the stack is popped and used as the number radix for further input. If i is capitalized, the value of the input base is pushed onto the stack. No mechanism has been provided for the input of arbitrary numbers in bases less than 1 or greater than 16.
o
    The top value on the stack is popped and used as the number radix for further output. If o is capitalized, the value of the output base is pushed onto the stack.

k
    The top of the stack is popped, and that value is used as a scale factor that influences the number of decimal places that are maintained during multiplication, division, and exponentiation. The scale factor must be greater than or equal to zero and less than 100. If k is capitalized, the value of the scale factor is pushed onto the stack.

z
    The value of the stack level is pushed onto the stack.

?
    A line of input is taken from the input source (usually the console) and executed.

DETAILED DESCRIPTION

Internal Representation of Numbers

Numbers are stored internally using a dynamic storage allocator. Numbers are kept in the form of a string of digits to the base 100 stored one digit per byte (centennial digits). The string is stored with the low-order digit at the beginning of the string. For example, the representation of 157 is 57,1. After any arithmetic operation on a number, care is taken that all digits are in the range 0-99 and that the number has no leading zeros. The number zero is represented by the empty string.

Negative numbers are represented in the 100’s complement notation, which is analogous to two’s complement notation for binary numbers. The high order digit of a negative number is always -1 and all other digits are in the range 0-99. The digit preceding the high order -1 digit is never a 99. The representation of -157 is 43,98,-1. We shall call this the canonical form of a number. The advantage of this kind of representation of negative numbers is ease of addition. When addition is performed digit by digit, the result is formally correct. The result need only be modified, if necessary, to put it into canonical form.
Because the largest valid digit is 99 and the byte can hold numbers twice that large, addition can be carried out and the handling of carries done later when that is convenient, as it sometimes is.

An additional byte is stored with each number beyond the high order digit to indicate the number of assumed decimal digits after the decimal point. The representation of .001 is 1,3 where the scale has been italicized to emphasize the fact that it is not the high order digit. The value of this extra byte is called the scale factor of the number.

The Allocator

DC uses a dynamic string storage allocator for all of its internal storage. All reading and writing of numbers internally is done through the allocator. Associated with each string in the allocator is a four-word header containing pointers to the beginning of the string, the end of the string, the next place to write, and the next place to read. Communication between the allocator and DC is done via pointers to these headers.

The allocator initially has one large string on a list of free strings. All headers except the one pointing to this string are on a list of free headers. Requests for strings are made by size. The size of the string actually supplied is the next higher power of 2. When a request for a string is made, the allocator first checks the free list to see if there is a string of the desired size. If none is found, the allocator finds the next larger free string and splits it repeatedly until it has a string of the right size. Left-over strings are put on the free list. If there are no larger strings, the allocator tries to coalesce smaller free strings into larger ones. Since all strings are the result of splitting large strings, each string has a neighbor that is next to it in core and, if free, can be combined with it to make a string twice as long. This is an implementation of the ‘buddy system’ of allocation described in [2].
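The rounding and splitting steps described above can be sketched in Python. This is only an illustrative model, not DC's actual allocator: the function names, the use of plain integers to stand for block sizes, and the choice to leave a request that is already a power of two unchanged are all assumptions made for the example.

```python
def round_up_pow2(n):
    """Smallest power of two that is >= n (assumes n >= 1)."""
    size = 1
    while size < n:
        size *= 2
    return size


def split_for_request(free_sizes, request):
    """Model one allocation against a list of free block sizes.

    Returns (allocated_size, new_free_sizes). Raises MemoryError when no
    free block is large enough; at that point the allocator described in
    the text would try to coalesce buddies or ask the system for space.
    """
    want = round_up_pow2(request)
    free = sorted(free_sizes)
    # Take the smallest free block that is big enough.
    for i, size in enumerate(free):
        if size >= want:
            del free[i]
            break
    else:
        raise MemoryError("no free block large enough")
    # Halve the block until it fits; each unused half is a left-over
    # string that goes back on the free list.
    while size > want:
        size //= 2
        free.append(size)
    return want, sorted(free)
```

For example, a request for 5 bytes against a single free block of 64 returns a block of 8 and leaves blocks of 8, 16 and 32 on the free list, so no space is lost by the splitting itself.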
Failing to find a string of the proper length after coalescing, the allocator asks the system for more space. The amount of space on the system is the only limitation on the size and number of strings in DC. If at any time in the process of trying to allocate a string, the allocator runs out of headers, it also asks the system for more space.

There are routines in the allocator for reading, writing, copying, rewinding, forward-spacing, and backspacing strings. All string manipulation is done using these routines.

The reading and writing routines increment the read pointer or write pointer so that the characters of a string are read or written in succession by a series of read or write calls. The write pointer is interpreted as the end of the information-containing portion of a string and a call to read beyond that point returns an end-of-string indication. An attempt to write beyond the end of a string causes the allocator to allocate a larger space and then copy the old string into the larger block.

Internal Arithmetic

All arithmetic operations are done on integers. The operands (or operand) needed for the operation are popped from the main stack and their scale factors stripped off. Zeros are added or digits removed as necessary to get a properly scaled result from the internal arithmetic routine. For example, if the scale of the operands is different and decimal alignment is required, as it is for addition, zeros are appended to the operand with the smaller scale. After performing the required arithmetic operation, the proper scale factor is appended to the end of the number before it is pushed on the stack.

A register called scale plays a part in the results of most arithmetic operations. scale is the bound on the number of decimal places retained in arithmetic computations. scale may be set to the number on the top of the stack truncated to an integer with the k command. K may be used to push the value of scale on the stack.
scale must be greater than or equal to 0 and less than 100. The descriptions of the individual arithmetic operations will include the exact effect of scale on the computations.

Addition and Subtraction

The scales of the two numbers are compared and trailing zeros are supplied to the number with the lower scale to give both numbers the same scale. The number with the smaller scale is multiplied by 10 if the difference of the scales is odd. The scale of the result is then set to the larger of the scales of the two operands.

Subtraction is performed by negating the number to be subtracted and proceeding as in addition.

Finally, the addition is performed digit by digit from the low order end of the number. The carries are propagated in the usual way. The resulting number is brought into canonical form, which may require stripping of leading zeros, or for negative numbers replacing the high-order configuration 99,-1 by the digit -1. In any case, digits which are not in the range 0-99 must be brought into that range, propagating any carries or borrows that result.

Multiplication

The scales are removed from the two operands and saved. The operands are both made positive. Then multiplication is performed in a digit by digit manner that exactly mimics the hand method of multiplying. The first number is multiplied by each digit of the second number, beginning with its low order digit. The intermediate products are accumulated into a partial sum which becomes the final product. The product is put into the canonical form and its sign is computed from the signs of the original operands.

The scale of the result is set equal to the sum of the scales of the two operands. If that scale is larger than the internal register scale and also larger than both of the scales of the two operands, then the scale of the result is set equal to the largest of these three last quantities.

Division

The scales are removed from the two operands.
Zeros are appended or digits removed from the dividend to make the scale of the result of the integer division equal to the internal quantity scale. The signs are removed and saved.

Division is performed much as it would be done by hand. The difference of the lengths of the two numbers is computed. If the divisor is longer than the dividend, zero is returned. Otherwise the top digit of the divisor is divided into the top two digits of the dividend. The result is used as the first (high-order) digit of the quotient. It may turn out to be one unit too low, but if it is, the next trial quotient will be larger than 99 and this will be adjusted at the end of the process. The trial digit is multiplied by the divisor and the result subtracted from the dividend and the process is repeated to get additional quotient digits until the remaining dividend is smaller than the divisor. At the end, the digits of the quotient are put into the canonical form, with propagation of carry as needed. The sign is set from the sign of the operands.

Remainder

The division routine is called and division is performed exactly as described. The quantity returned is the remains of the dividend at the end of the divide process. Since division truncates toward zero, remainders have the same sign as the dividend. The scale of the remainder is set to the maximum of the scale of the dividend and the scale of the quotient plus the scale of the divisor.

Square Root

The scale is stripped from the operand. Zeros are added if necessary to make the integer result have a scale that is the larger of the internal quantity scale and the scale of the operand.

The method used to compute sqrt(y) is Newton’s method with successive approximations by the rule

$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{y}{x_n}\right)$$

The initial guess is found by taking the integer square root of the top two digits.

Exponentiation

Only exponents with zero scale factor are handled. If the exponent is zero, then the result is 1. If the exponent is negative, then it is made positive and the base is divided into one. The scale of the base is removed.

The integer exponent is viewed as a binary number. The base is repeatedly squared and the result is obtained as a product of those powers of the base that correspond to the positions of the one-bits in the binary representation of the exponent. Enough digits of the result are removed to make the scale of the result the same as if the indicated multiplication had been performed.

Input Conversion and Base

Numbers are converted to the internal representation as they are read in. The scale stored with a number is simply the number of fractional digits input. Negative numbers are indicated by preceding the number with a _ (underscore). The hexadecimal digits A-F correspond to the numbers 10-15 regardless of input base. The i command can be used to change the base of the input numbers. This command pops the stack, truncates the resulting number to an integer, and uses it as the input base for all further input. The input base is initialized to 10 but may, for example, be changed to 8 or 16 to do octal or hexadecimal to decimal conversions. The command I will push the value of the input base on the stack.

Output Commands

The command p causes the top of the stack to be printed. It does not remove the top of the stack. All of the stack and internal registers can be output by typing the command f. The o command can be used to change the output base. This command uses the top of the stack, truncated to an integer, as the base for all further output. The output base is initialized to 10. It will work correctly for any base. The command O pushes the value of the output base on the stack.

Output Format and Base

The input and output bases only affect the interpretation of numbers on input and output; they have no effect on arithmetic computations.
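The separation between bases and arithmetic can be illustrated with a small Python sketch. It is a model of the behavior described above, not DC's code: the names `parse_integer` and `format_integer` are hypothetical, only whole numbers are handled, and the output is returned as a list of digit values rather than printed characters.

```python
DIGITS = "0123456789ABCDEF"


def parse_integer(text, ibase):
    """Convert a digit string to an int using the input base.

    A-F always stand for 10-15, whatever the input base is.
    """
    value = 0
    for ch in text:
        value = value * ibase + DIGITS.index(ch)
    return value


def format_integer(value, obase):
    """Digits of a non-negative int in the output base, high order first."""
    if value == 0:
        return [0]
    digits = []
    while value > 0:
        value, d = divmod(value, obase)
        digits.append(d)
    return digits[::-1]


# Input in octal, arithmetic on the plain integer, output in hexadecimal.
n = parse_integer("17", 8) + 1
hex_digits = format_integer(n, 16)
```

Here `n` is an ordinary integer once it has been read; the bases play no part in the addition, and they matter again only when the result is converted for output.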
Large numbers are output with 70 characters per line; a \ indicates a continued line. All choices of input and output bases work correctly, although not all are useful. A particularly useful output base is 100000, which has the effect of grouping digits in fives. Bases of 8 and 16 can be used for decimal-octal or decimal-hexadecimal conversions.

Internal Registers

Numbers or strings may be stored in internal registers or loaded on the stack from registers with the commands s and l. The command sx pops the top of the stack and stores the result in register x. x can be any character. lx puts the contents of register x on the top of the stack. The l command has no effect on the contents of register x. The s command, however, is destructive.

Stack Commands

The command c clears the stack. The command d pushes a duplicate of the number on the top of the stack on the stack. The command z pushes the stack size on the stack. The command X replaces the number on the top of the stack with its scale factor. The command Z replaces the top of the stack with its length.

Subroutine Definitions and Calls

Enclosing a string in [] pushes the ascii string on the stack. The q command quits or, in executing a string, pops the recursion levels by two.

Internal Registers - Programming DC

The load and store commands together with [] to store strings, x to execute and the testing commands ‘<’, ‘>’, ‘=’, ‘!<’, ‘!>’, ‘!=’ can be used to program DC. The x command assumes the top of the stack is a string of DC commands and executes it. The testing commands compare the top two elements on the stack and if the relation holds, execute the register that follows the relation. For example, to print the numbers 0-9,

    [lip1+ si li10>a]sa
    0si lax

Push-Down Registers and Arrays

These commands were designed for use by a compiler, not by people. They involve push-down registers and arrays. In addition to the stack that commands work on, DC can be thought of as having individual stacks for each register. These registers are operated on by the commands S and L. Sx pushes the top value of the main stack onto the stack for the register x. Lx pops the stack for register x and puts the result on the main stack. The commands s and l also work on registers but not as push-down stacks. l doesn’t affect the top of the register stack, and s destroys what was there before.

The commands to work on arrays are : and ;. :x pops the stack and uses this value as an index into the array x. The next element on the stack is stored at this index in x. An index must be greater than or equal to 0 and less than 2048. ;x is the command to load the main stack from the array x. The value on the top of the stack is the index into the array x of the value to be loaded.

Miscellaneous Commands

The command ! interprets the rest of the line as a UNIX command and passes it to UNIX to execute. One other compiler command is Q. This command uses the top of the stack as the number of levels of recursion to skip.

DESIGN CHOICES

The real reason for the use of a dynamic storage allocator was that a general purpose program could be (and in fact has been) used for a variety of other tasks. The allocator has some value for input and for compiling (i.e. the bracket [...] commands) where it cannot be known in advance how long a string will be. The result was that at a modest cost in execution time, all considerations of string allocation and sizes of strings were removed from the remainder of the program and debugging was made easier. The allocation method used wastes approximately 25% of available space.

The choice of 100 as a base for internal arithmetic seemingly has no compelling advantage. Yet the base cannot exceed 127 because of hardware limitations and at the cost of 5% in space, debugging was made a great deal easier and decimal output was made much faster.
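The base-100 representation discussed here, and laid out under Internal Representation of Numbers, can be modeled in a few lines of Python. This is a sketch under stated assumptions rather than DC's implementation: Python lists stand in for the byte strings, the scale byte is left out, and the function names are hypothetical.

```python
from itertools import zip_longest


def to_centennial(n):
    """Encode an int as base-100 digits, low-order digit first.

    Zero is the empty list. A negative number ends in a high-order
    digit of -1 with all other digits in 0..99 (100's complement).
    """
    if n >= 0:
        digits = []
        while n > 0:
            n, d = divmod(n, 100)
            digits.append(d)
        return digits
    # Smallest k with 100**k >= -n, so the digit before the -1 is never 99.
    k = 0
    while 100 ** k < -n:
        k += 1
    m = n + 100 ** k
    digits = []
    for _ in range(k):
        m, d = divmod(m, 100)
        digits.append(d)
    return digits + [-1]


def from_centennial(digits):
    """Value of a digit list; also valid for non-canonical digit lists."""
    return sum(d * 100 ** i for i, d in enumerate(digits))


def add_digitwise(a, b):
    """Add digit by digit with no carry handling (result may be non-canonical)."""
    return [x + y for x, y in zip_longest(a, b, fillvalue=0)]
```

With these definitions `to_centennial(157)` is `[57, 1]` and `to_centennial(-157)` is `[43, 98, -1]`, matching the examples in the text. Adding the two digit by digit gives `[100, 99, -1]`, which is not in canonical form but still evaluates to zero, which is the sense in which the digit-by-digit sum is formally correct.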
The reason for a stack-type arithmetic design was to permit all DC commands from addition to subroutine execution to be implemented in essentially the same way. The result was a considerable degree of logical separation of the final program into modules with very little communication between modules.

The rationale for the lack of interaction between the scale and the bases was to provide an understandable means of proceeding after a change of base or scale when numbers had already been entered. An earlier implementation which had global notions of scale and base did not work out well. If the value of scale were to be interpreted in the current input or output base, then a change of base or scale in the midst of a computation would cause great confusion in the interpretation of the results. The current scheme has the advantage that the value of the input and output bases are only used for input and output, respectively, and they are ignored in all other operations. The value of scale is not used for any essential purpose by any part of the program and it is used only to prevent the number of decimal places resulting from the arithmetic operations from growing beyond all bounds.

The design rationale for the choices for the scales of the results of arithmetic was that in no case should any significant digits be thrown away if, on appearances, the user actually wanted them. Thus, if the user wants to add the numbers 1.5 and 3.517, it seemed reasonable to give him the result 5.017 without requiring him to unnecessarily specify his rather obvious requirements for precision.

On the other hand, multiplication and exponentiation produce results with many more digits than their operands and it seemed reasonable to give as a minimum the number of decimal places in the operands but not to give more than that number of digits unless the user asked for them by specifying a value for scale.
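The scale rules given in the detailed description for addition, multiplication and division can be tried out with Python's decimal module. The sketch below is an approximation of those rules, not DC's arithmetic: the function names are hypothetical, truncation toward zero is assumed wherever the text says digits are removed, and Python's default 28-digit precision limits the sizes of numbers it handles.

```python
from decimal import Decimal, ROUND_DOWN


def scale_of(x):
    """Number of digits after the decimal point in x."""
    return max(0, -x.as_tuple().exponent)


def truncate(x, places):
    """Drop digits beyond the given number of decimal places."""
    return x.quantize(Decimal(1).scaleb(-places), rounding=ROUND_DOWN)


def add(x, y):
    """Scale of a sum is the larger operand scale; nothing is dropped."""
    return truncate(x + y, max(scale_of(x), scale_of(y)))


def multiply(x, y, scale):
    """Scale of a product is min(a+b, max(scale, a, b))."""
    a, b = scale_of(x), scale_of(y)
    return truncate(x * y, min(a + b, max(scale, a, b)))


def divide(x, y, scale):
    """Scale of a quotient is simply the value of scale."""
    return truncate(x / y, scale)


total = add(Decimal("1.5"), Decimal("3.517"))
product = multiply(Decimal("1.5"), Decimal("3.517"), 0)
quotient = divide(Decimal(1), Decimal(3), 0)
```

With scale left at zero, the sum keeps all three decimal places, the product keeps three of its four, and the quotient has none. Raising scale to 4 or more in the call to `multiply` keeps the fourth place of the product as well.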
Square root can be handled in just the same way as multiplication. The operation of division gives arbitrarily many decimal places and there is simply no way to guess how many places the user wants. In this case only, the user must specify a scale to get any decimal places at all.

The scale of remainder was chosen to make it possible to recreate the dividend from the quotient and remainder. This is easy to implement; no digits are thrown away.

References

[1] L. L. Cherry, R. Morris, BC - An Arbitrary Precision Desk-Calculator Language.
[2] K. C. Knowlton, A Fast Storage Allocator, Comm. ACM 8, pp. 623-625 (Oct. 1965).

PART 3: TEXT EDITORS

ULTRIX-32 offers five editors that you can use to create new files and modify existing files. Two of the six articles in this part describe the editor, ed. The remaining four articles describe edit, vi, ex, and sed. This introduction will help you compare the merits and features of the different editors and select an appropriate article.

    Editor   Type of Editor   Article
    edit     Line             Edit: A Tutorial
    ed       Line             A Tutorial Introduction to the UNIX Text Editor
                              Advanced Editing on UNIX
    vi       Screen           An Introduction to Display Editing with Vi
    ex       Line             Ex Reference Manual
    sed      Stream           Sed - A Non-interactive Text Editor

Edit and ed were developed for use on hard-copy terminals and video terminals connected to phone links slower than 1200 baud. If you have access to a video terminal on a medium or high-speed line (1200 baud or faster), vi is more appropriate. Ex is a general purpose line editor (often the editor of choice), and sed is suitable for sophisticated users concerned with batch editing.

edit

“Edit: A Tutorial” introduces the edit editor at a basic level. This editor is suitable for people new to the ULTRIX-32 system. Tutorials for four complete editing sessions make up the article on edit. These sessions advance from simple tasks to searching, substitution, and file recovery.
ed

“A Tutorial Introduction to the UNIX Text Editor” demonstrates the basic commands in ed. This editor is easy to use, but error messages provided with ed are not as helpful as error messages for the other editors. The article includes examples and abundant explanations. “Advanced Editing on UNIX” covers those features of ed not explained in the first article, including using metacharacters, cutting and pasting, and making global changes.

vi

Vi is the ULTRIX-32 system screen editor, and “An Introduction to Display Editing with Vi” offers a complete description. Vi is more efficient and easier to use than ed and edit, because it shows you as many as 24 lines of text at once. The screen display provides a context for the line you are entering or changing. You can move the cursor around on the screen with arrows and with address commands. The command set available to you in vi is large and flexible, and a set of options allows you to tailor the editor to suit your needs. The article on vi is appropriate for beginners as well as expert ULTRIX-32 system users; it progresses from simple cursor positioning functions to sophisticated buffer filtering and macro facilities.

ex

Ex is a line editor, like edit and ed. However, ex offers a very large set of commands, options, and modes. In fact, edit and vi are modes (subsets) of ex. Ex is appropriate for novices as well as experienced users. However, the description of ex included here in the “Ex Reference Manual” is not a tutorial; it presents the rules that govern use of the editor and lists the commands and options alphabetically. Since edit is similar to but simpler than ex, you should find it helpful to read the article on edit first. The power and flexibility of ex make it the best editor for many applications.

sed

Sed, the stream editor, is an ULTRIX-32 system filter instead of an interactive editor.
Sed can take its input either from the command line or from a script file (a file containing sed commands to be applied to the text file to be edited). It is most appropriate when used for editing functions that are repeated frequently as steps in a longer process, such as converting a list of users into a distribution list. The article “Sed - A Non-interactive Text Editor” provides a reference with explanations and examples of sed commands. If you already know ed, you have a head start on learning sed, since sed commands resemble ed commands. However, the interactive editors are easier to use and more practical in most cases than sed.

Summary

Most users choose vi to create and modify files. Ex, edit, and ed are good on slow phone lines and hard-copy terminals. Sed is best for experienced users with batch editing requirements.

Edit: A Tutorial

Ricki Blau
James Joyce
Computing Services
University of California
Berkeley, California 94720

Introduction

Text editing using a terminal connected to a computer allows you to create, modify, and print text easily. A text editor is a program that assists you as you create and modify text. The text editor you will learn here is named edit. Creating text using edit is as easy as typing it on an electric typewriter. Modifying text involves telling the text editor what you want to add, change, or delete. You can review your text by typing a command to print the file contents as they were entered by you. Another program, a text formatter, rearranges your text for you into “finished form.” This document does not discuss the use of a text formatter.

These lessons assume no prior familiarity with computers or with text editing. They consist of a series of text editing sessions which lead you through the fundamental steps of creating and revising text. After scanning each lesson and before beginning the next, you should practice the examples at a terminal to get a feeling for the actual process of text editing.
If you set aside some time for experimentation, you will soon become familiar with using the computer to write and modify text. In addition to the actual use of the text editor, other features of UNIX will be very important to your work. You can begin to learn about these other features by reading “Communicating with UNIX” or one of the other tutorials that provide a general introduction to the system. You will be ready to proceed with this lesson as soon as you are familiar with (1) your terminal and its special keys, (2) the login procedure, and (3) the ways of correcting typing errors.

Let’s first define some terms:

program
    A set of instructions, given to the computer, describing the sequence of steps the computer performs in order to accomplish a specific task. The tasks must be specific, such as balancing your checkbook or editing your text. A general task, such as working for world peace, is something we can do, but not something we can write programs to do.

UNIX
    UNIX is a special type of program, called an operating system, that supervises the machinery and all other programs comprising the total computer system.

edit
    edit is the name of the UNIX text editor you will be learning to use, and is a program that aids you in writing or revising text. Edit was designed for beginning users, and is a simplified version of an editor named ex.

file
    Each UNIX account is allotted space for the permanent storage of information, such as programs, data or text. A file is a logical unit of data, for example, an essay, a program, or a chapter from a book, which is stored on a computer system. Once you create a file, it is kept until you instruct the system to remove it. You may create a file during one UNIX session, end the session, and return to use it at a later time. Files contain anything you choose to write and store in them. The sizes of files vary to suit your needs; one file might hold only a single number, yet another might contain a very long document or program.
    The only way to save information from one session to the next is to store it in a file.

filename
    Filenames are used to distinguish one file from another, serving the same purpose as the labels of manila folders in a file cabinet. In order to write or access information in a file, you use the name of that file in a UNIX command, and the system will automatically locate the file.

disk
    Files are stored on an input/output device called a disk, which looks something like a stack of phonograph records. Each surface is coated with a material similar to the coating on magnetic recording tape, and information is recorded on it.

buffer
    A temporary work space, made available to the user for the duration of a session of text editing and used for creating and modifying the text file. We can think of the buffer as a blackboard that is erased after each class, where each session with the editor is a class.

Session 1

Making contact with UNIX

To use the editor you must first make contact with the computer by logging in to UNIX. We’ll quickly review the standard UNIX login procedure for the two ways you can make contact: on a terminal that is directly linked to the computer, or over a telephone line where the computer answers your call.

Directly-linked terminals

Turn on your terminal and press the RETURN key. You are now ready to login.

Dial-up terminals

If your terminal connects with the computer over a telephone line, turn on the terminal, dial the system access number, and, when you hear a high-pitched tone, place the receiver of the telephone in the acoustic coupler. You are now ready to login.

Logging in

The message inviting you to login is:

    :login:

Type your login name, which identifies you to UNIX, on the same line as the login message, and press RETURN.
If the terminal you are using has both upper and lower case, be sure you enter your login name in lower case; otherwise UNIX assumes your terminal has only upper case and will not recognize lower case letters you may type. UNIX types “:login:” and you reply with your login name, for example “susan”:

    :login: susan      (and press the RETURN key)

(In the examples, input you would type appears in bold face to distinguish it from the responses from UNIX.)

UNIX will next respond with a request for a password as an additional precaution to prevent unauthorized people from using your account. The password will not appear when you type it, to prevent others from seeing it. The message is:

    Password:      (type your password and press RETURN)

If any of the information you gave during the login sequence was mistyped or incorrect, UNIX will respond with

    Login incorrect.
    :login:

in which case you should start the login process anew. Assuming that you have successfully logged in, UNIX will print the message of the day and eventually will present you with a % at the beginning of a fresh line. The % is the UNIX prompt symbol which tells you that UNIX is ready to accept a command.

Asking for edit

You are ready to tell UNIX that you want to work with edit, the text editor. Now is a convenient time to choose a name for the file of text you are about to create. To begin your editing session, type edit followed by a space and then the filename you have selected; for example, “text”. When you have completed the command, press the RETURN key and wait for edit’s response:

    % edit text      (followed by a RETURN)
    "text" No such file or directory

If you typed the command correctly, you will now be in communication with edit. Edit has set aside a buffer for use as a temporary working space during your current editing session. It also checked to see if the file you named, “text”, already existed.
It was unable to find such a file, since “text” is a new file we are about to create. Edit confirms this with the line:

    "text" No such file or directory

On the next line appears edit’s prompt “:”, announcing that you are in command mode and edit expects a command from you. You may now begin to create the new file.

The “Command not found” message

If you misspelled edit by typing, say, “editor”, your request would be handled as follows:

    % editor
    editor: Command not found
    %

Your mistake in calling edit “editor” was treated by UNIX as a request for a program named “editor”. Since there is no program named “editor”, UNIX reported that the program was “not found”. A new % indicates that UNIX is ready for another command, and you may then enter the correct command.

A summary

Your exchange with UNIX as you logged in and made contact with edit should look something like this:

    :login: susan
    Password:
    ... A Message of General Interest ...
    % edit text
    "text" No such file or directory

Entering text

You may now begin entering text into the buffer. This is done by appending (or adding) text to whatever is currently in the buffer. Since there is nothing in the buffer at the moment, you are appending text to nothing; in effect, since you are adding text to nothing you are creating text. Most edit commands have two forms: a word that suggests what the command does, and a shorter abbreviation of that word. Either form may be used. Many beginners find the full command names easier to remember at first, but once you are familiar with editing you may prefer to type the shorter abbreviations. The command to input text is “append”, and it may be abbreviated “a”. Type append and press the RETURN key.

Messages from edit

If you make a mistake in entering a command and type something that edit does not recognize, edit will respond with a message intended to help you diagnose your error.
For example, if you misspell the command to input text by typing, perhaps, “add” instead of “append” or “a”, you will receive this message:

    :add
    add: Not an editor command

When you receive a diagnostic message, check what you typed in order to determine what part of your command confused edit. The message above means that edit was unable to recognize your mistyped command and, therefore, did not execute it. Instead, a new “:” appeared to let you know that edit is again ready to execute a command.

Text input mode

By giving the command “append” (or using the abbreviation “a”), you entered text input mode, also known as append mode. When you enter text input mode, edit stops sending you a prompt. You will not receive any prompts or error messages while in text input mode. You can enter pretty much anything you want on the lines. The lines are transmitted one by one to the buffer and held there during the editing session. You may append as much text as you want, and when you wish to stop entering text lines you should type a period as the only character on the line and press the RETURN key. When you type the period and press RETURN, you signal that you want to stop appending text, and edit responds by allowing you to exit text input mode and reenter command mode. Edit will again prompt you for a command by printing “:”.

Leaving append mode does not destroy the text in the buffer. You have to leave append mode to do any of the other kinds of editing, such as changing, adding, or printing text. If you type a period as the first character and type any other character on the same line, edit will believe you want to remain in append mode and will not let you out. As this can be very frustrating, be sure to type only the period and the RETURN key.

This is a good place to learn an important lesson about computers and text: a blank space is a character as far as a computer is concerned.
If you so much as type a period followed by a blank (that is, type a period and then the space bar on the keyboard), you will remain in append mode with the last line of text being:

    . 

Let’s say that the lines of text you enter are (try to type exactly what you see, including “thiss”):

    This is some sample text.
    And thiss is some more text.
    Text editing is strange, but nice.
    .

The last line is the period followed by a RETURN that gets you out of append mode.

Making corrections

If you have read a general introduction to UNIX, such as “Communicating with UNIX”, you will recall that it is possible to erase individual letters that you have typed. This is done by typing the designated erase character as many times as there are characters you want to erase. The usual erase character is the backspace (control-H), and you can correct typing errors in the line you are typing by holding down the CTRL key and typing the “H” key. If you try typing control-H you will notice that the terminal backspaces in the line you are on. You can backspace over your error, and then type what you want to be the rest of the line.

If you make a bad start in a line and would like to begin again, you can either backspace to the beginning of the line or you can use the at-sign “@” to erase everything on the line:

    Text edtiing is strange, but @
    Text editing is strange, but nice.

When you type the at-sign (@), you erase the entire line typed so far and are given a fresh line to type on. You may immediately begin to retype the line. This, unfortunately, does not help after you type the line and press RETURN. To make corrections in lines that have been completed, it is necessary to use the editing commands covered in the next session and those that follow.

Writing text to disk

You are now ready to edit the text. The simplest kind of editing is to write it to disk as a file for safekeeping after the session is over.
This is the only way to save information from one session to the next, since the editor’s buffer is temporary and will last only until the end of the editing session. Learning how to write a file to disk is second in importance only to entering the text. To write the contents of the buffer to a disk file, use the command “write” (or its abbreviation “w”):

    :write

Edit will copy the contents of the buffer to a disk file. If the file does not yet exist, a new file will be created automatically and the presence of a “[New file]” will be noted. The newly created file will be given the name specified when you entered the editor, in this case “text”. To confirm that the disk file has been successfully written, edit will repeat the filename and give the number of lines and the total number of characters in the file. The buffer remains unchanged by the “write” command. All of the lines that were written to disk will still be in the buffer, should you want to modify or add to them.

Edit must have a filename to use before it can write a file. If you forgot to indicate the name of the file when you began the editing session, edit will print

    No current filename

in response to your write command. If this happens, you can specify the filename in a new write command:

    :write text

After the “write” (or “w”), type a space and then the name of the file.

Signing off

We have done enough for this first lesson on using the UNIX text editor, and are ready to quit the session with edit. To do this we type “quit” (or “q”) and press RETURN:

    :write
    "text" [New file] 3 lines, 90 characters
    :quit
    %

The % is from UNIX to tell you that your session with edit is over and you may command UNIX further. Since we want to end the entire session at the terminal, we also need to exit from UNIX. In response to the UNIX prompt of “%” type the command

    % logout

This will end your session with UNIX, and will ready the terminal for the next user.
It is always important to type logout at the end of a session to make absolutely sure no one could accidentally stumble into your abandoned session and thus gain access to your files, tempting even the most honest of souls.

This is the end of the first session on UNIX text editing.

Session 2

Login with UNIX as in the first session:

    :login: susan      (carriage return)
    Password:          (give password and carriage return)
    ... A Message of General Interest ...
    %

When you indicate you want to edit, you can specify the name of the file you worked on last time. This will start edit working, and it will fetch the contents of the file into the buffer, so that you can resume editing the same file. When edit has copied the file into the buffer, it will repeat its name and tell you the number of lines and characters it contains. Thus,

    % edit text
    "text" 3 lines, 90 characters
    :

means you asked edit to fetch the file named “text” for editing, causing it to copy the 90 characters of text into the buffer. Edit awaits your further instructions, and indicates this by its prompt character, the colon (:). In this session, we will append more text to our file, print the contents of the buffer, and learn to change the text of a line.

Adding more text to the file

If you want to add more to the end of your text you may do so by using the append command to enter text input mode. When “append” is the first command of your editing session, the lines you enter are placed at the end of the buffer. Here we’ll use the abbreviation for the append command, “a”:

    :a
    This is text added in Session 2.
    It doesn't mean much here, but
    it does illustrate the editor.
    .

You may recall that once you enter append mode using the “a” (or “append”) command, you need to type a line containing only a period (.) to exit append mode.
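
The rule for leaving append mode can be made concrete with a short sketch. The Python below is not part of edit; the function name append_mode and the idea of passing the typed lines in as a list are assumptions made only for illustration. It shows that only a line consisting of exactly one period ends text input, so a period followed by a blank is stored as ordinary text.

```python
def append_mode(buffer, typed_lines):
    """Illustrative model of text input mode (not edit's real code).

    Lines are added to the end of the buffer until a line that is
    exactly "." is seen. Returns the new buffer and whatever typed
    lines remain, which would be read as commands.
    """
    new_buffer = list(buffer)
    for count, line in enumerate(typed_lines, start=1):
        if line == ".":
            # A lone period ends text input; it is not stored.
            return new_buffer, typed_lines[count:]
        # Anything else, including ". " with a trailing blank, is text.
        new_buffer.append(line)
    return new_buffer, []


if __name__ == "__main__":
    text, commands = append_mode(
        ["This is some sample text."],
        ["This is text added in Session 2.", ". ", ".", "1,$p"],
    )
    print(text)
    print(commands)
```

In this model the line holding a period and a blank ends up in the buffer, and only the following lone period returns control to command mode.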

Interrupt

Should you press the RUB key (sometimes labelled DELETE) while working with edit, it will send this message to you:

    Interrupt

Any command that edit might be executing is terminated by rub or delete, causing edit to prompt you for a new command. If you are appending text at the time, you will exit from append mode and be expected to give another command. The line of text you were typing when the append command was interrupted will not be entered into the buffer.

Making corrections

If while typing the line you hit an incorrect key, recall that you may delete the incorrect character or cancel the entire line of input by erasing in the usual way. Refer either to the last few pages of Session 1 or to “Communicating with UNIX” if you need to review the procedures for making a correction. The most important idea to remember is that erasing a character or cancelling a line must be done before you press the RETURN key.

Listing what’s in the buffer (p)

Having appended text to what you wrote in Session 1, you might want to see all the lines in the buffer. To print the contents of the buffer, type the command:

    :1,$p

The “1”† stands for line 1 of the buffer, the “$” is a special symbol designating the last line of the buffer, and “p” (or print) is the command to print from line 1 to the end of the buffer. The command “1,$p” gives you:

    This is some sample text.
    And thiss is some more text.
    Text editing is strange, but nice.
    This is text added in Session 2.
    It doesn't mean much here, but
    it does illustrate the editor.

Occasionally, you may accidentally type a character that can’t be printed, which can be done by striking a key while the CTRL key is pressed. In printing lines, edit uses a special notation to show the existence of non-printing characters. Suppose you had introduced the non-printing character “control-A” into the word “illustrate” by accidentally pressing the CTRL key while typing “a”.
This can happen on many terminals because the CTRL key and the “A” key are beside each other. If your finger presses between the two keys, control-A results. When asked to print the contents of the buffer, edit would display

    it does illustr^Ate the editor.

To represent the control-A, edit shows “^A”. The sequence “^” followed by a capital letter stands for the one character entered by holding down the CTRL key and typing the letter which appears after the “^”. We’ll soon discuss the commands that can be used to correct this typing error.

In looking over the text we see that “this” is typed as “thiss” in the second line, a deliberate error so we can learn to make corrections. Let’s correct the spelling.

Finding things in the buffer

In order to change something in the buffer we first need to find it. We can find “thiss” in the text we have entered by looking at a listing of the lines. Physically speaking, we search the lines of text looking for “thiss” and stop searching when we have found it. The way to tell edit to search for something is to type it inside slash marks:

    :/thiss/

By typing /thiss/ and pressing RETURN, you instruct edit to search for “thiss”. If you ask edit to look for a pattern of characters which it cannot find in the buffer, it will respond “Pattern not found”. When edit finds the characters “thiss”, it will print the line of text for your inspection:

    And thiss is some more text.

Edit is now positioned in the buffer at the line it just printed, ready to make a change in the line.

† The numeral “one” is the top left-most key, and should not be confused with the letter “el”.

The current line

Edit keeps track of the line in the buffer where it is located at all times during an editing session. In general, the line that has been most recently printed, entered, or changed is the current location in the buffer.
The editor is prepared to make changes at the current location in the buffer, unless you direct it to another location. In particular, when you bring a file into the buffer, you will be located at the last line in the file, where the editor left off copying the lines from the file to the buffer. If your first editing command is “append”, the lines you enter are added to the end of the file, after the current line (the last line in the file).

You can refer to your current location in the buffer by the symbol period (.), usually known by the name “dot”. If you type “.” and carriage return you will be instructing edit to print the current line:

    :.
    And thiss is some more text.

If you want to know the number of the current line, you can type .= and press RETURN, and edit will respond with the line number:

    :.=
    2

If you type the number of any line and press RETURN, edit will position you at that line and print its contents:

    :2
    And thiss is some more text.

You should experiment with these commands to gain experience in using them to make changes.

Numbering lines (nu)

The number (nu) command is similar to print, giving both the number and the text of each printed line. To see the number and the text of the current line type

    :nu
    2  And thiss is some more text.

Note that the shortest abbreviation for the number command is “nu” (and not “n”, which is used for a different command). You may specify a range of lines to be listed by the number command in the same way that lines are specified for print. For example, 1,$nu lists all lines in the buffer with their corresponding line numbers.

Substitute command (s)

Now that you have found the misspelled word, you can change it from “thiss” to “this”. As far as edit is concerned, changing things is a matter of substituting one thing for another. As a stood for append, so s stands for substitute. We will use the abbreviation “s” to reduce the chance of mistyping the substitute command.
This command will instruct edit to make the change:

    2s/thiss/this/

We first indicate the line to be changed, line 2, and then type an “s” to indicate we want edit to make a substitution. Inside the first set of slashes are the characters that we want to change, followed by the characters to replace them, and then a closing slash mark. To summarize:

    2s/ what is to be changed / what to change it to /

If edit finds an exact match of the characters to be changed it will make the change only in the first occurrence of the characters. If it does not find the characters to be changed, it will respond:

    Substitute pattern match failed

indicating that your instructions could not be carried out. When edit does find the characters that you want to change, it will make the substitution and automatically print the changed line, so that you can check that the correct substitution was made. In the example,

    :2s/thiss/this/
    And this is some more text.

line 2 (and line 2 only) will be searched for the characters “thiss”, and when the first exact match is found, “thiss” will be changed to “this”. Strictly speaking, it was not necessary above to specify the number of the line to be changed. In

    :s/thiss/this/

edit will assume that we mean to change the line where we are currently located (“.”). In this case, the command without a line number would have produced the same result because we were already located at the line we wished to change.

For another illustration of the substitute command, let us choose the line:

    Text editing is strange, but nice.

You can make this line a bit more positive by taking out the characters “strange, but ” so the line reads:

    Text editing is nice.

A command that will first position edit at the desired line and then make the substitution is:

    :/strange/s/strange, but //

What we have done here is combine our search with our substitution.
Such combinations are perfectly legal, and speed up editing quite a bit once you get used to them. That is, you do not necessarily have to use line numbers to identify a line to edit. Instead, you may identify the line you want to change by asking edit to search for a specified pattern of letters that occurs in that line. The parts of the above command are:

    /strange/           tells edit to find the characters “strange” in the text
    s                   tells edit to make a substitution
    /strange, but //    substitutes nothing at all for the characters “strange, but ”

You should note the space after “but” in “/strange, but /”. If you do not indicate that the space is to be taken out, your line will read:

    Text editing is  nice.

which looks a little funny because of the extra space between “is” and “nice”. Again, we realize from this that a blank space is a real character to a computer, and in editing text we need to be aware of spaces within a line just as we would be aware of an “a” or a “4”.

Another way to list what’s in the buffer (z)

Although the print command is useful for looking at specific lines in the buffer, other commands may be more convenient for viewing large sections of text. You can ask to see a screen full of text at a time by using the command z. If you type

    :1z

edit will start with line 1 and continue printing lines, stopping either when the screen of your terminal is full or when the last line in the buffer has been printed. If you want to read the next segment of text, type the command

    :z

If no starting line number is given for the z command, printing will start at the “current” line, in this case the last line printed. Viewing lines in the buffer one screen full at a time is known as paging. Paging can also be used to print a section of text on a hard-copy terminal.

Saving the modified text

This seems to be a good place to pause in our work, and so we should end the second session.
If you (in haste) type “q” to quit the session your dialogue with edit will be:

    :q
    No write since last change (:quit! overrides)

This is edit’s warning that you have not written the modified contents of the buffer to disk. You run the risk of losing the work you did during the editing session since you typed the latest write command. Because in this lesson we have not written to disk at all, everything we have done would have been lost if edit had obeyed the q command. If you did not want to save the work done during this editing session, you would have to type “q!” (or “quit!”) to confirm that you indeed wanted to end the session immediately, leaving the file as it was after the most recent “write” command. However, since you want to save what you have edited, you need to type:

    :w
    "text" 6 lines, 171 characters

and then follow with the commands to quit and logout:

    :q
    % logout

and hang up the phone or turn off the terminal when UNIX asks for a name. Terminals connected to the port selector will stop after the logout command, and pressing keys on the keyboard will do nothing.

This is the end of the second session on UNIX text editing.

Session 3

Bringing text into the buffer (e)

Login to UNIX and make contact with edit. You should try to login without looking at the notes, but if you must then by all means do. Did you remember to give the name of the file you wanted to edit? That is, did you type

    % edit text

or simply

    % edit

Both ways get you in contact with edit, but the first way will bring a copy of the file named “text” into the buffer. If you did forget to tell edit the name of your file you can get it into the buffer by typing:

    :e text
    "text" 6 lines, 171 characters

The command edit, which may be abbreviated e, tells edit that you want to erase anything that might already be in the buffer and bring a copy of the file “text” into the buffer for editing.
You may also use the edit (e) command to change files in the middle of an editing session, or to give edit the name of a new file that you want to create. Because the edit command clears the buffer, you will receive a warning if you try to edit a new file without having saved a copy of the old file. This gives you a chance to write the contents of the buffer to disk before editing the next file.

Moving text in the buffer (m)

Edit allows you to move lines of text from one location in the buffer to another by means of the move (m) command. The first two examples are for illustration only, though after you have read this Session you are welcome to return to them for practice. The command

    :2,4m$

directs edit to move lines 2, 3, and 4 to the end of the buffer ($). The format for the move command is that you specify the first line to be moved, the last line to be moved, the move command “m”, and the line after which the moved text is to be placed. So,

    :1,3m6

would instruct edit to move lines 1 through 3 (inclusive) to a location after line 6 in the buffer. To move only one line, say, line 4, to a location in the buffer after line 5, the command would be “4m5”.

Let’s move some text using the command:

    :5,$m1
    2 lines moved
    it does illustrate the editor.

After executing a command that moves more than one line of the buffer, edit tells how many lines were affected by the move and prints the last moved line for your inspection. If you want to see more than just the last line, you can then use the print (p), z, or number (nu) command to view more text. The buffer should now contain:

    This is some sample text.
    It doesn't mean much here, but
    it does illustrate the editor.
    And this is some more text.
    Text editing is nice.
    This is text added in Session 2.

You can restore the original order by typing:

    :4,$m1

or, combining context searching and the move command:

    :/And this is some/,/This is text/m/This is some sample/

(Do not type both examples here!)
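
The effect of the move command on line order can be checked with a small model. The following Python sketch is only an illustration, not edit’s implementation; the function name move_lines and the choice to refuse a destination inside the moved range are assumptions. It treats the buffer as a list and uses the same 1-based line numbers as the commands above, with the destination being the line after which the block is placed.

```python
def move_lines(buffer, first, last, dest):
    """Illustrative model of "first,last m dest" (not edit's real code).

    Line numbers are 1-based, as in edit. The block first..last is
    removed and reinserted after line dest of the original buffer.
    """
    if not 1 <= first <= last <= len(buffer):
        raise ValueError("bad line range")
    if not 0 <= dest <= len(buffer):
        raise ValueError("bad destination")
    if first <= dest < last:
        # Assumption: a block cannot be moved into the middle of itself.
        raise ValueError("destination lies inside the moved lines")
    block = buffer[first - 1:last]
    rest = buffer[:first - 1] + buffer[last:]
    # Destinations past the block shift up once the block is removed.
    insert_at = dest if dest < first else dest - len(block)
    return rest[:insert_at] + block + rest[insert_at:]


if __name__ == "__main__":
    text = [
        "This is some sample text.",
        "And this is some more text.",
        "Text editing is nice.",
        "This is text added in Session 2.",
        "It doesn't mean much here, but",
        "it does illustrate the editor.",
    ]
    moved = move_lines(text, 5, len(text), 1)      # like 5,$m1
    restored = move_lines(moved, 4, len(moved), 1)  # like 4,$m1
    print("\n".join(moved))
    print(restored == text)
```

Here “$” is written as len(text), since in this model the last line number is simply the length of the list.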
The problem with combining context searching with the move command is that your chance of making a typing error in such a long command is greater than if you type line numbers.

Copying lines (copy)

The copy command is used to make a second copy of specified lines, leaving the original lines where they were. Copy has the same format as the move command, for example:

    :2,5copy $

makes a copy of lines 2 through 5, placing the added lines after the buffer’s end ($). Experiment with the copy command so that you can become familiar with how it works. Note that the shortest abbreviation for copy is co (and not the letter “c”, which has another meaning).

Deleting lines (d)

Suppose you want to delete the line

    This is text added in Session 2.

from the buffer. If you know the number of the line to be deleted, you can type that number followed by delete or d. This example deletes line 4, which is “This is text added in Session 2.” if you typed the commands suggested so far.

    :4d
    It doesn't mean much here, but

Here “4” is the number of the line to be deleted, and “delete” or “d” is the command to delete the line. After executing the delete command, edit prints the line that has become the current line (“.”).

If you do not happen to know the line number you can search for the line and then delete it using this sequence of commands:

    :/added in Session 2./
    This is text added in Session 2.
    :d
    It doesn't mean much here, but

The “/added in Session 2./” asks edit to locate and print the line containing the indicated text, starting its search at the current line and moving line by line until it finds the text. Once you are sure that you have correctly specified the line you want to delete, you can enter the delete (d) command. In this case it is not necessary to specify a line number before the “d”. If no line number is given, edit deletes the current line (“.”), that is, the line found by our search.
After the deletion, your buffer should contain:

    This is some sample text.
    And this is some more text.
    Text editing is nice.
    It doesn’t mean much here, but
    it does illustrate the editor.
    And this is some more text.
    Text editing is nice.
    This is text added in Session 2.
    It doesn’t mean much here, but

To delete both lines 2 and 3:

    And this is some more text.
    Text editing is nice.

you type

    :2,3d
    2 lines deleted

which specifies the range of lines from 2 to 3, and the operation on those lines, “d” for delete. If you delete more than one line you will receive a message telling you the number of lines deleted, as indicated in the example above.

The previous example assumes that you know the line numbers for the lines to be deleted. If you do not, you might combine the search command with the delete command:

    :/And this is some/,/Text editing is nice./d

A word or two of caution

In using the search function to locate lines to be deleted you should be absolutely sure the characters you give as the basis for the search will take edit to the line you want deleted. Edit will search for the first occurrence of the characters starting from where you last edited, that is, from the line you see printed if you type dot (.).

A search based on too few characters may result in the wrong lines being deleted, which edit will do as easily as if you had meant it. For this reason, it is usually safer to specify the search and then delete in two separate steps, at least until you become familiar enough with using the editor that you understand how best to specify searches. For a beginner it is not a bad idea to double-check each command before pressing RETURN to send the command on its way.

Undo (u) to the rescue

The undo (u) command has the ability to reverse the effects of the last command that changed the buffer. To undo the previous command, type “u” or “undo”. Undo can rescue the contents of the buffer from many an unfortunate mistake.
However, its powers are not unlimited, so it is still wise to be reasonably careful about the commands you give.

It is possible to undo only commands which have the power to change the buffer: for example, delete, append, move, copy, substitute, and even undo itself. The commands write (w) and edit (e), which interact with disk files, cannot be undone, nor can commands that do not change the buffer, such as print. Most importantly, the only command that can be reversed by undo is the last “undo-able” command you typed. You can use control-H and @ to change commands while you are typing them, and undo to reverse the effect of the commands after you have typed them and pressed RETURN.

To illustrate, let’s issue an undo command. Recall that the last buffer-changing command we gave deleted the lines formerly numbered 2 and 3. Typing undo at this moment will reverse the effects of the deletion, causing those two lines to be put back in the buffer.

    :u
    2 more lines in file after undo
    And this is some more text.

Here again, edit informs you if the command affects more than one line, and prints the text of the line which is now “dot” (the current line).

More about the dot (.) and buffer end ($)

The function assumed by the symbol dot depends on its context. It can be used:

1. to exit from append mode; we type dot (and only a dot) on a line and press RETURN;
2. to refer to the line we are at in the buffer.

Dot can also be combined with the equal sign to get the number of the line currently being edited:

    :.=

If we type “.=” we are asking for the number of the line, and if we type “.” we are asking for the text of the line.

In this editing session and the last, we used the dollar sign to indicate the end of the buffer in commands such as print, copy, and move. The dollar sign as a command asks edit to print the last line in the buffer.
If the dollar sign is combined with the equal sign ($=) edit will print the line number corresponding to the last line in the buffer.

“.” and “$”, then, represent line numbers. Whenever appropriate, these symbols can be used in place of line numbers in commands. For example

    :.,$d

instructs edit to delete all lines from the current line (.) to the end of the buffer.

Moving around in the buffer (+ and -)

When you are editing you often want to go back and re-read a previous line. You could specify a context search for a line you want to read if you remember some of its text, but if you simply want to see what was written a few, say 3, lines ago, you can type

    -3p

This tells edit to move back to a position 3 lines before the current line (.) and print that line. You can move forward in the buffer similarly:

    +2p

instructs edit to print the line that is 2 ahead of your current position.

You may use “+” and “-” in any command where edit accepts line numbers. Line numbers specified with “+” or “-” can be combined to print a range of lines. The command

    :-1,+2copy$

makes a copy of 4 lines: the current line, the line before it, and the two after it. The copied lines will be placed after the last line in the buffer ($), and the original lines referred to by “-1” and “+2” remain where they are.

Try typing only “-”; you will move back one line just as if you had typed “-1p”. Typing the command “+” works similarly. You might also try typing a few plus or minus signs in a row (such as “+++”) to see edit’s response. Typing RETURN alone on a line is the equivalent of typing “+1p”; it will move you one line ahead in the buffer and print that line.
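The way “.”, “$”, “+” and “-” stand in for line numbers can be sketched in a few lines of Python. This is only a simplified model: the function name resolve and the pattern ADDRESS are hypothetical, only one “+” or “-” offset is handled, and rejecting addresses outside the buffer with an exception is an assumption of the sketch.

```python
import re

# A base ('.', '$' or a number) followed by an optional single +n / -n offset.
ADDRESS = re.compile(r"(\.|\$|\d+)?([+-]\d*)?")


def resolve(addr, dot, last):
    """Turn an address such as '.', '$', '7', '+2', '-' or '$-1' into a line number."""
    m = ADDRESS.fullmatch(addr)
    if m is None:
        raise ValueError(f"unsupported address: {addr!r}")
    base, offset = m.groups()
    if base is None or base == ".":
        line = dot                       # no base means "relative to the current line"
    elif base == "$":
        line = last
    else:
        line = int(base)
    if offset:
        step = int(offset[1:] or "1")    # a bare '+' or '-' counts as 1
        line += step if offset[0] == "+" else -step
    if not 1 <= line <= last:            # assumption: out-of-range addresses are errors
        raise ValueError(f"address {addr!r} is outside lines 1..{last}")
    return line


dot, last = 3, 6
first = resolve("-1", dot, last)
end = resolve("+2", dot, last)
print(first, end, end - first + 1)
```

With the current line at 3 in a 6-line buffer, “-1,+2” resolves to lines 2 through 5, which is the 4-line range described for the copy example.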
If you are at the last line of the buffer and try to move further ahead, perhaps by typing a “+” or a carriage return alone on the line, edit will remind you that you are at the end of the buffer:

    At end-of-file

or

    Not that many lines in buffer

Similarly, if you try to move to a position before the first line, edit will print one of these messages:

    Nonzero address required on this command

or

    Negative address - first buffer line is 1

The number associated with a buffer line is the line’s “address”, in that it can be used to locate the line.

Changing lines (c)

You can also delete certain lines and insert new text in their place. This can be accomplished easily with the change (c) command. The change command instructs edit to delete specified lines and then switch to text input mode to accept the text that will replace them. Let’s say you want to change the first two lines in the buffer:

    This is some sample text.
    And this is some more text.

to read

    This text was created with the UNIX text editor.

To do so, you type:

    :1,2c
    2 lines changed
    This text was created with the UNIX text editor.
    .

In the command 1,2c we specify that we want to change the range of lines beginning with 1 and ending with 2 by giving line numbers as with the print command. These lines will be deleted. After you type RETURN to end the change command, edit notifies you if more than one line will be changed and places you in text input mode. Any text typed on the following lines will be inserted into the position where lines were deleted by the change command. You will remain in text input mode until you exit in the usual way, by typing a period alone on a line. Note that the number of lines added to the buffer need not be the same as the number of lines deleted.

This is the end of the third session on text editing with UNIX.
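To review the move and change commands from this session, here is a minimal sketch that treats the buffer as a Python list. The names move_lines and change_lines are hypothetical, and refusing a destination that lies inside the range being moved is an assumption of this model, not something the tutorial states.

```python
def move_lines(buf, first, last, dest):
    """Model `first,last m dest`: lines first..last end up after line dest."""
    if first <= dest < last:
        # Assumption: a destination inside the moved range is treated as an error.
        raise ValueError("destination lies inside the range being moved")
    block = buf[first - 1:last]          # addresses are 1-based and inclusive
    rest = buf[:first - 1] + buf[last:]
    if dest >= last:                     # lines after the block shift up
        dest -= len(block)
    return rest[:dest] + block + rest[dest:]


def change_lines(buf, first, last, new_text):
    """Model `first,last c`: drop lines first..last and put new_text in their place."""
    return buf[:first - 1] + list(new_text) + buf[last:]


original = [
    "This is some sample text.",
    "And this is some more text.",
    "Text editing is nice.",
    "This is text added in Session 2.",
    "It doesn't mean much here, but",
    "it does illustrate the editor.",
]
moved = move_lines(original, 5, len(original), 1)        # :5,$m1
restored = move_lines(moved, 4, len(moved), 1)           # :4,$m1
changed = change_lines(original, 1, 2,
                       ["This text was created with the UNIX text editor."])  # :1,2c
print(restored == original, len(changed))
```

The change example replaces two lines with one, which illustrates that the number of lines typed in need not match the number deleted.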
Session 4

This lesson covers several topics, starting with commands that apply throughout the buffer, characters with special meanings, and how to issue UNIX commands while in the editor. The next topics deal with files: more on reading and writing, and methods of recovering files lost in a crash. The final section suggests sources of further information.

Making commands global (g)

One disadvantage to the commands we have used for searching or substituting is that if you have a number of instances of a word to change it appears that you have to type the command repeatedly, once for each time the change needs to be made. Edit, however, provides a way to make commands apply to the entire contents of the buffer: the global (g) command.

To print all lines containing a certain sequence of characters (say, “text”) the command is:

    :g/text/p

The “g” instructs edit to make a global search for all lines in the buffer containing the characters “text”. The “p” prints the lines found.

To issue a global command, start by typing a “g” and then a search pattern identifying the lines to be affected. Then, on the same line, type the command to be executed for the identified lines. Global substitutions are frequently useful. For example, to change all instances of the word “text” to the word “material” the command would be a combination of the global search and the substitute command:

    :g/text/s/text/material/g

Note the “g” at the end of the global command, which instructs edit to change each and every instance of “text” to “material”. If you do not type the “g” at the end of the command only the first instance of “text” in each line will be changed (the normal result of the substitute command). The “g” at the end of the command is independent of the “g” at the beginning. You may give a command such as:

    :5s/text/material/g

to change every instance of “text” in line 5 alone.
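The difference between the “g” at the beginning and the “g” at the end can be sketched in Python. This is an illustrative model only: the names substitute and global_cmd are hypothetical, and the search pattern is treated as plain literal text.

```python
def substitute(line, old, new, every=False):
    """Model s/old/new/ (first instance only) or s/old/new/g (every instance)."""
    return line.replace(old, new) if every else line.replace(old, new, 1)


def global_cmd(buf, pattern, command):
    """Model g/pattern/command: run command on each line that contains pattern."""
    return [command(line) if pattern in line else line for line in buf]


lines = [
    "This text is sample text.",
    "Nothing to change here.",
]

# :g/text/s/text/material/g   (every instance on every matching line)
all_changed = global_cmd(lines, "text",
                         lambda l: substitute(l, "text", "material", every=True))

# :g/text/s/text/material/    (only the first instance on each matching line)
first_only = global_cmd(lines, "text",
                        lambda l: substitute(l, "text", "material"))

print(all_changed[0])
print(first_only[0])
```

The leading “g” chooses which lines are visited; the trailing “g” decides how many instances are changed within each of those lines.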
Further, neither command will change the word if it is spelled “Text”, with a capital rather than a lower-case t.

Edit does not automatically print the lines modified by a global command. If you want the lines to be printed, type a “p” at the end of the global command:

    :g/text/s/text/material/gp

You should be careful about using the global command in combination with any other; in essence, be sure of what you are telling edit to do to the entire buffer. For example,

    :g/ /d
    72 less lines in file after global

will delete every line containing a blank anywhere in it. This could adversely affect your document, since most lines have spaces between words and thus would be deleted. After executing the global command, edit will print a warning if the command added or deleted more than one line. Fortunately, the undo command can reverse the effects of a global command. You should experiment with the global command on a small file of text to see what it can do for you.

More about searching and substituting

In using slashes to identify a character string that we want to search for or change, we have always specified the exact characters. There is a less tedious way to repeat the same string of characters. To change “text” to “texts” we may type either

    :/text/s/text/texts/

as we have done in the past, or a somewhat abbreviated command:

    :/text/s//texts/

In this example, the characters to be changed are not specified: there are no characters, not even a space, between the two slash marks that indicate what is to be changed. This lack of characters between the slashes is taken by the editor to mean “use the characters we last searched for as the characters to be changed.”

Similarly, the last context search may be repeated by typing a pair of slashes with nothing between them:

    :/does/
    It doesn’t mean much here, but
    ://
    it does illustrate the editor.
(You should note that the search command found the characters “does” in the word “doesn’t” in the first search request.) Because no characters are specified for the second search, the editor scans the buffer for the next occurrence of the characters “does”.

Edit normally searches forward through the buffer, wrapping around from the end of the buffer to the beginning, until the specified character string is found. If you want to search in the reverse direction, use question marks (?) instead of slashes to surround the characters you are searching for.

It is also possible to repeat the last substitution without having to retype the entire command. An ampersand (&) used as a command repeats the most recent substitute command, using the same search and replacement patterns. After altering the current line by typing

    :s/text/texts/

you type

    :/text/&

or simply

    ://&

to make the same change on the next line in the buffer containing the characters “text”.

Special characters

Two characters have special meanings when used in specifying searches: “$” and “^”. “$” is taken by the editor to mean “end of the line” and is used to identify strings that occur at the end of a line.

    :g/text.$/s//material./p

tells the editor to search for all lines ending in “text.” (and nothing else, not even a blank space), to change each final “text.” to “material.”, and print the changed lines.

The symbol “^” indicates the beginning of a line. Thus,

    :s/^/1. /

instructs the editor to insert “1.” and a space at the beginning of the current line.

The characters “$” and “^” have special meanings only in the context of searching. At other times, they are ordinary characters. If you ever need to search for a character that has a special meaning, you must indicate that the character is to lose temporarily its special significance by typing another special character, the backslash (\), before it.
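The three special characters can be imitated with Python’s re module, as a sketch only. Python’s patterns have many more special characters than edit’s, so the hypothetical helper edit_pattern escapes everything except the cases described above. Treating “^” as special only at the start of the pattern and “$” only at the end is an assumption of this model.

```python
import re


def edit_pattern(pat):
    """Translate a pattern using only ^, $ and backslash into a Python regex."""
    out = []
    i = 0
    while i < len(pat):
        ch = pat[i]
        if ch == "\\" and i + 1 < len(pat):
            out.append(re.escape(pat[i + 1]))    # escaped character is ordinary
            i += 2
            continue
        if ch == "^" and i == 0:
            out.append("^")                      # beginning of the line
        elif ch == "$" and i == len(pat) - 1:
            out.append("$")                      # end of the line
        else:
            out.append(re.escape(ch))            # everything else is literal
        i += 1
    return re.compile("".join(out))


# s/text.$/material./  changes "text." only at the end of the line
ending = edit_pattern("text.$").sub("material.", "This is sample text.", count=1)

# s/^/1. /  inserts at the beginning of the line
numbered = edit_pattern("^").sub("1. ", "first item", count=1)

# s/\$/dollar/  finds a literal dollar sign
literal = edit_pattern("\\$").sub("dollar", "costs $5", count=1)

print(ending)
print(numbered)
print(literal)
```

Without the backslash, the last pattern would be a lone “$”, which this model reads as “end of the line” rather than as the dollar character.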
The command

    :s/\$/dollar/

looks for the character “$” in the current line and replaces it by the word “dollar”. Were it not for the backslash, the “$” would have represented “the end of the line” in your search rather than the character “$”. The backslash retains its special significance unless it is preceded by another backslash.

Issuing UNIX commands from the editor

After creating several files with the editor, you may want to delete files no longer useful to you or ask for a list of your files. Removing and listing files are not functions of the editor, and so they require the use of UNIX system commands (also referred to as “shell” commands, as “shell” is the name of the program that processes UNIX commands). You do not need to quit the editor to execute a UNIX command as long as you indicate that it is to be sent to the shell for execution. To use the UNIX command rm to remove the file named “junk” type:

    :!rm junk
    !

The exclamation mark (!) indicates that the rest of the line is to be processed as a shell command. If the buffer contents have not been written since the last change, a warning will be printed before the command is executed:

    [No write since last change]

The editor prints a “!” when the command is completed. The tutorial “Communicating with UNIX” describes useful features of the system, of which the editor is only one part.

Filenames and file manipulation

Throughout each editing session, edit keeps track of the name of the file being edited as the current filename. Edit remembers as the current filename the name given when you entered the editor. The current filename changes whenever the edit (e) command is used to specify a new file. Once edit has recorded a current filename, it inserts that name into any command where a filename has been omitted. If a write command does not specify a file, edit, as we have seen, supplies the current filename.
If you are editing a file named “draft3” having 283 lines in it, you can have the editor write onto a different file by including its name in the write command:

    :w chapter3
    "chapter3" [new file] 283 lines, 8698 characters

The current filename remembered by the editor will not be changed as a result of the write command. Thus, if the next write command does not specify a name, edit will write onto the current file (“draft3”) and not onto the file “chapter3”.

The file (f) command

To ask for the current filename, type file (or f). In response, the editor provides current information about the buffer, including the filename, your current position, the number of lines in the buffer, and how far through the file your current location is, as a percentage. If the contents of the buffer have changed since the last time the file was written, the editor will tell you that the file has been “[Modified]”. After you save the changes by writing onto a disk file, the buffer will no longer be considered modified:

    :w
    "text" 4 lines, 88 characters
    :f
    "text" line 3 of 4 --75%--

Reading additional files (r)

The read (r) command allows you to add the contents of a file to the buffer at a specified location, essentially copying new lines between two existing lines. To use it, specify the line after which the new text will be placed, the read (r) command, and then the name of the file. If you have a file named “example”, the command

    :$r example
    "example" 18 lines, 473 characters

reads the file “example” and adds it to the buffer after the last line. The current filename is not changed by the read command.

Writing parts of the buffer

The write (w) command can write all or part of the buffer to a file you specify. We are already familiar with writing the entire contents of the buffer to a disk file.
To write only part of the buffer onto a file, indicate the beginning and ending lines before the write command, for example

    :45,$w ending

Here all lines from 45 through the end of the buffer are written onto the file named ending. The lines remain in the buffer as part of the document you are editing, and you may continue to edit the entire buffer. Your original file is unaffected by your command to write part of the buffer to another file. Edit still remembers whether you have saved changes to the buffer in your original file or not.

Recovering files

Although it does not happen very often, there are times UNIX stops working because of some malfunction. This situation is known as a crash. Under most circumstances, edit’s crash recovery feature is able to save work to within a few lines of changes before a crash (or an accidental phone hang up). If you lose the contents of an editing buffer in a system crash, you will normally receive mail when you log in that gives the name of the recovered file. To recover the file, enter the editor and type the command recover (rec), followed by the name of the lost file. For example, to recover the buffer for an edit session involving the file “chap6”, the command is:

    :recover chap6

Recover is sometimes unable to save the entire buffer successfully, so always check the contents of the saved buffer carefully before writing it back onto the original file. For best results, write the buffer to a new file temporarily so you can examine it without risk to the original file. Unfortunately, you cannot use the recover command to retrieve a file you removed using the shell command rm.

Other recovery techniques

If something goes wrong when you are using the editor, it may be possible to save your work by using the command preserve (pre), which saves the buffer as if the system had crashed. If you are writing a file and you get the message “Quota exceeded”, you have tried to use more disk storage than is allotted to your account.
Proceed with caution because it is likely that only a part of the editor’s buffer is now present in the file you tried to write. In this case you should use the shell escape from the editor (!) to remove some files you don’t need and try to write the file again. If this is not possible and you cannot find someone to help you, enter the command

    :preserve

and wait for the reply,

    File preserved.

If you do not receive this reply, seek help immediately. Do not simply leave the editor. If you do, the buffer will be lost, and you may not be able to save your file. If the reply is “File preserved.” you can leave the editor (or log out) to remedy the situation. After a preserve, you can use the recover command once the problem has been corrected, or the -r option of the edit command if you leave the editor and want to return.

If you make an undesirable change to the buffer and type a write command before discovering your mistake, the modified version will replace any previous version of the file. Should you ever lose a good version of a document in this way, do not panic and leave the editor. As long as you stay in the editor, the contents of the buffer remain accessible. Depending on the nature of the problem, it may be possible to restore the buffer to a more complete state with the undo command. After fixing the damaged buffer, you can again write the file to disk.

Further reading and other information

Edit is an editor designed for beginning and casual users. It is actually a version of a more powerful editor called ex. These lessons are intended to introduce you to the editor and its more commonly-used commands. We have not covered all of the editor’s commands, but a selection of commands that should be sufficient to accomplish most of your editing tasks. You can find out more about the editor in the Ex Reference Manual, which is applicable to both ex and edit. The manual is available from the Computing Services Library, 218 Evans Hall.
One way to become familiar with the manual is to begin by reading the description of commands that you already know.

Using ex

As you become more experienced with using the editor, you may still find that edit continues to meet your needs. However, should you become interested in using ex, it is easy to switch. To begin an editing session with ex, use the name ex in your command instead of edit.

Edit commands work the same way in ex, but the editing environment is somewhat different. You should be aware of a few differences that exist between the two versions of the editor. In edit, only the characters “^”, “$”, and “\” have special meanings in searching the buffer or indicating characters to be changed by a substitute command. Several additional characters have special meanings in ex, as described in the Ex Reference Manual. Another feature of the edit environment prevents users from accidentally entering two alternative modes of editing, open and visual, in which the editor behaves quite differently from normal command mode. If you are using ex and the editor behaves strangely, you may have accidentally entered open mode by typing “o”. Type the ESC key and then a “Q” to get out of open or visual mode and back into the regular editor command mode. The document An Introduction to Display Editing with Vi provides a full discussion of visual mode.

A Tutorial Introduction to the UNIX Text Editor

Brian W. Kernighan
Bell Laboratories
Murray Hill, New Jersey 07974

Introduction

Ed is a “text editor”, that is, an interactive program for creating and modifying “text”, using directions provided by a user at a terminal. The text is often a document like this one, or a program or perhaps data for a program.

This introduction is meant to simplify learning ed.
The recommended way to learn ed is to read this document, simultaneously using ed to follow the examples, then to read the description in section I of the UNIX Programmer’s Manual, all the while experimenting with ed. (Solicitation of advice from experienced users is also useful.)

Do the exercises! They cover material not completely discussed in the actual text. An appendix summarizes the commands.

Disclaimer

This is an introduction and a tutorial. For this reason, no attempt is made to cover more than a part of the facilities that ed offers (although this fraction includes the most useful and frequently used parts). When you have mastered the Tutorial, try Advanced Editing on UNIX. Also, there is not enough space to explain basic UNIX procedures. We will assume that you know how to log on to UNIX, and that you have at least a vague understanding of what a file is. For more on that, read UNIX for Beginners.

You must also know what character to type as the end-of-line on your particular terminal. This character is the RETURN key on most terminals. Throughout, we will refer to this character, whatever it is, as RETURN.

Getting Started

We’ll assume that you have logged in to your system and it has just printed the prompt character, usually either a $ or a %. The easiest way to get ed is to type

    ed          (followed by a return)

You are now ready to go; ed is waiting for you to tell it what to do.

Creating Text - the Append command “a”

As your first problem, suppose you want to create some text starting from scratch. Perhaps you are typing the very first draft of a paper; clearly it will have to start somewhere, and undergo modifications later. This section will show how to get some text in, just to get started. Later we’ll talk about how to change it.

When ed is first started, it is rather like working with a blank piece of paper: there is no text or information present. This must be supplied by the person using ed; it is usually done by typing in the text, or by reading it into ed from a file. We will start by typing in some text, and return shortly to how to read files.

First a bit of terminology. In ed jargon, the text being worked on is said to be “kept in a buffer.” Think of the buffer as a work space, if you like, or simply as the information that you are going to be editing. In effect the buffer is like the piece of paper, on which we will write things, then change some of them, and finally file the whole thing away for another day.

The user tells ed what to do to his text by typing instructions called “commands.” Most commands consist of a single letter, which must be typed in lower case. Each command is typed on a separate line. (Sometimes the command is preceded by information about what line or lines of text are to be affected; we will discuss these shortly.) Ed makes no response to most commands: there is no prompting or typing of messages like “ready”. (This silence is preferred by experienced users, but sometimes a hangup for beginners.)

The first command is append, written as the letter

    a

all by itself. It means “append (or add) text lines to the buffer, as I type them in.” Appending is rather like writing fresh material on a piece of paper.

So to enter lines of text into the buffer, just type an a followed by a RETURN, followed by the lines of text you want, like this:

    a
    Now is the time
    for all good men
    to come to the aid of their party.
    .

The only way to stop appending is to type a line that contains only a period. The “.” is used to tell ed that you have finished appending. (Even experienced users forget that terminating “.” sometimes. If ed seems to be ignoring you, type an extra line with just “.” on it. You may then find you’ve added some garbage lines to your text, which you’ll have to take out later.)

After the append command has been done, the buffer will contain the three lines

    Now is the time
    for all good men
    to come to the aid of their party.

The “a” and “.” aren’t there, because they are not text.

To add more text to what you already have, just issue another a command, and continue typing.

Error Messages - “?”

If at any time you make an error in the commands you type to ed, it will tell you by typing

    ?

This is about as cryptic as it can be, but with practice, you can usually figure out how you goofed.

Writing text out as a file - the Write command “w”

It’s likely that you’ll want to save your text for later use. To write out the contents of the buffer onto a file, use the write command

    w

followed by the filename you want to write on. This will copy the buffer’s contents onto the specified file (destroying any previous information on the file). To save the text on a file named junk, for example, type

    w junk

Leave a space between w and the file name. Ed will respond by printing the number of characters it wrote out. In this case, ed would respond with

    68

(Remember that blanks and the return character at the end of each line are included in the character count.) Writing a file just makes a copy of the text; the buffer’s contents are not disturbed, so you can go on adding lines to it. This is an important point. Ed at all times works on a copy of a file, not the file itself. No change in the contents of a file takes place until you give a w command. (Writing out the text onto a file from time to time as it is being created is a good idea, since if the system crashes or if you make some horrible mistake, you will lose all the text in the buffer but any text that was written onto a file is relatively safe.)

Leaving ed - the Quit command “q”

To terminate a session with ed, save the text you’re working on by writing it onto a file using the w command, and then type the command

    q

which stands for quit. The system will respond with the prompt character ($ or %). At this point your buffer vanishes, with all its text, which is why you want to write it out before quitting.†

† Actually, ed will print ? if you try to quit without writing. At that point, write if you want; if not, another q will get you out regardless.

Exercise 1:

Enter ed and create some text using

    a
    ... text ...
    .

Write it out using w. Then leave ed with the q command, and print the file, to see that everything worked. (To print a file, say

    pr filename

or

    cat filename

in response to the prompt character. Try both.)

Reading text from a file - the Edit command “e”

A common way to get text into the buffer is to read it from a file in the file system. This is what you do to edit text that you saved with the w command in a previous session. The edit command e fetches the entire contents of a file into the buffer. So if you had saved the three lines “Now is the time”, etc., with a w command in an earlier session, the ed command

    e junk

would fetch the entire contents of the file junk into the buffer, and respond

    68

which is the number of characters in junk. If anything was already in the buffer, it is deleted first.

If you use the e command to read a file into the buffer, then you need not use a file name after a subsequent w command; ed remembers the last file name used in an e command, and w will write on this file. Thus a good way to operate is

    ed
    e file
    [editing session]
    w
    q

This way, you can simply say w from time to time, and be secure in the knowledge that if you got the file name right at the beginning, you are writing into the proper file each time.

You can find out at any time what file name ed is remembering by typing the file command f. In this example, if you typed

    f

ed would reply

    junk

Reading text from a file - the Read command “r”

Sometimes you want to read a file into the buffer without destroying anything that is already there. This is done by the read command r. The command

    r junk

will read the file junk into the buffer; it adds it to the end of whatever is already in the buffer. So if you do a read after an edit:

    e junk
    r junk

the buffer will contain two copies of the text (six lines).

    Now is the time
    for all good men
    to come to the aid of their party.
    Now is the time
    for all good men
    to come to the aid of their party.

Like the w and e commands, r prints the number of characters read in, after the reading operation is complete.

Generally speaking, r is much less used than e.

Exercise 2:

Experiment with the e command: try reading and printing various files. You may get an error ?name, where name is the name of a file; this means that the file doesn’t exist, typically because you spelled the file name wrong, or perhaps that you are not allowed to read or write it. Try alternately reading and appending to see that they work similarly. Verify that

    ed filename

is exactly equivalent to

    ed
    e filename

What does

    f filename

do?

Printing the contents of the buffer - the Print command “p”

To print or list the contents of the buffer (or parts of it) on the terminal, use the print command

    p

The way this is done is as follows. Specify the lines where you want printing to begin and where you want it to end, separated by a comma, and followed by the letter p. Thus to print the first two lines of the buffer, for example, (that is, lines 1 through 2) say

    1,2p        (starting line=1, ending line=2 p)

Ed will respond with

    Now is the time
    for all good men

Suppose you want to print all the lines in the buffer. You could use 1,3p as above if you knew there were exactly 3 lines in the buffer. But in general, you don’t know how many there are, so what do you use for the ending line number?
Ed provides a shorthand symbol for "line number of last line in buffer": the dollar sign $. Use it this way:

    1,$p

This will print all the lines in the buffer (line 1 to last line.) If you want to stop the printing before it is finished, push the DEL or Delete key; ed will type

    ?

and wait for the next command.

To print the last line of the buffer, you could use

    $,$p

but ed lets you abbreviate this to

    $p

You can print any single line by typing the line number followed by a p. Thus

    1p

produces the response

    Now is the time

which is the first line of the buffer.

In fact, ed lets you abbreviate even further: you can print any single line by typing just the line number, with no need to type the letter p. So if you say

    $

ed will print the last line of the buffer.

You can also use $ in combinations like

    $-1,$p

which prints the last two lines of the buffer. This helps when you want to see how far you got in typing.

Exercise 3: As before, create some text using the a command and experiment with the p command. You will find, for example, that you can't print line 0 or a line beyond the end of the buffer, and that attempts to print a buffer in reverse order by saying

    3,1p

don't work.

The current line - "Dot" or "."

Suppose your buffer still contains the six lines as above, that you have just typed

    1,3p

and ed has printed the three lines for you. Try typing just

    p     (no line numbers)

This will print

    to come to the aid of their party.

which is the third line of the buffer. In fact it is the last (most recent) line that you have done anything with. (You just printed it!) You can repeat this p command without line numbers, and it will continue to print line 3.

The reason is that ed maintains a record of the last line that you did anything to (in this case, line 3, which you just printed) so that it can be used instead of an explicit line number. This most recent line is referred to by the shorthand symbol

    .     (pronounced "dot").

Dot is a line number in the same way that $ is; it means exactly "the current line", or loosely, "the line you most recently did something to." You can use it in several ways. One possibility is to say

    .,$p

This will print all the lines from (including) the current line to the end of the buffer. In our example these are lines 3 through 6.

Some commands change the value of dot, while others do not. The p command sets dot to the number of the last line printed; the last command will set both . and $ to 6.

Dot is most useful when used in combinations like this one:

    .+1     (or equivalently, .+1p)

This means "print the next line" and is a handy way to step slowly through a buffer. You can also say

    .-1     (or .-1p)

which means "print the line before the current line." This enables you to go backwards if you wish. Another useful one is something like

    .-3,.-1p

which prints the previous three lines.

Don't forget that all of these change the value of dot. You can find out what dot is at any time by typing

    .=

Ed will respond by printing the value of dot.

Let's summarize some things about the p command and dot. Essentially p can be preceded by 0, 1, or 2 line numbers. If there is no line number given, it prints the "current line", the line that dot refers to. If there is one line number given (with or without the letter p), it prints that line (and dot is set there); and if there are two line numbers, it prints all the lines in that range (and sets dot to the last line printed.) If two line numbers are specified the first can't be bigger than the second (see Exercise 3.)

Typing a single return will cause printing of the next line; it's equivalent to .+1p. Try it. Try typing a -; you will find that it's equivalent to .-1p.

Deleting lines: the "d" command

Suppose you want to get rid of the three extra lines in the buffer. This is done by the delete command

    d

Except that d deletes lines instead of printing them, its action is similar to that of p. The lines to be deleted are specified for d exactly as they are for p:

    starting line, ending line d

Thus the command

    4,$d

deletes lines 4 through the end. There are now three lines left, as you can check by using

    1,$p

And notice that $ now is line 3! Dot is set to the next line after the last line deleted, unless the last line deleted is the last line in the buffer. In that case, dot is set to $.

Exercise 4: Experiment with a, e, r, w, p and d until you are sure that you know what they do, and until you understand how dot, $, and line numbers are used.

If you are adventurous, try using line numbers with a, r and w as well. You will find that a will append lines after the line number that you specify (rather than after dot); that r reads a file in after the line number you specify (not necessarily at the end of the buffer); and that w will write out exactly the lines you specify, not necessarily the whole buffer. These variations are sometimes handy. For instance you can insert a file at the beginning of a buffer by saying

    0r filename

and you can enter lines at the beginning of the buffer by saying

    0a
    ... text ...
    .

Notice that .w is very different from

    .
    w

Modifying text: the Substitute command "s"

We are now ready to try one of the most important of all commands, the substitute command

    s

This is the command that is used to change individual words or letters within a line or group of lines. It is what you use, for example, for correcting spelling mistakes and typing errors.

Suppose that by a typing error, line 1 says

    Now is th time

that is, the e has been left off the. You can use s to fix this up as follows:

    1s/th/the/

This says: "in line 1, substitute for the characters th the characters the." To verify that it works (ed will not print the result automatically) say

    p

and get

    Now is the time

which is what you wanted. Notice that dot must have been set to the line where the substitution took place, since the p command printed that line. Dot is always set this way with the s command.

The general way to use the substitute command is

    starting-line, ending-line s/change this/to this/

Whatever string of characters is between the first pair of slashes is replaced by whatever is between the second pair, in all the lines between starting-line and ending-line. Only the first occurrence on each line is changed, however. If you want to change every occurrence, see Exercise 5. The rules for line numbers are the same as those for p, except that dot is set to the last line changed. (But there is a trap for the unwary: if no substitution took place, dot is not changed. This causes an error ? as a warning.)

Thus you can say

    1,$s/speling/spelling/

and correct the first spelling mistake on each line in the text. (This is useful for people who are consistent misspellers!)

If no line numbers are given, the s command assumes we mean "make the substitution on line dot", so it changes things only on the current line. This leads to the very common sequence

    s/something/something else/p

which makes some correction on the current line, and then prints it, to make sure it worked out right. If it didn't, you can try again. (Notice that there is a p on the same line as the s command. With few exceptions, p can follow any command; no other multi-command lines are legal.)

It's also legal to say

    s/...//

which means "change the first string of characters to nothing", i.e., remove them. This is useful for deleting extra words in a line or removing extra letters from words. For instance, if you had

    Nowxx is the time

you can say

    s/xx//p

to get

    Now is the time

Notice that // (two adjacent slashes) means "no characters", not a blank. There is a difference! (See below for another meaning of //.)

Exercise 5: Experiment with the substitute command. See what happens if you substitute for some word on a line with several occurrences of that word. For example, do this:

    a
    the other side of the coin
    .
    s/the/on the/p

You will get

    on the other side of the coin

A substitute command changes only the first occurrence of the first string. You can change all occurrences by adding a g (for "global") to the s command, like this:

    s/.../.../gp

Try other characters instead of slashes to delimit the two sets of characters in the s command; anything should work except blanks or tabs.

(If you get funny results using any of the characters

    ^ . $ [ * \ &

read the section on "Special Characters".)

Context searching - "/ ... /"

With the substitute command mastered, you can move on to another highly important idea of ed: context searching.

Suppose you have the original three line text in the buffer:

    Now is the time
    for all good men
    to come to the aid of their party.

Suppose you want to find the line that contains their so you can change it to the. Now with only three lines in the buffer, it's pretty easy to keep track of what line the word their is on. But if the buffer contained several hundred lines, and you'd been making changes, deleting and rearranging lines, and so on, you would no longer really know what this line number would be. Context searching is simply a method of specifying the desired line, regardless of what its number is, by specifying some context on it.

The way to say "search for a line that contains this particular string of characters" is to type

    /string of characters we want to find/

For example, the ed command

    /their/

is a context search which is sufficient to find the desired line: it will locate the next occurrence of the characters between slashes ("their"). It also sets dot to that line and prints the line for verification:

    to come to the aid of their party.

"Next occurrence" means that ed starts looking for the string at line .+1, searches to the end of the buffer, then continues at line 1 and searches to line dot. (That is, the search "wraps around" from $ to 1.) It scans all the lines in the buffer until it either finds the desired line or gets back to dot again. If the given string of characters can't be found in any line, ed types the error message

    ?

Otherwise it prints the line it found.

You can do both the search for the desired line and a substitution all at once, like this:

    /their/s/their/the/p

which will yield

    to come to the aid of the party.

There were three parts to that last command: context search for the desired line, make the substitution, print the line.

The expression /their/ is a context search expression. In their simplest form, all context search expressions are like this: a string of characters surrounded by slashes. Context searches are interchangeable with line numbers, so they can be used by themselves to find and print a desired line, or as line numbers for some other command, like s. They were used both ways in the examples above.

Suppose the buffer contains the three familiar lines

    Now is the time
    for all good men
    to come to the aid of their party.

Then the ed line numbers

    /Now/+1
    /good/
    /party/-1

are all context search expressions, and they all refer to the same line (line 2). To make a change in line 2, you could say

    /Now/+1s/good/bad/

or

    /good/s/good/bad/

or

    /party/-1s/good/bad/

The choice is dictated only by convenience. You could print all three lines by, for instance

    /Now/,/party/p

or

    /Now/,/Now/+2p

or by any number of similar combinations. The first one of these might be better if you don't know how many lines are involved. (Of course, if there were only three lines in the buffer, you'd use

    1,$p

but not if there were several hundred.)

The basic rule is: a context search expression is the same as a line number, so it can be used wherever a line number is needed.

Exercise 6: Experiment with context searching. Try a body of text with several occurrences of the same string of characters, and scan through it using the same context search.

Try using context searches as line numbers for the substitute, print and delete commands. (They can also be used with r, w, and a.)

Try context searching using ?text? instead of /text/. This scans lines in the buffer in reverse order rather than normal. This is sometimes useful if you go too far while looking for some string of characters; it's an easy way to back up.

(If you get funny results with any of the characters

    ^ . $ [ * \ &

read the section on "Special Characters".)

Ed provides a shorthand for repeating a context search for the same string. For example, the ed line number

    /string/

will find the next occurrence of string. It often happens that this is not the desired line, so the search must be repeated. This can be done by typing merely

    //

This shorthand stands for "the most recently used context search expression." It can also be used as the first string of the substitute command, as in

    /string1/s//string2/

which will find the next occurrence of string1 and replace it by string2. This can save a lot of typing. Similarly

    ??

means "scan backwards for the same expression."

Change and Insert - "c" and "i"

This section discusses the change command

    c

which is used to change or replace a group of one or more lines, and the insert command

    i

which is used for inserting a group of one or more lines.

"Change", written as

    c

is used to replace a number of lines with different lines, which are typed in at the terminal. For example, to change lines .+1 through $ to something else, type

    .+1,$c
    ... type the lines of text you want here ...
    .

The lines you type between the c command and the . will take the place of the original lines between start line and end line. This is most useful in replacing a line or several lines which have errors in them.

If only one line is specified in the c command, then just that line is replaced. (You can type in as many replacement lines as you like.) Notice the use of . to end the input; this works just like the . in the append command and must appear by itself on a new line. If no line number is given, line dot is replaced. The value of dot is set to the last line you typed in.

"Insert" is similar to append. For instance

    /string/i
    ... type the lines to be inserted here ...
    .

will insert the given text before the next line that contains "string". The text between i and . is inserted before the specified line. If no line number is specified dot is used. Dot is set to the last line inserted.

Exercise 7: "Change" is rather like a combination of delete followed by insert. Experiment to verify that

    start, end d
    i
    ... text ...
    .

is almost the same as

    start, end c
    ... text ...
    .

These are not precisely the same if line $ gets deleted. Check this out. What is dot?

Experiment with a and i, to see that they are similar, but not the same. You will observe that

    line-number a
    ... text ...
    .

appends after the given line, while

    line-number i
    ... text ...
    .

inserts before it. Observe that if no line number is given, i inserts before line dot, while a appends after line dot.

Moving text around: the "m" command

The move command m is used for cutting and pasting: it lets you move a group of lines from one place to another in the buffer. Suppose you want to put the first three lines of the buffer at the end instead. You could do it by saying:

    1,3w temp
    $r temp
    1,3d

(Do you see why?) but you can do it a lot easier with the m command:

    1,3m$

The general case is

    start line, end line m after this line

Notice that there is a third line to be specified: the place where the moved stuff gets put. Of course the lines to be moved can be specified by context searches; if you had

    First paragraph
    ...
    end of first paragraph.
    Second paragraph
    ...
    end of second paragraph.

you could reverse the two paragraphs like this:

    /Second/,/end of second/m/First/-1

Notice the -1: the moved text goes after the line mentioned. Dot gets set to the last line moved.

The global commands "g" and "v"

The global command g is used to execute one or more ed commands on all those lines in the buffer that match some specified string. For example

    g/peling/p

prints all lines that contain peling. More usefully,

    g/peling/s//pelling/gp

makes the substitution everywhere on the line, then prints each corrected line. Compare this to

    1,$s/peling/pelling/gp

which only prints the last line substituted. Another subtle difference is that the g command does not give a ? if peling is not found where the s command will.

There may be several commands (including a, c, i, r, w, but not g); in that case, every line except the last must end with a backslash \:

    g/xxx/.-1s/abc/def/\
    .+2s/ghi/jkl/\
    .-2,.p

makes changes in the lines before and after each line that contains xxx, then prints all three lines.

The v command is the same as g, except that the commands are executed on every line that does not match the string following v:

    v/ /d

deletes every line that does not contain a blank.

Special Characters

You may have noticed that things just don't work right when you used some characters like ., *, $, and others in context searches and the substitute command. The reason is rather complex, although the cure is simple. Basically, ed treats these characters as special, with special meanings. For instance, in a context search or the first string of the substitute command only, . means "any character," not a period, so

    /x.y/

means "a line with an x, any character, and a y," not just "a line with an x, a period, and a y." A complete list of the special characters that can cause trouble is the following:

    ^ . $ [ * \

Warning: The backslash character \ is special to ed. For safety's sake, avoid it where possible. If you have to use one of the special characters in a substitute command, you can turn off its magic meaning temporarily by preceding it with the backslash. Thus

    s/\\\.\*/backslash dot star/

will change \.* into "backslash dot star".

Here is a hurried synopsis of the other special characters. First, the circumflex ^ signifies the beginning of a line. Thus

    /^string/

finds string only if it is at the beginning of a line: it will find

    string

but not

    the string...

The dollar-sign $ is just the opposite of the circumflex; it means the end of a line:

    /string$/

will only find an occurrence of string that is at the end of some line. This implies, of course, that

    /^string$/

will find only a line that contains just string, and

    /^.$/

finds a line containing exactly one character.

The character ., as we mentioned above, matches anything;

    /x.y/

matches any of

    x+y
    x-y
    x y
    x.y

This is useful in conjunction with *, which is a repetition character; a* is a shorthand for "any number of a's," so .* matches any number of anythings. This is used like this:

    s/.*/stuff/

which changes an entire line, or

    s/.*,//

which deletes all characters in the line up to and including the last comma. (Since .* finds the longest possible match, this goes up to the last comma.)

[ is used with ] to form "character classes"; for example,

    /[0123456789]/

matches any single digit: any one of the characters inside the brackets will cause a match. This can be abbreviated to [0-9].

Finally, the & is another shorthand character. It is used only on the right-hand part of a substitute command, where it means "whatever was matched on the left-hand side". It is used to save typing. Suppose the current line contained

    Now is the time

and you wanted to put parentheses around it. You could just retype the line, but this is tedious. Or you could say

    s/^/(/
    s/$/)/

using your knowledge of ^ and $. But the easiest way uses the &:

    s/.*/(&)/

This says "match the whole line, and replace it by itself surrounded by parentheses." The & can be used several times in a line; consider using

    s/.*/&? &!!/

to produce

    Now is the time? Now is the time!!

You don't have to match the whole line, of course: if the buffer contains

    the end of the world

you could type

    /world/s//& is at hand/

to produce

    the end of the world is at hand

Observe this expression carefully, for it illustrates how to take advantage of ed to save typing. The string /world/ found the desired line; the shorthand // found the same word in the line; and the & saves you from typing it again.

The & is a special character only within the replacement text of a substitute command, and has no special meaning elsewhere. You can turn off the special meaning of & by preceding it with a \:

    s/ampersand/\&/

will convert the word "ampersand" into the literal symbol & in the current line.

Summary of Commands and Line Numbers

The general form of ed commands is the command name, perhaps preceded by one or two line numbers, and, in the case of e, r, and w, followed by a file name. Only one command is allowed per line, but a p command may follow any other command (except for e, r, w and q).

a: Append, that is, add lines to the buffer (at line dot, unless a different line is specified). Appending continues until . is typed on a new line. Dot is set to the last line appended.

c: Change the specified lines to the new text which follows. The new lines are terminated by a ., as with a. If no lines are specified, replace line dot. Dot is set to last line changed.

d: Delete the lines specified. If none are specified, delete line dot. Dot is set to the first undeleted line, unless $ is deleted, in which case dot is set to $.

e: Edit new file. Any previous contents of the buffer are thrown away, so issue a w beforehand.

f: Print remembered filename. If a name follows f the remembered name will be set to it.

g: The command

    g/---/commands

will execute the commands on those lines that contain ---, which can be any context search expression.

i: Insert lines before specified line (or dot) until a . is typed on a new line. Dot is set to last line inserted.

m: Move lines specified to after the line named after m. Dot is set to the last line moved.

p: Print specified lines. If none specified, print line dot. A single line number is equivalent to line-number p. A single return prints .+1, the next line.

q: Quit ed. Wipes out all text in buffer if you give it twice in a row without first giving a w command.

r: Read a file into buffer (at end unless specified elsewhere.) Dot set to last line read.

s: The command

    s/string1/string2/

replaces the characters string1 by string2 in the specified lines. If no lines are specified, make the substitution in line dot. Dot is set to last line in which a substitution took place, which means that if no substitution took place, dot is not changed. s changes only the first occurrence of string1 on a line; to change all of them, type a g after the final slash.

v: The command

    v/---/commands

executes commands on those lines that do not contain ---.

w: Write out buffer onto a file. Dot is not changed.

.=: Print value of dot. (= by itself prints the value of $.)

!: The line

    !command-line

causes command-line to be executed as a UNIX command.

/-----/: Context search. Search for next line which contains this string of characters. Print it. Dot is set to the line where string was found.
Search starts at .+1, wraps around from $ to 1, and continues to dot, if necessary.

?-----?: Context search in reverse direction. Start search at .-1, scan to 1, wrap around to $.

Advanced Editing on UNIX

Brian W. Kernighan
Bell Laboratories
Murray Hill, New Jersey 07974

1. INTRODUCTION

Although UNIX† provides remarkably effective tools for text editing, that by itself is no guarantee that everyone will automatically make the most effective use of them. In particular, people who are not computer specialists (typists, secretaries, casual users) often use the system less effectively than they might.

This document is intended as a sequel to A Tutorial Introduction to the UNIX Text Editor [1], providing explanations and examples of how to edit with less effort. (You should also be familiar with the material in UNIX For Beginners [2].) Further information on all commands discussed here can be found in The UNIX Programmer's Manual [3].

Examples are based on observations of users and the difficulties they encounter. Topics covered include special characters in searches and substitute commands, line addressing, the global commands, and line moving and copying. There are also brief discussions of effective use of related tools, like those for file manipulation, and those based on ed, like grep and sed.

A word of caution. There is only one way to learn to use something, and that is to use it. Reading a description is no substitute for trying something. A paper like this one should give you ideas about what to try, but until you actually try something, you will not learn it.

† UNIX is a Trademark of Bell Laboratories.

2. SPECIAL CHARACTERS

The editor ed is the primary interface to the system for many people, so it is worthwhile to know how to get the most out of ed for the least effort.

The next few sections will discuss shortcuts and labor-saving devices. Not all of these will be instantly useful to any one person, of course, but a few will be, and the others should give you ideas to store away for future use. And as always, until you try these things, they will remain theoretical knowledge, not something you have confidence in.

The List command 'l'

ed provides two commands for printing the contents of the lines you're editing. Most people are familiar with p, in combinations like

    1,$p

to print all the lines you're editing, or

    s/abc/def/p

to change 'abc' to 'def' on the current line. Less familiar is the list command l (the letter 'l'), which gives slightly more information than p. In particular, l makes visible characters that are normally invisible, such as tabs and backspaces. If you list a line that contains some of these, l will print each tab as > and each backspace as <. This makes it much easier to correct the sort of typing mistake that inserts extra spaces adjacent to tabs, or inserts a backspace followed by a space.

The l command also 'folds' long lines for printing: any line that exceeds 72 characters is printed on multiple lines; each printed line except the last is terminated by a backslash \, so you can tell it was folded. This is useful for printing long lines on short terminals.

Occasionally the l command will print in a line a string of numbers preceded by a backslash, such as \07 or \16. These combinations are used to make visible characters that normally don't print, like form feed or vertical tab or bell. Each such combination is a single character. When you see such characters, be wary: they may have surprising meanings when printed on some terminals. Often their presence means that your finger slipped while you were typing; you almost never want them.

The Substitute Command 's'

Most of the next few sections will be taken up with a discussion of the substitute command s. Since this is the command for changing the contents of individual lines, it probably has the most complexity of any ed command, and the most potential for effective use.

As the simplest place to begin, recall the meaning of a trailing g after a substitute command. With

    s/this/that/

and

    s/this/that/g

the first one replaces the first 'this' on the line with 'that'. If there is more than one 'this' on the line, the second form with the trailing g changes all of them.

Either form of the s command can be followed by p or l to 'print' or 'list' (as described in the previous section) the contents of the line:

    s/this/that/p
    s/this/that/l
    s/this/that/gp
    s/this/that/gl

are all legal, and mean slightly different things. Make sure you know what the differences are.

Of course, any s command can be preceded by one or two 'line numbers' to specify that the substitution is to take place on a group of lines. Thus

    1,$s/mispell/misspell/

changes the first occurrence of 'mispell' to 'misspell' on every line of the file. But

    1,$s/mispell/misspell/g

changes every occurrence in every line (and this is more likely to be what you wanted in this particular case).

You should also notice that if you add a p or l to the end of any of these substitute commands, only the last line that got changed will be printed, not all the lines. We will talk later about how to print all the lines that were modified.

The Undo Command 'u'

Occasionally you will make a substitution in a line, only to realize too late that it was a ghastly mistake. The 'undo' command u lets you 'undo' the last substitution: the last line that was substituted can be restored to its previous state by typing the command

    u

The Metacharacter '.'

As you have undoubtedly noticed when you use ed, certain characters have unexpected meanings when they occur in the left side of a substitute command, or in a search for a particular line. In the next several sections, we will talk about these special characters, which are often called 'metacharacters'.

The first one is the period '.'. On the left side of a substitute command, or in a search with '/.../', '.' stands for any single character. Thus the search

    /x.y/

finds any line where 'x' and 'y' occur separated by a single character, as in

    x+y
    x-y
    x□y
    x.y

and so on. (We will use □ to stand for a space whenever we need to make it visible.)

Since '.' matches a single character, that gives you a way to deal with funny characters printed by l. Suppose you have a line that, when printed with the l command, appears as

    th\07is

and you want to get rid of the \07 (which represents the bell character, by the way).

The most obvious solution is to try

    s/\07//

but this will fail. (Try it.) The brute force solution, which most people would now take, is to re-type the entire line. This is guaranteed, and is actually quite a reasonable tactic if the line in question isn't too big, but for a very long line, re-typing is a bore. This is where the metacharacter '.' comes in handy. Since '\07' really represents a single character, if we say

    s/th.is/this/

the job is done. The '.' matches the mysterious character between the 'h' and the 'i', whatever it is.

Bear in mind that since '.' matches any single character, the command

    s/././

converts the first character on a line into a '.', which very often is not what you intended.

As is true of many characters in ed, the '.' has several meanings, depending on its context. This line shows all three:

    .s/././

The first '.' is a line number, the number of the line we are editing, which is called 'line dot'. (We will discuss line dot more in Section 3.) The second '.' is a metacharacter that matches any single character on that line. The third '.' is the only one that really is an honest literal period. On the right side of a substitution, '.' is not special. If you apply this command to the line

    Now is the time.

the result will be

    .ow is the time.

which is probably not what you intended.

The Backslash '\'

Since a period means 'any character', the question naturally arises of what to do when you really want a period. For example, how do you convert the line

    Now is the time.

into

    Now is the time?

The backslash '\' does the job. A backslash turns off any special meaning that the next character might have; in particular, '\.' converts the '.' from a 'match anything' into a period, so you can use it to replace the period in

    Now is the time.

like this:

    s/\./?/

The pair of characters '\.' is considered by ed to be a single real period.

The backslash can also be used when searching for lines that contain a special character. Suppose you are looking for a line that contains

    .PP

The search

    /.PP/

isn't adequate, for it will find a line like

    THE APPLICATION OF ...

because the '.' matches the letter 'A'. But if you say

    /\.PP/

you will find only lines that contain '.PP'.

The backslash can also be used to turn off special meanings for characters other than '.'. For example, consider finding a line that contains a backslash. The search

    /\/

won't work, because the '\' isn't a literal '\', but instead means that the second '/' no longer delimits the search. But by preceding a backslash with another one, you can search for a literal backslash. Thus

    /\\/

does work. Similarly, you can search for a forward slash '/' with

    /\//

The backslash turns off the meaning of the immediately following '/' so that it doesn't terminate the /.../ construction prematurely.

As an exercise, before reading further, find two substitute commands each of which will convert the line

    \x\.\y

into the line

    \x\y

Here are several solutions; verify that each works as advertised.

    s/\\\.//
    s/x../x/
    s/..y/y/

A couple of miscellaneous notes about backslashes and special characters. First, you can use any character to delimit the pieces of an s command: there is nothing sacred about slashes. (But you must use slashes for context searching.) For instance, in a line that contains a lot of slashes already, like

    //exec //sys.fort.go // etc...

you could use a colon as the delimiter. To delete all the slashes, type

    s:/::g

Second, if # and @ are your character erase and line kill characters, you have to type \# and \@; this is true whether you're talking to ed or any other program.

When you are adding text with a or i or c, backslash is not special, and you should only put in one backslash for each one you really want.

The Dollar Sign '$'

The next metacharacter, the '$', stands for 'the end of the line'. As its most obvious use, suppose you have the line

    Now is the

and you wish to add the word 'time' to the end. Use the $ like this:

    s/$/□time/

to get

    Now is the time

Notice that a space is needed before 'time' in the substitute command, or you will get

    Now is thetime

As another example, replace the second comma in the following line with a period without altering the first:

    Now is the time, for all good men,

The command needed is

    s/,$/./

The $ sign here provides context to make specific which comma we mean. Without it, of course, the s command would operate on the first comma to produce

    Now is the time. for all good men,

As another example, to convert

    Now is the time.

into

    Now is the time?

as we did earlier, we can use

    s/.$/?/

Like '.', the '$' has multiple meanings depending on context. In the line

    $s/$/$/

the first '$' refers to the last line of the file, the second refers to the end of that line, and the third is a literal dollar sign, to be added to that line.

The Circumflex '^'

The circumflex (or hat or caret) '^' stands for the beginning of the line. For example, suppose you are looking for a line that begins with 'the'. If you simply say

    /the/

you will in all likelihood find several lines that contain 'the' in the middle before arriving at the one you want. But with

    /^the/

you narrow the context, and thus arrive at the desired one more easily.

The other use of '^' is of course to enable you to insert something at the beginning of a line:

    s/^/□/

places a space at the beginning of the current line.

Metacharacters can be combined. To search for a line that contains only the characters

    .PP

you can use the command

    /^\.PP$/

The Star '*'

Suppose you have a line that looks like this:

    text x          y text

where text stands for lots of text, and there are some indeterminate number of spaces between the x and the y. Suppose the job is to replace all the spaces between x and y by a single space. The line is too long to retype, and there are too many spaces to count. What now?

This is where the metacharacter '*' comes in handy. A character followed by a star stands for as many consecutive occurrences of that character as possible. To refer to all the spaces at once, say

    s/x□*y/x□y/

The construction '□*' means 'as many spaces as possible'. Thus 'x□*y' means 'an x, as many spaces as possible, then a y'.

The star can be used with any character, not just space. If the original example was instead

    text x--------y text

then all '-' signs can be replaced by a single space with the command

    s/x-*y/x□y/

Finally, suppose that the line was

    text x..........y text

Can you see what trap lies in wait for the unwary? If you blindly type

    s/x.*y/x□y/

what will happen? The answer, naturally, is that it depends. If there are no other x's or y's on the line, then everything works, but it's blind luck, not good management. Remember that '.' matches any single character? Then '.*' matches as many single characters as possible, and unless you're careful, it can eat up a lot more of the line than you expected. If the line was, for example, like this:

    text x text x..........y text y text

then saying

    s/x.*y/x□y/

will take everything from the first 'x' to the last 'y', which, in this example, is undoubtedly more than you wanted.

    abcdef

produces

    yaybycydyeyfy

which is almost certainly not what was intended. The reason for this behavior is that zero is a legal number of matches, and there are no x's at

The solution.
of course, is to turn off the special meaning of *." with ‘\.": ~ the beginning of the line (so that gets converted into a ‘'v'), nor between the ‘a’ and the ‘b’ (so that gets converted into 4 ‘y'), nor ... and so on. Make sure you really wdant zero matches; if not, in this case write s/xxs/y/g s/x\.ey/xay/ Now everything works, for ‘\.«' means ‘as many periods as possible’. ‘xx=' iS one or more x's. The Brackets | |’ There are times when the pattern ‘.¢' is exactly what you want. For example, to change Suppose of a file. Now is the time for all good men ... that you want to delete any numbers that appear at the beginning of all lines You might first think of trying a series of commands like into l 85/ 1=// Now is the time. l 857727/ 1 S5/ 3=// use ‘.+ to eat up everything after the ‘for’: and so on, but this is clearly going to take for- s/ ofor.+/./ ever if the numbers are at all long. There are a couple of additional pitfalls Unless you want to repeat the commands over and over until associated with ‘<’ that you should be aware of. finally all numbers are gone, you must get all the Most notable is the fact that ‘as many as possi- digits on one pass. ble’ means zero or more. The fact that zero is a legitimate possibility is sometimes rather surprising. The construction For example, if our line contained rexi Xy text X y [0123456789] rlext and we said matches any single digit — called a ‘character class’. With a character class, the job s/xa*y/xay/ the first ‘xy’ matches this pattern, for it consists the later one that actually contains specify a pattern like ter class, and just to confuse the issue there are essentially no special characters inside the brackeven meaning. x, a space, then as many more spaces as possible, then a y'. in other words, one Or more spaces. The other startling behavior of the fact ‘[0123456789]+ deletes all digits from the beginning of all lines. 
ets; /ch'y/ to pattern Any characters can appear within a charac- The way around this, if it matters, is to related The 1.8s/°[0123456789]=// some intervening spaces. which says ‘an easy. whole thing is o) The resuit is that the substitute acts on the first ‘xy’. and does touch is the matches zero or more digits (an entire number), of an ‘x’, zero spaces, and a 'y’. not This is the purpose of the brackets [ and ]. that zero is a [ ‘=" % is again legitimate s/xe/v/g when applied to the lin backslash To search doesn’t have a special for special characters, for example, you can say /NS (1/ Within [...]. the ‘[* is not special. To get a ‘I’ into a character class. make it the first character. number of occurrences of something tollowed by a star. The command the [t's digits. SO a4 nuisance you can to have abbreviate to spell them as out the [0—9]: similarly. {a—z] stands for the lower case letters. and (A —Z] for upper case. As a Ainal fnll on character classes, you can 3-42 Advanced Editing on UNIX Now is the time? Now is the time!! specify a class that means ‘none of the following characters’. This is done by beginning the class with a **": | [F0—9] stands for ‘any character excepr a digit’. Thus you might find the first line that doesn’t begin with a tab or space by a search like s/ampersand/\&/ converts the word into the symbol. Within a character class, the circumflex has a special meaning only if it occurs at the beginJust to convince yourself, verify that Substituting Newlines ed provides a facility for splitting a single line into two or more shorter lines by ‘substituting in a newline’. /7 (") pose finds a line that doesn’t begin with a circumflex. save typing. text ‘&’ is used primarily to Suppose you have the line substitute, the Of course this isn’t much of a saving if the thing matched is it is is something truly long or something like ‘.»’ which matches a lot of text, you can save some tedious typing. 
Bearing in meanings, mind it seems that *\’ relatively make the newline there no longer special. You can in fact make a single line into several lines with this same mechanism. As a large word example, consider underlining the ‘very' in a long line by splitting ‘very’ onto a formatting command ‘.ul’. and the ‘&’ will stand for ‘the’. if two lines. separate line, and preceding it by the roff or nroff s/the/& best/ or on On ampersand say awful, text intuitive that a *\’ at the end of a line would means ‘whatever was just matched’, so you can if it was you can break it between the ‘x’ and the ‘y’ like typed but it seems silly to have to repeat the ‘the’. but long it If it looks like turns off special The ‘&’ is used to eliminate the repetition. ‘the’, because This 1s actually a single command, although it is s/the/the best/ just Xy unmanageably (or merely y/ Of course you can always say of a gotten s/ xy/x\ Now is the best time side As the simplest example, sup- has this: and you want to make it rght line unwisely typed). Now is the time the a because of editing The Ampersand ‘&’ The ampersand Notice that ‘&’ is not special on the left side of a substitute, only on the right side. /" ["(space) (tab)}/ ning. To get a literal ampersand, naturally the backslash is used to turn off the special meaning;: There is also much less chance of mak- ing a typing error in the replacement text. For example, to parenthesize a line, regardless of its length, s/ .+/(&)/ The ampersand can occur more than once on the right side: s/the/& best and & worst/ makes Now i1s the best and the worst time text a very big text The command s/overya/\ ul\ very\ / converts the line into four shorter lines, preceding the word ‘very' by the line ‘.ul’, and eliminating the spaces around the ‘very’, all at the same time. When a newline is substituted in, dot is left pointing at the last line created. Joining Lines Lines may also be joined together, but this is done with the j command instead of s. 
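The pitfalls of the star and the ‘&’ shorthand can also be tried outside the editor. The following is a minimal sketch using Python’s re module, which is not ed: its syntax differs in places (for instance, ed’s ‘&’ on the right side is written \g<0> in Python), and the sample lines are made up for illustration. The matching behavior it shows (greedy ‘.*’, zero as a legal number of matches) is the same idea.

```python
import re

# Stand-ins for the long lines in the text; chosen to contain no stray x or y.
line = "aaa x bbb x....y ccc y ddd"

# ed: s/x.*y/x y/   '.*' is greedy, so it runs from the first x to the LAST y.
greedy = re.sub(r"x.*y", "x y", line, count=1)

# ed: s/x\.*y/x y/  escaping the period restricts the match to literal periods.
escaped = re.sub(r"x\.*y", "x y", line, count=1)

# ed: s/x *y/x y/   zero spaces is a legal match, so the first 'xy' is hit.
gaps = "aaa xy bbb x     y ccc"
zero_ok = re.sub(r"x *y", "x y", gaps, count=1)

# ed: s/x  *y/x y/  one space, then zero or more: at least one space required.
one_or_more = re.sub(r"x  *y", "x y", gaps, count=1)

# ed: s/x*/y/g      an empty match succeeds at every position in 'abcdef'.
everywhere = re.sub(r"x*", "y", "abcdef")

# ed: s/the/& best/ Python spells ed's '&' as \g<0> in the replacement.
amp = re.sub(r"the", r"\g<0> best", "Now is the time", count=1)

for result in (greedy, escaped, zero_ok, one_or_more, everywhere, amp):
    print(result)
```

Running it prints one result per line, so each substitution can be compared with the corresponding ed example in the text.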
Joining Lines

Lines may also be joined together, but this is done with the j command instead of s. Given the lines

    Now is
     the time

and supposing that dot is set to the first of them, then the command

    j

joins them together. No blanks are added, which is why we carefully showed a blank at the beginning of the second line.

All by itself, a j command joins line dot to line dot+1, but any contiguous set of lines can be joined. Just specify the starting and ending line numbers. For example,

    1,$jp

joins all the lines into one big one and prints it. (More on line numbers in Section 3.)

Rearranging a Line with \( ... \)

(This section should be skipped on first reading.) Recall that ‘&’ is a shorthand that stands for whatever was matched by the left side of an s command. In much the same way you can capture separate pieces of what was matched; the only difference is that you have to specify on the left side just what pieces you’re interested in.

Suppose, for instance, that you have a file of lines that consist of names in the form

    Smith, A. B.
    Jones, C.

and so on, and you want the initials to precede the name, as in

    A. B. Smith
    C. Jones

It is possible to do this with a series of editing commands, but it is tedious and error-prone. (It is instructive to figure out how it is done, though.)

The alternative is to ‘tag’ the pieces of the pattern (in this case, the last name, and the initials), and then rearrange the pieces. On the left side of a substitution, if part of the pattern is enclosed between \( and \), whatever matched that part is remembered, and available for use on the right side. On the right side, the symbol ‘\1’ refers to whatever matched the first \(...\) pair, ‘\2’ to the second \(...\), and so on.

The command

    1,$s/^\([^,]*\), *\(.*\)/\2 \1/

although hard to read, does the job. The first \(...\) matches the last name, which is any string up to the comma; this is referred to on the right side with ‘\1’. The second \(...\) is whatever follows the comma and any spaces, and is referred to as ‘\2’.

Of course, with any editing sequence this complicated, it’s foolhardy to simply run it and hope. The global commands g and v discussed in section 4 provide a way for you to print exactly those lines which were affected by the substitute command, and thus verify that it did what you wanted in all cases.

3. LINE ADDRESSING IN THE EDITOR

The next general area we will discuss is that of line addressing in ed, that is, how you specify what lines are to be affected by editing commands. We have already used constructions like

    1,$s/x/y/

to specify a change on all lines. And most users are long since familiar with using a single newline (or return) to print the next line, and with

    /thing/

to find a line that contains ‘thing’. Less familiar, surprisingly enough, is the use of

    ?thing?

to scan backwards for the previous occurrence of ‘thing’. This is especially handy when you realize that the thing you want to operate on is back up the page from where you are currently editing.

The slash and question mark are the only characters you can use to delimit a context search, though you can use essentially any character in a substitute command.

Address Arithmetic

The next step is to combine the line numbers like ‘.’, ‘$’, ‘/.../’ and ‘?...?’ with ‘+’ and ‘-’. Thus

    $-1

is a command to print the next to last line of the current file (that is, one line before line ‘$’). For example, to recall how far you got in a previous editing session,

    $-5,$p

prints the last six lines. (Be sure you understand why it’s six, not five.) If there aren’t six, of course, you’ll get an error message.

As another example,

    .-3,.+3p

prints from three lines before where you are now (at line dot) to three lines after, thus giving you a bit of context. By the way, the ‘+’ can be omitted:

    .-3,.3p

is identical in meaning.

Another area in which you can save typing effort in specifying lines is to use ‘-’ and ‘+’ as line numbers by themselves.

    -

by itself is a command to move back up one line in the file. In fact, you can string several minus signs together to move back up that many lines:

    ---

moves up three lines, as does ‘-3’. Thus

    -3,+3p

is also identical to the examples above.

Since ‘-’ is shorter than ‘.-1’, constructions like

    -,.s/bad/good/

are useful. This changes ‘bad’ to ‘good’ on the previous line and on the current line.

‘+’ and ‘-’ can be used in combination with searches using ‘/.../’ and ‘?...?’, and with ‘$’. The search

    /thing/--

finds the line containing ‘thing’, and positions you two lines before it.
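The arithmetic on line numbers can be spelled out as a small calculation. The sketch below is an illustration only, not how ed is implemented: the function name resolve and its parameters dot and last are made up for this example, and context searches are deliberately left out. It resolves addresses built from ‘.’, ‘$’, plain numbers, and ‘+’ or ‘-’ offsets.

```python
import re

def resolve(address, dot, last):
    """Resolve a simplified ed line address to a line number.

    Handles '.', '$', plain numbers, and chains of +/- offsets; context
    searches (/.../ and ?...?) are deliberately left out of this sketch.
    """
    m = re.fullmatch(r"(\.|\$|\d+)?((?:[+-]\d*|\d+)*)", address)
    if m is None:
        raise ValueError("unsupported address: " + address)
    base, offsets = m.group(1), m.group(2)
    if base is None or base == ".":
        line = dot                      # a missing base means 'start from dot'
    elif base == "$":
        line = last
    else:
        line = int(base)
    for sign, digits, bare in re.findall(r"([+-])(\d*)|(\d+)", offsets):
        if bare:                        # '.3' is shorthand for '.+3'
            line += int(bare)
        else:
            step = int(digits) if digits else 1   # a lone '-' means one line
            line += step if sign == "+" else -step
    if not 1 <= line <= last:
        raise ValueError("address out of range: " + address)
    return line

# A 20-line buffer with dot at line 10 (values chosen for illustration).
dot, last = 10, 20
first = resolve("$-5", dot, last)
count = resolve("$", dot, last) - first + 1     # lines printed by $-5,$p
same = [resolve(a, dot, last) for a in (".-3", "-3", "---")]
after = [resolve(a, dot, last) for a in (".+3", ".3", "+3")]
print(first, count, same, after)
```

The count for $-5,$p comes out as six because both end points are included, which is the point of the parenthetical remark in the text.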
Repeated Searches

Suppose you ask for the search

    /horrible thing/

and when the line is printed you discover that it isn’t the horrible thing that you wanted, so it is necessary to repeat the search again. You don’t have to re-type the search, for the construction

    //

is a shorthand for ‘the previous thing that was searched for’, whatever it was. This can be repeated as many times as necessary. You can also go backwards:

    ??

searches for the same thing, but in the reverse direction.

Not only can you repeat the search, but you can use ‘//’ as the left side of a substitute command, to mean ‘the most recent pattern’.

    /horrible thing/
    .... ed prints line with ‘horrible thing’ ...
    s//good/p

To go backwards and change a line, say

    ??s//good/

Of course, you can still use the ‘&’ on the right hand side of a substitute to stand for whatever got matched:

    //s//&&/p

finds the next occurrence of whatever you searched for last, replaces it by two copies of itself, then prints the line just to verify that it worked.

Default Line Numbers and the Value of Dot

One of the most effective ways to speed up your editing is always to know what lines will be affected by a command if you don’t specify the lines it is to act on, and on what line you will be positioned (i.e., the value of dot) when a command finishes. If you can edit without specifying unnecessary line numbers, you can save a lot of typing.

As the most obvious example, if you issue a search command like

    /thing/

you are left pointing at the next line that contains ‘thing’. Then no address is required with commands like s to make a substitution on that line, or p to print it, or l to list it, or d to delete it, or a to append text after it, or c to change it, or i to insert text before it.

What happens if there was no ‘thing’? Then you are left right where you were; dot is unchanged. This is also true if you were sitting on the only ‘thing’ when you issued the command. The same rules hold for searches that use ‘?...?’; the only difference is the direction in which you search.

The delete command d leaves dot pointing at the line that followed the last deleted line. When line ‘$’ gets deleted, however, dot points at the new line ‘$’.

The line-changing commands a, c and i by default all affect the current line: if you give no line number with them, a appends text after the current line, c changes the current line, and i inserts text before the current line.

a, c, and i behave identically in one respect: when you stop appending, changing or inserting, dot points at the last line entered. This is exactly what you want for typing and editing on the fly. For example, you can say

    a
    ... text ...
    ... botch ...            (minor error)
    .
    s/botch/correct/         (fix botched line)
    a
    ... more text ...

without specifying any line number for the substitute command or for the second append command. Or you can say

    a
    ... text ...
    ... horrible botch ...   (major error)
    .
    c                        (replace entire line)
    ... fixed up line ...

You should experiment to determine what happens if you add no lines with a, c or i.

The r command will read a file into the text being edited, either at the end if you give no address, or after the specified line if you do. In either case, dot points at the last line read in. Remember that you can even say 0r to read a file in at the beginning of the text. (You can also say 0a or 1i to start adding text at the beginning.)

The w command writes out the entire file. If you precede the command by one line number, that line is written, while if you precede it by two line numbers, that range of lines is written. The w command does not change dot: the current line remains the same, regardless of what lines are written. This is true even if you say something like

    /^\.AB/,/^\.AE/w abstract

which involves a context search.

Since the w command is so easy to use, you should save what you are editing regularly as you go along just in case the system crashes, or in case you do something foolish, like clobbering what you’re editing.

The least intuitive behavior, in a sense, is that of the s command. The rule is simple: you are left sitting on the last line that got changed. If there were no changes, then dot is unchanged.

To illustrate, suppose that there are three lines in the buffer, and you are sitting on the middle one:

    x1
    x2
    x3

Then the command

    -,+s/x/y/p

prints the third line, which is the last one changed. But if the three lines had been

    x1
    y2
    y3

and the same command had been issued while dot pointed at the second line, then the result would be to change and print only the first line, and that is where dot would be set.

Semicolon ‘;’

Searches with ‘/.../’ and ‘?...?’ start at the current line and move forward or backward respectively until they either find the pattern or get back to the current line. Sometimes this is not what is wanted. Suppose, for example, that the buffer contains lines like this:

    ...
    ab
    ...
    bc
    ...

Starting at line 1, one would expect that the command

    /a/,/b/p

prints all the lines from the ‘ab’ to the ‘bc’ inclusive. Actually this is not what happens. Both searches (for ‘a’ and for ‘b’) start from the same point, and thus they both find the line that contains ‘ab’. The result is to print a single line. Worse, if there had been a line with a ‘b’ in it before the ‘ab’ line, then the print command would be in error, since the second line number would be less than the first, and it is illegal to try to print lines in reverse order.

This is because the comma separator for line numbers doesn’t set dot as each address is processed; each search starts from the same place. In ed, the semicolon ‘;’ can be used just like comma, with the single difference that use of a semicolon forces dot to be set at that point as the line numbers are being evaluated. In effect, the semicolon ‘moves’ dot. Thus in our example above, the command

    /a/;/b/p

prints the range of lines from ‘ab’ to ‘bc’, because after the ‘a’ is found, dot is set to that line, and then ‘b’ is searched for, starting beyond that line.

This property is most often useful in a very simple situation. Suppose you want to find the second occurrence of ‘thing’. You could say

    /thing/
    //

but this prints the first occurrence as well as the second, and is a nuisance when you know very well that it is only the second one you’re interested in. The solution is to say

    /thing/;//

This says to find the first occurrence of ‘thing’, set dot to that line, then find the second and print only that.

Closely related is searching for the second previous occurrence of something, as in

    ?something?;??

Printing the third or fourth or ... in either direction is left as an exercise.

Finally, bear in mind that if you want to find the first occurrence of something in a file, starting at an arbitrary place within the file, it is not sufficient to say

    1;/thing/

because this fails if ‘thing’ occurs on line 1. But it is possible to say

    0;/thing/

(one of the few places where 0 is a legal line number), for this starts the search at line 1.

Interrupting the Editor

As a final note on what dot gets set to, you should be aware that if you hit the interrupt or delete or rubout or break key while ed is doing a command, things are put back together again and your state is restored as much as possible to what it was before the command began. Naturally, some changes are irrevocable: if you are reading or writing a file or making substitutions or deleting lines, these will be stopped in some clean but unpredictable state in the middle (which is why it is not usually wise to stop them). Dot may or may not be changed.

Printing is more clear cut. Dot is not changed until the printing is done. Thus if you print until you see an interesting line, then hit delete, you are not sitting on that line or even near it. Dot is left where it was when the p command was started.

4. GLOBAL COMMANDS

The global commands g and v are used to perform one or more editing commands on all lines that either contain (g) or don’t contain (v) a specified pattern.

As the simplest example, the command

    g/UNIX/p

prints all lines that contain the word ‘UNIX’. The pattern that goes between the slashes can be anything that could be used in a line search or in a substitute command; exactly the same rules and limitations apply.

As another example, then,

    g/^\./p

prints all the formatting commands in a file (lines that begin with ‘.’).

The v command is identical to g, except that it operates on those lines that do not contain an occurrence of the pattern. (Don’t look too hard for mnemonic significance to the letter ‘v’.) So

    v/^\./p

prints all the lines that don’t begin with ‘.’, that is, the actual text lines.

The command that follows g or v can be anything:

    g/^\./d

deletes all lines that begin with ‘.’, and

    g/^$/d

deletes all empty lines.

Probably the most useful command that can follow a global is the substitute command, for this can be used to make a change and print each affected line for verification. For example, we could change the word ‘Unix’ to ‘UNIX’ everywhere, and verify that it really worked, with

    g/Unix/s//UNIX/gp

Notice that we used ‘//’ in the substitute command to mean ‘the previous pattern’, in this case, ‘Unix’. The p command is done on every line that matches the pattern, not just those on which a substitution took place.

The global command operates by making two passes over the file. On the first pass, all lines that match the pattern are marked. On the second pass, each marked line in turn is examined, dot is set to that line, and the command executed. This means that it is possible for the command that follows a g or v to use addresses, set dot, and so on, quite freely.

    g/^\.PP/+

prints the line that follows each ‘.PP’ command (the signal for a new paragraph in some formatting packages). Remember that ‘+’ means ‘one line past dot’. And

    g/topic/?^\.SH?1

searches for each line that contains ‘topic’, scans backwards until it finds a line that begins ‘.SH’ (a section heading) and prints the line that follows that, thus showing the section headings under which ‘topic’ is mentioned. Finally,

    g/^\.EQ/+,/^\.EN/-p

prints all the lines that lie between lines beginning with ‘.EQ’ and ‘.EN’ formatting commands.

The g and v commands can also be preceded by line numbers, in which case the lines searched are only those in the range specified.
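The two-pass behavior of the global command is easy to mimic. The following is a rough sketch, not ed’s actual implementation: the names global_command and unix_to_upper are invented for this illustration, and lines are marked by position, so unlike ed it would not cope with a command that deletes or adds lines.

```python
import re

def global_command(lines, pattern, command):
    """Run command(lines, dot) on every line that matches pattern."""
    regex = re.compile(pattern)
    # Pass 1: mark the matching lines before any of them is modified.
    marked = [index for index, text in enumerate(lines) if regex.search(text)]
    # Pass 2: visit each marked line in turn, treating its index as dot.
    for dot in marked:
        command(lines, dot)

def unix_to_upper(lines, dot):
    """Rough analogue of s//UNIX/gp applied to the line at dot."""
    lines[dot] = lines[dot].replace("Unix", "UNIX")
    print(lines[dot])

# A made-up three-line buffer.
buffer = ["Unix is portable", "ed is terse", "Unix tools: Unix pipes"]
global_command(buffer, r"Unix", unix_to_upper)
```

Because the marking is finished before any command runs, a change made on one line cannot alter which lines are selected, which is why the command after g or v is free to move dot around.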
Multi-line Global Commands

It is possible to do more than one command under the control of a global command, although the syntax for expressing the operation is not especially natural or pleasant. As an example, suppose the task is to change ‘x’ to ‘y’ and ‘a’ to ‘b’ on all lines that contain ‘thing’. Then

    g/thing/s/x/y/\
    s/a/b/

is sufficient. The ‘\’ signals the g command that the set of commands continues on the next line; it terminates on the first line that does not end with ‘\’. (As a minor blemish, you can’t use a substitute command to insert a newline within a g command.)

You should watch out for this problem: the command

    g/x/s//y/\
    s/a/b/

does not work as you expect. The remembered pattern is the last pattern that was actually executed, so sometimes it will be ‘x’ (as expected), and sometimes it will be ‘a’ (not expected). You must spell it out, like this:

    g/x/s/x/y/\
    s/a/b/

It is also possible to execute a, c and i commands under a global command; as with other multi-line constructions, all that is needed is to add a ‘\’ at the end of each line except the last. Thus to add a ‘.nf’ and ‘.sp’ command before each ‘.EQ’ line, type

    g/^\.EQ/i\
    .nf\
    .sp

There is no need for a final line containing a ‘.’ to terminate the i command, unless there are further commands being done under the global. On the other hand, it does no harm to put it in either.

5. CUT AND PASTE WITH UNIX COMMANDS

One editing area in which non-programmers seem not very confident is in what might be called ‘cut and paste’ operations: changing the name of a file, making a copy of a file somewhere else, moving a few lines from one place to another in a file, inserting one file in the middle of another, splitting a file into pieces, and splicing two or more files together.

Yet most of these operations are actually quite easy, if you keep your wits about you and go cautiously. The next several sections talk about cut and paste. We will begin with the UNIX commands for moving entire files around, then discuss ed commands for operating on pieces of files.

Changing the Name of a File

You have a file named ‘memo’ and you want it to be called ‘paper’ instead. How is it done?

The UNIX program that renames files is called mv (for ‘move’); it ‘moves’ the file from one name to another, like this:

    mv memo paper

That’s all there is to it: mv from the old name to the new name.

    mv oldname newname

Warning: if there is already a file around with the new name, its present contents will be silently clobbered by the information from the other file. The one exception is that you can’t move a file to itself:

    mv x x

is illegal.

Making a Copy of a File

Sometimes what you want is a copy of a file, an entirely fresh version. This might be because you want to work on a file, and yet save a copy in case something gets fouled up, or just because you’re paranoid.

In any case, the way to do it is with the cp command. (cp stands for ‘copy’; the system is big on short command names, which are appreciated by heavy users, but sometimes a strain for novices.) Suppose you have a file called ‘good’ and you want to save a copy before you make some dramatic editing changes. Choose a name (‘savegood’ might be acceptable), then type

    cp good savegood

This copies ‘good’ onto ‘savegood’, and you now have two identical copies of the file ‘good’. (If ‘savegood’ previously contained something, it gets overwritten.)

Now if you decide at some time that you want to get back to the original state of ‘good’, you can say

    mv savegood good

(if you’re not interested in ‘savegood’ any more), or

    cp savegood good

if you still want to retain a safe copy.

In summary, mv just renames a file; cp makes a duplicate copy. Both of them clobber the ‘target’ file if it already exists, so you had better be sure that’s what you want to do before you do it.

Removing a File

If you decide you are really done with a file forever, you can remove it with the rm command:

    rm savegood

throws away (irrevocably) the file called ‘savegood’.

Putting Two or More Files Together

The next step is the familiar one of collecting two or more files into one big one. This will be needed, for example, when the author of a paper decides that several sections need to be combined into one. There are several ways to do it, of which the cleanest, once you get used to it, is a program called cat. (Not all programs have two-letter names.) cat is short for ‘concatenate’, which is exactly what we want to do.

Suppose the job is to combine the files ‘file1’ and ‘file2’ into a single file called ‘bigfile’. If you say

    cat file

the contents of ‘file’ will get printed on your terminal. If you say

    cat file1 file2

the contents of ‘file1’ and then the contents of ‘file2’ will both be printed on your terminal, in that order. So cat combines the files, all right, but it’s not much help to print them on the terminal; we want them in ‘bigfile’.

Fortunately, there is a way. You can tell the system that instead of printing on your terminal, you want the same information put in a file. The way to do it is to add to the command line the character > and the name of the file where you want the output to go. Then you can say

    cat file1 file2 >bigfile

and the job is done. (As with cp and mv, you’re putting something into ‘bigfile’, and anything that was already there is destroyed.)

This ability to ‘capture’ the output of a program is one of the most useful aspects of the system. Fortunately it’s not limited to the cat program; you can use it with any program that prints on your terminal. We’ll see some more uses for it in a moment.

Naturally, you can combine several files, not just two:

    cat file1 file2 file3 ... >bigfile

collects a whole bunch.

Question: is there any difference between

    cp good savegood

and

    cat good >savegood

Answer: for most purposes, no. You might reasonably ask why there are two programs in that case, since cat is obviously all you need. The answer is that cp will do some other things as well, which you can investigate for yourself by reading the manual. For now we’ll stick to simple usages.

Adding Something to the End of a File

Sometimes you want to add one file to the end of another. We have enough building blocks now that you can do it; in fact before reading further it would be valuable if you figured out how. To be specific, how would you use cp, mv and/or cat to add the file ‘good1’ to the end of the file ‘good’?

You could try

    cat good good1 >temp
    mv temp good

which is probably most direct. You should also understand why

    cat good good1 >good

doesn’t work. (Don’t practice with a good ‘good’!)

The easy way is to use a variant of >, called >>. In fact, >> is identical to > except that instead of clobbering the old file, it simply tacks stuff on at the end. Thus you could say

    cat good1 >>good

and ‘good1’ is added to the end of ‘good’. (And if ‘good’ didn’t exist, this makes a copy of ‘good1’ called ‘good’.)
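The difference between > and >>, and the reason that sending output back onto one of the input files loses data, can be seen in a few lines of Python. This is a sketch under stated assumptions, not a description of how the shell or cat is written: the function cat and its append parameter are invented here, and it works on scratch files in a temporary directory.

```python
import os
import tempfile

def cat(sources, target, append=False):
    """Copy the contents of sources, in order, into target.

    append=False plays the role of '>', append=True the role of '>>'.
    """
    mode = "a" if append else "w"
    # As with the shell, the target is opened (and, for '>', emptied)
    # before any source is read.
    with open(target, mode) as out:
        for name in sources:
            with open(name) as src:
                out.write(src.read())

# Scratch copies of 'good' and 'good1' in a temporary directory.
workdir = tempfile.mkdtemp()
good = os.path.join(workdir, "good")
good1 = os.path.join(workdir, "good1")
with open(good, "w") as f:
    f.write("first\n")
with open(good1, "w") as f:
    f.write("second\n")

cat([good1], good, append=True)        # like: cat good1 >>good
with open(good) as f:
    appended = f.read()

cat([good, good1], good)               # like: cat good good1 >good
with open(good) as f:
    clobbered = f.read()

print(repr(appended), repr(clobbered))
```

In the second call the target has already been emptied by the time it is read as a source, so the original contents of ‘good’ are gone and only the contents of ‘good1’ remain.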
6. CUT AND PASTE WITH THE EDITOR

Now we move on to manipulating pieces of files: individual lines or groups of lines. This is another area where new users seem unsure of themselves.

Filenames

The first step is to ensure that you know the ed commands for reading and writing files. Of course you can’t go very far without knowing r and w. Equally useful, but less well known, is the ‘edit’ command e. Within ed, the command

    e newfile

says ‘I want to edit a new file called newfile, without leaving the editor.’ The e command discards whatever you’re currently working on and starts over on newfile. It’s exactly the same as if you had quit with the q command, then re-entered ed with a new file name, except that if you have a pattern remembered, then a command like // will still work.

If you enter ed with the command

    ed file

ed remembers the name of the file, and any subsequent e, r or w commands that don’t contain a filename will refer to this remembered file. Thus

    ed file1
    ... (editing) ...
    w                (writes back in file1)
    e file2          (edit new file, without leaving editor)
    ... (editing on file2) ...
    w                (writes back on file2)

(and so on) does a series of edits on various files without ever leaving ed and without typing the name of any file more than once. (As an aside, if you examine the sequence of commands here, you can see why many UNIX systems use e as a synonym for ed.)

You can find out the remembered file name at any time with the f command; just type f without a file name. You can also change the name of the remembered file name with f; a useful sequence is

    ed precious
    f junk
    ... (editing) ...

which gets a copy of a precious file, then uses f to guarantee that a careless w command won’t clobber the original.

Inserting One File into Another

Suppose you have a file called ‘memo’, and you want the file called ‘table’ to be inserted just after the reference to Table 1. That is, in ‘memo’ somewhere is a line that says

    Table 1 shows that ...

and the data contained in ‘table’ has to go there, probably so it will be formatted properly by nroff or troff. Now what?

This one is easy. Edit ‘memo’, find ‘Table 1’, and add the file ‘table’ right there:

    ed memo
    /Table 1/
    Table 1 shows that ...     [response from ed]
    .r table

The critical line is the last one. As we said earlier, the r command reads a file; here you asked for it to be read in right after line dot. An r command without any address adds lines at the end, so it is the same as $r.

Writing out Part of a File

The other side of the coin is writing out part of the document you’re editing. For example, maybe you want to split out into a separate file that table from the previous example, so it can be formatted and tested separately. Suppose that in the file being edited we have

    .TS
    ...[lots of stuff]
    .TE

which is the way a table is set up for the tbl program. To isolate the table in a separate file called ‘table’, first find the start of the table (the ‘.TS’ line), then write out the interesting part:

    /^\.TS/
    .TS                [ed prints the line it found]
    .,/^\.TE/w table

and the job is done. If you are confident, you can do it all at once with

    /^\.TS/;/^\.TE/w table

The point is that the w command can write out a group of lines, instead of the whole file. In fact, you can write out a single line if you like; just give one line number instead of two. For example, if you have just typed a horribly complicated line and you know that it (or something like it) is going to be needed later, then save it; don’t re-type it. In the editor, say

    a
    ...lots of stuff...
    ...horrible line...
    .
    .w temp
    a
    ...more stuff...
    .
    .r temp
    a
    ...more stuff...
    .

This last example is worth studying, to be sure you appreciate what’s going on.

Suppose you want to move a paragraph from its present position in a paper to the end. How would you do it? As a concrete example, suppose each paragraph in the paper begins with the formatting command ‘.PP’. Think about it and write down the details before reading on.

The brute force way (not necessarily bad) is to write the paragraph onto a temporary file, delete it from its current position, then read in the temporary file at the end. Assuming that you are sitting on the ‘.PP’ command that begins the paragraph, this is the sequence of commands:

    .,/^\.PP/-w temp
    .,//-d
    $r temp

That is, write the lines from where you are now up to the line before the next ‘.PP’ onto ‘temp’. Then delete the same lines. Finally, read ‘temp’ at the end.

As we said, that’s the brute force way. The easier way (often) is to use the move command m that ed provides; it lets you do the whole set of operations at one crack, without any temporary file.

The m command is like many other ed commands in that it takes up to two line numbers in front that tell what lines are to be affected. It is also followed by a line number that tells where the lines are to go.

As another example of a frequent operation, you can reverse the order of two adjacent lines by moving the first one to after the second. Suppose that you are positioned at the first. Then

    m+

does it. It says to move line dot to after one line after line dot. If you are positioned on the second line,

    m--

does the interchange.

Thus to ‘line2’ target, you move to take properly, or you lines you botched thought m you command care that may well did.
can The be a you specify not move the result ghastly of 2 mess. Doing the job a step at a time makes it easier for you to verify at each step that you accomplished what you wanted to. It’s also a good idea to issue a w command before doing anything complicated; then if you goof, it's easy to back up to after all the ‘line3’. lines between Naturally, any ‘linel’ of and ‘linel’ other ways to specify lines. Suppose again that first line of the paragraph. ed provides a facility for marking a line with a particular name so you can later reference it by name regardless of its actual line number., This can be handy for moving lines. and for keeping track of them as they move. The mark command is k: the command Kx marks the current line with the name ‘x'. I[f a line number precedes the k. that line is marked. (The mark name must be a single lower case letter.) Now you can refer to the marked line with the address 4 X around. etc., can be patterns between slashes, S signs. or you're sitting at Then vou can sav the Find the first line of the block to be moved, and mark it with @. line and mark it with Now position yourself 4. Then find the last at the place where the stuff is to go and say ‘a,’bm. Bear in mind that only one line can have i ../ \.PP/ —m$ That's all. have them Marks are mos¢ useful for moving things linel, line2 m line3 says to specify both the lines you are moving and the Marks That is, from where you are now (*.’) until one line before the next ‘.PP’ (*/"\.PP/-") write mand This is a matter of personal taste — do what you have most confidence in. The main difficulty with the m command is that if you use patterns —d $r temp onto does the interchange. where you were. o/ "\ PP/ —w temp o/ e As you can see, the m command is more succinct and direct than writing, deleting and rereading. When is brute force better anyway? Moving Lines Around Suppose you o given time. Advanced Editing on UNIX Copying Lines mation on each can be found in [3]. 
We mentioned earlier the idea of saving a Grep line that was hard to type or used often, so as to cut down on typing time. Of course this could be then more than one line; the saving provides command, one or more lines at any point. This is often mand, except that instead of moving lines it simat the place you named. Thus duplicates the entire contents that you are edit- A more common use for tis for creating a of lines that differ only slightly. For example, you can say interest, but if there are many files this can get very tedious, and if the files are really big, it may be impossible because of limits in ed. The program ...... (long line) these that have described we was invented The search the paper are often limitations. in to get patterns g/re/p That describes exactly what grep does — it prints every line in a set of files that contains a particular pattern. Thus ‘thing’ flel file2 filel finds ‘thing’ wherever it occurs in any of the files ‘filel’, ‘file2’, etc. to (make a copy) s/x/yl t. (change it a bit) (make third copy) s/ylz/ (change it a bit) grep also indicates the file in which the line was found, so you can later edit it if you like. The pattern represented by ‘thing' can be any pattern you can use in the editor, since grep and so on. and ed use exactly the same mechanism for pat- tern searching. The Temporary Escape *" temporarily escape from the editor to do some other UNIX command, perhaps one of the file copy or move commands discussed in section §, the without leaving mand ! provides a way to do this. editor. The ‘escape’ com- non-alphabetic characters, since many such char- acters also mean something special to the UNIX command interpreter (the ‘shell’). interpret them before grep gets a chance. There is also a way to find lines that don grep tany UNIX command your current editing state i1s suspended, and the UNIX command you asked for is executed. finishes, ed will signal When you by printing another !, at that point you can resume editing. 
can If you don't quote them, the command interpreter will try to contain a pattern: If you say command It is wisest always to enclose the pattern in the single quotes '..." if it contains any Sometimes it is convenient to be able to really including another ed. do amnv uNIX command, —v 'thing’ filel file2 finds all lines that don't contains ‘thing’. -v must occur in the position shown. The Given grep and grep —v, it is possible to do things like selecting all lines that contain some combination of patterns. For example, to get all lines that contain ‘x’ but not ‘y': grep x file... | grep (This is quite common, in fact.) In this case, you can even do another ' 7. grep around grep .......... X You It may be possible to edit each file separately and look for the pattern of a the all called ‘regular expressions’, and ‘grep’ stands for 1,3t ing. find t called The t command is identical to the m com- series to files, to edit them or perhaps just to verify their another them want presence or absence. easier than writing and reading. duplicates you occurrences of some word or pattern in a set of (for ‘transfer’) for making a copy of a group of ply Sometimes is presumably even greater. ed 3-51 —v y (The notation | is a ‘pipe’, which causes the output of the first command to be used as input to SUPPORTING TOOLS the second command; see (2].) There are several tools and techniques that go along with the editor, all of which are relatively easy once you know how ed works, because they are all based on the editor. In this section we will give some fairly cursorv examples of these tools, more to indicate their existence than to provide a complete tutorial. More infor- Editing Scripts If a fairly complicated set of editing operations is to be done on a whole set of files, the 2asiest thing to do 1s to make up 4 “script’, i.e.. a file that contains the operations vou want (0 perol o 5 O torm, then apply this s ~ (0 each Nie n wurn. 
{ 3-52 Advanced Editing on UNIX For example, suppose you want to change References ‘Unix’ [1] every ‘GCOS’ to in a 'UNIX" large and every ‘Gceos’' to number of files. Then put into the file *script’ the lines g/Unix/s//UNIX/g (2] W [3] The <script ed file2 <script This causes ed to take its commands from the prepared script. Nottce that the whole job has to be planned in advance. And of course by using the UNIX command interpreter, vou can cvcle through a set of files automatically, with varying degrees of ease. Sed sed (‘stream editor’) editor with restricted capable of processing is a version capabilities but unlimited of the which is amounts of Basically sed copies its input to its output, applving one or more editing commands to each line of input. As an example, suppose that we want to the given ‘Unix’ to ‘UNIX" above, but without part of the example rewriting the files. Then the command sed applies 's/Unix/UNIX/g" the command flel Hfle2 ‘s/Unix/UNIX/g' to all lines from “filel”, ‘*fle2’, etc., and copies all lines to the output. The advantage of using sed in such a case is that it can be used with input too large for ed to handle. lected in All the output can be col- one place, either in 4 file or perhaps piped into another program. If the editing transformation is so complicated that more than one editing command is needed, commands can be supplied from a file, or on Ken L. Thompson and Dennis M. Ritchie, the command complex svntax. line, with a slightlv more To take commands from a file, for example, sed UNIYX Laboratories. Now vou can say do Brian W. Kernighan, UNIX For Beginners. Bell Laboratories internal memorandum. q ed filel —f cmdfile Bell Laboratories internal memorandum. g/Geos/s//GCOS/g input. Brian W. Kernighan, 4 Tworial Introduction 1o the UNIX Text Editor. input—Afiles... sed has further capabilities, including conditional testing and branching, which we cannot g0 Into here. 
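The cat, grep and sed usages from sections 5 and 7 can be tried together in a small shell session. What follows is only a sketch: the scratch directory and the contents of 'file1', 'file2' and 'cmdfile' are made up for illustration and are not part of the original text, and it assumes a shell that provides mktemp.

```shell
#!/bin/sh
# Work in a scratch directory so that no real files are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Hypothetical sample files (invented for this illustration).
printf 'Unix is fun\nGcos too\n' > file1
printf 'more Unix here\nnothing special\n' > file2

# cat with > collects the files into one; >> tacks more on at the end
# instead of clobbering what is already there.
cat file1 file2 > bigfile
cat file1 >> bigfile
bigfile_lines=$(wc -l < bigfile | tr -d ' ')

# Lines that contain 'Unix' but not 'fun': grep piped into grep -v.
unix_not_fun=$(grep 'Unix' file1 file2 | grep -v 'fun')

# sed copies its input to its output, editing each line on the way;
# file1 itself is not rewritten.
sed_out=$(sed 's/Unix/UNIX/g' file1)

# More than one editing command can be taken from a file with -f.
printf 's/Unix/UNIX/g\ns/Gcos/GCOS/g\n' > cmdfile
sed_script_out=$(sed -f cmdfile file1)

echo "$bigfile_lines"
echo "$unix_not_fun"
echo "$sed_out"
echo "$sed_script_out"
```

Because two files are named on the grep command line, each matching line is printed with the file name in front of it, which is how grep indicates where the line was found.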
Acknowledgement

I am grateful to Ted Dolotta for his careful reading and valuable suggestions.

An Introduction to Display Editing with Vi

William Joy
Revised for versions 3.5/2.13 by Mark Horton
Computer Science Division
Department of Electrical Engineering and Computer Science
University of California, Berkeley
Berkeley, Ca. 94720

1. Getting started

This document provides a quick introduction to vi. (Pronounced vee-eye.) You should be running vi on a file you are familiar with while you are reading this. The first part of this document (sections 1 through 5) describes the basics of using vi. Some topics of special interest are presented in section 6, and some nitty-gritty details of how the editor functions are saved for section 7 to avoid cluttering the presentation here.

There is also a short appendix here, which gives for each character the special meanings which this character has in vi. Attached to this document should be a quick reference card. This card summarizes the commands of vi in a very compact format. You should have the card handy while you are learning vi.

1.1. Specifying terminal type

Before you can start vi you must tell the system what kind of terminal you are using. Here is a (necessarily incomplete) list of terminal type codes. If your terminal does not appear here, you should consult with one of the staff members on your system to find out the code for your terminal. If your terminal does not have a code, one can be assigned and a description for the terminal can be created.
    Code      Full name                    Type
    2621      Hewlett-Packard 2621A/P      Intelligent
    2645      Hewlett-Packard 264x         Intelligent
    act4      Microterm ACT-IV             Dumb
    act5      Microterm ACT-V              Dumb
    adm3a     Lear Siegler ADM-3a          Dumb
    adm31     Lear Siegler ADM-31          Intelligent
    c100      Human Design Concept 100     Intelligent
    dm1520    Datamedia 1520               Dumb
    dm2500    Datamedia 2500               Intelligent
    dm3025    Datamedia 3025               Intelligent
    fox       Perkin-Elmer Fox             Dumb
    h1500     Hazeltine 1500               Intelligent
    h19       Heathkit h19                 Intelligent
    i100      Infoton 100                  Intelligent
    mime      Imitating a smart act4       Intelligent
    t1061     Teleray 1061                 Intelligent
    vt52      Dec VT-52                    Dumb

The financial support of an IBM Graduate Fellowship and the National Science Foundation under grants MCS74-07644-A03 and MCS78-07291 is gratefully acknowledged.

Suppose for example that you have a Hewlett-Packard HP2621A terminal. The code used by the system for this terminal is '2621'. In this case you can use one of the following commands to tell the system the type of your terminal:

    % setenv TERM 2621

This command works with the shell csh on both version 6 and 7 systems. If you are using the standard version 7 shell then you should give the commands

    $ TERM=2621
    $ export TERM

If you want to arrange to have your terminal type set up automatically when you log in, you can use the tset program. If you dial in on a mime, but often use hardwired ports, a typical line for your .login file (if you use csh) would be

    setenv TERM `tset - -d mime`

or for your .profile file (if you use sh)

    TERM=`tset - -d mime`

Tset knows which terminals are hardwired to each port and needs only to be told that when you dial in you are probably on a mime. Tset is usually used to change the erase and kill characters, too.

1.2. Editing a file

After telling the system which kind of terminal you have, you should make a copy of a file you are familiar with, and run vi on this file, giving the command

    % vi name

replacing name with the name of the copy file you just created.
The screen should clear and the text of your file should appear on the screen. If something else happens refer to the footnote.‡

‡ If you gave the system an incorrect terminal type code then the editor may have just made a mess out of your screen. This happens when it sends control codes for one kind of terminal to some other kind of terminal. In this case hit the keys :q (colon and the q key) and then hit the RETURN key. This should get you back to the command level interpreter. Figure out what you did wrong (ask someone else if necessary) and try again. Another thing which can go wrong is that you typed the wrong file name and the editor just printed an error diagnostic. In this case you should follow the above procedure for getting out of the editor, and try again this time spelling the file name correctly. If the editor doesn't seem to respond to the commands which you type here, try sending an interrupt to it by hitting the DEL or RUB key on your terminal, and then hitting the :q command again followed by a carriage return.

1.3. The editor's copy: the buffer

The editor does not directly modify the file which you are editing. Rather, the editor makes a copy of this file, in a place called the buffer, and remembers the file's name. You do not affect the contents of the file unless and until you write the changes you make back into the original file.

1.4. Notational conventions

In our examples, input which must be typed as is will be presented in bold face. Text which should be replaced with appropriate input will be given in italics. We will represent special characters in SMALL CAPITALS.

1.5. Arrow keys

The editor command set is independent of the terminal you are using. On most terminals with cursor positioning keys, these keys will also work within the editor.
If you don't have cursor positioning keys, or even if you do, you can use the h, j, k and l keys as cursor positioning keys (these are labelled with arrows on an adm3a).*

(Particular note for the HP2621: on this terminal the function keys must be shifted (ick) to send to the machine, otherwise they only act locally. Unshifted use will leave the cursor positioned incorrectly.)

1.6. Special characters: ESC, CR and DEL

Several of these special characters are very important, so be sure to find them right now. Look on your keyboard for a key labelled ESC or ALT. It should be near the upper left corner of your terminal. Try hitting this key a few times. The editor will ring the bell to indicate that it is in a quiescent state.‡ Partially formed commands are cancelled by ESC, and when you insert text in the file you end the text insertion with ESC. This key is a fairly harmless one to hit, so you can just hit it if you don't know what is going on until the editor rings the bell.

The CR or RETURN key is important because it is used to terminate certain commands. It is usually at the right side of the keyboard, and is the same command used at the end of each shell command.

Another very useful key is the DEL or RUB key, which generates an interrupt, telling the editor to stop what it is doing. It is a forceful way of making the editor listen to you, or to return it to the quiescent state if you don't know or don't like what is going on. Try hitting the '/' key on your terminal. This key is used when you want to specify a string to be searched for. The cursor should now be positioned at the bottom line of the terminal after a '/' printed as a prompt. You can get the cursor back to the current position by hitting the DEL or RUB key; try this now.* From now on we will simply refer to hitting the DEL or RUB key as "sending an interrupt."**

The editor often echoes your commands on the last line of the terminal.
If the cursor is on the first position of this last line, then the editor is performing a computation, such as computing a new position in the file after a search or running a command to reformat part of the buffer. When this is happening you can stop the editor by sending an interrupt.

1.7. Getting out of the editor

After you have worked with this introduction for a while, and you wish to do something else, you can give the command ZZ to the editor. This will write the contents of the editor's buffer back into the file you are editing, if you made any changes, and then quit from the editor. You can also end an editor session by giving the command :q!CR;† this is a dangerous but occasionally essential command which ends the editor session and discards all your changes. You need to know about this command in case you change the editor's copy of a file you wish only to look at. Be very careful not to give this command when you really want to save the changes you have made.

* As we will see later, h moves back to the left (like control-h which is a backspace), j moves down (in the same column), k moves up (in the same column), and l moves to the right.
‡ On smart terminals where it is possible, the editor will quietly flash the screen rather than ringing the bell.
* Backspacing over the '/' will also cancel the search.
** On some systems, this interruptibility comes at a price: you cannot type ahead when the editor is computing with the cursor on the bottom line.
† All commands which read from the last display line can also be terminated with an ESC as well as a CR.

2. Moving around in the file

2.1. Scrolling and paging

The editor has a number of commands for moving around in the file. The most useful of these is generated by hitting the control and D keys at the same time, a control-D or '^D'. We will use this two character notation for referring to these control keys from now on. You may have a key labelled '^' on your terminal. This key will be represented as '↑' in this document; '^' is exclusively used as part of the '^x' notation for control characters.†

As you know now if you tried hitting ^D, this command scrolls down in the file. The D thus stands for down. Many editor commands are mnemonic and this makes them much easier to remember. For instance the command to scroll up is ^U. Many dumb terminals can't scroll up at all, in which case hitting ^U clears the screen and refreshes it with a line which is farther back in the file at the top.

If you want to see more of the file below where you are, you can hit ^E to expose one more line at the bottom of the screen, leaving the cursor where it is.‡‡ The command ^Y (which is hopelessly non-mnemonic, but next to ^U on the keyboard) exposes one more line at the top of the screen.

There are other ways to move around in the file; the keys ^F and ^B ‡ move forward and backward a page, keeping a couple of lines of continuity between screens so that it is possible to read through a file using these rather than ^D and ^U if you wish.

Notice the difference between scrolling and paging. If you are trying to read the text in a file, hitting ^F to move forward a page will leave you only a little context to look back at. Scrolling on the other hand leaves more context, and happens more smoothly. You can continue to read the text as scrolling is taking place.

2.2. Searching, goto, and previous context

Another way to position yourself in the file is by giving the editor a string to search for. Type the character / followed by a string of characters terminated by CR. The editor will position the cursor at the next occurrence of this string. Try hitting n to then go to the next occurrence of this string. The character ?
will search backwards from where you are, and is otherwise like /.†

If the search string you give the editor is not present in the file the editor will print a diagnostic on the last line of the screen, and the cursor will be returned to its initial position.

If you wish the search to match only at the beginning of a line, begin the search string with an ↑. To match only at the end of a line, end the search string with a $. Thus /↑searchCR will search for the word 'search' at the beginning of a line, and /last$CR searches for the word 'last' at the end of a line.*

† If you don't have a '^' key on your terminal then there is probably a key labelled '↑'; in any case these characters are one and the same.
‡‡ Version 3 only.
‡ Not available in all v2 editors due to memory constraints.
† These searches will normally wrap around the end of the file, and thus find the string even if it is not on a line in the direction you search provided it is anywhere else in the file. You can disable this wraparound in scans by giving the command :se nowrapscanCR, or more briefly :se nowsCR.
* Actually, the string you give to search for here can be a regular expression in the sense of the editors ex(1) and ed(1). If you don't wish to learn about this yet, you can disable this more general facility by doing :se nomagicCR; by putting this command in EXINIT in your environment, you can have this always be in effect (more about EXINIT later.)

The command G, when preceded by a number will position the cursor at that line in the file. Thus 1G will move the cursor to the first line of the file. If you give G no count, then it moves to the end of the file.

If you are near the end of the file, and the last line is not at the bottom of the screen, the editor will place only the character '~' on each remaining line. This indicates that the last line in the file is on the screen: that is, the '~' lines are past the end of the file.
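The beginning-of-line and end-of-line anchors described in section 2.2 can also be tried outside the editor. The sketch below uses grep, which accepts the same two anchors in its patterns; the ↑ of this document is typed as ^ on the command line. The file 'sample' and its contents are invented for illustration, and the example assumes a shell that provides mktemp.

```shell
#!/bin/sh
# Work in a scratch directory so that no real files are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Hypothetical sample text (invented for this illustration).
printf 'search starts this line\n' > sample
printf 'this line has search inside\n' >> sample
printf 'come last\n' >> sample
printf 'last comes first\n' >> sample

# ^ anchors the pattern to the beginning of a line,
# the counterpart of /^search in the editor.
at_start=$(grep '^search' sample)

# $ anchors the pattern to the end of a line,
# the counterpart of /last$ in the editor.
at_end=$(grep 'last$' sample)

echo "$at_start"
echo "$at_end"
```

Only one line of 'sample' begins with 'search' and only one ends with 'last', so each command selects a single line even though both words occur twice in the file.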
You can find out the state of the file you are editing by typing a ^G. The editor will show you the name of the file you are editing, the number of the current line, the number of lines in the buffer, and the percentage of the way through the buffer which you are. Try doing this now, and remember the number of the line you are on. Give a G command to get to the end and then another G command to get back where you were.

You can also get back to a previous position by using the command `` (two back quotes). This is often more convenient than G because it requires no advance preparation. Try giving a G or a search with / or ? and then a `` to get back to where you were. If you accidentally hit n or any command which moves you far away from a context of interest, you can quickly get back by hitting ``.

2.3. Moving around on the screen

Now try just moving the cursor around on the screen. If your terminal has arrow keys (4 or 5 keys with arrows going in each direction) try them and convince yourself that they work. (On certain terminals using v2 editors, they won't.) If you don't have working arrow keys, you can always use h, j, k, and l. Experienced users of vi prefer these keys to arrow keys, because they are usually right underneath their fingers.

Hit the + key. Each time you do, notice that the cursor advances to the next line in the file, at the first non-white position on the line. The - key is like + but goes the other way.

These are very common keys for moving up and down lines in the file. Notice that if you go off the bottom or top with these keys then the screen will scroll down (and up if possible) to bring a line at a time into view. The RETURN key has the same effect as the + key.

Vi also has commands to take you to the top, middle and bottom of the screen. H will take you to the top (home) line on the screen. Try preceding it with a number as in 3H. This will take you to the third line on the screen.
Many vi commands take preceding numbers and do interesting things with them. Try M, which takes you to the middle line on the screen, and L, which takes you to the last line on the screen. L also takes counts, thus 5L will take you to the fifth line from the bottom.

2.4. Moving within a line

Now try picking a word on some line on the screen, not the first word on the line. Move the cursor using RETURN and - to be on the line where the word is. Try hitting the w key. This will advance the cursor to the next word on the line. Try hitting the b key to back up words in the line. Also try the e key which advances you to the end of the current word rather than to the beginning of the next word. Also try SPACE (the space bar) which moves right one character and the BS (backspace or ^H) key which moves left one character. The key h works as ^H does and is useful if you don't have a BS key. (Also, as noted just above, l will move to the right.)

If the line had punctuation in it you may have noticed that the w and b keys stopped at each group of punctuation. You can also go back and forwards words without stopping at punctuation by using W and B rather than the lower case equivalents. Think of these as bigger words. Try these on a few lines with punctuation to see how they differ from the lower case w and b.

The word keys wrap around the end of line, rather than stopping at the end. Try moving to a word on a line below where you are by repeatedly hitting w.

2.5. Summary

    SPACE   advance the cursor one position
    ^B      backwards to previous page
    ^D      scrolls down in the file
    ^E      exposes another line at the bottom (v3)
    ^F      forward to next page
    ^G      tell what is going on
    ^H      backspace the cursor
    ^N      next line, same column
    ^P      previous line, same column
    ^U      scrolls up in the file
    ^Y      exposes another line at the top (v3)
    +       next line, at the beginning
    -       previous line, at the beginning
    /       scan for a following string forwards
    ?       scan backwards
    B       back a word, ignoring punctuation
    G       go to specified line, last default
    H       home screen line
    M       middle screen line
    L       last screen line
    W       forward a word, ignoring punctuation
    b       back a word
    e       end of current word
    n       scan for next instance of / or ? pattern
    w       word after this word

2.6. View ‡

If you want to use the editor to look at a file, rather than to make changes, invoke it as view instead of vi. This will set the readonly option which will prevent you from accidentally overwriting the file.

3. Making simple changes

3.1. Inserting

One of the most useful commands is the i (insert) command. After you type i, everything you type until you hit ESC is inserted into the file. Try this now; position yourself to some word in the file and try inserting text before this word. If you are on a dumb terminal it will seem, for a minute, that some of the characters in your line have been overwritten, but they will reappear when you hit ESC.

Now try finding a word which can, but does not, end in an 's'. Position yourself at this word and type e (move to end of word), then a for append and then 'sESC' to terminate the textual insert. This sequence of commands can be used to easily pluralize a word.

Try inserting and appending a few times to make sure you understand how this works; i placing text to the left of the cursor, a to the right.

It is often the case that you want to add new lines to the file you are editing, before or after some specific line in the file. Find a line where this makes sense and then give the command o to create a new line after the line you are on, or the command O to create a new line before the line you are on.
After you create a new line in this way, text you type up to an ESC is inserted on the new line.

‡ Not available in all v2 editors due to memory constraints.

Many related editor commands are invoked by the same letter key and differ only in that one is given by a lower case key and the other is given by an upper case key. In these cases, the upper case key often differs from the lower case key in its sense of direction, with the upper case key working backward and/or up, while the lower case key moves forward and/or down.

Whenever you are typing in text, you can give many lines of input or just a few characters. To type in more than one line of text, hit a RETURN at the middle of your input. A new line will be created for text, and you can continue to type. If you are on a slow and dumb terminal the editor may choose to wait to redraw the tail of the screen, and will let you type over the existing screen lines. This avoids the lengthy delay which would occur if the editor attempted to keep the tail of the screen always up to date. The tail of the screen will be fixed up, and the missing lines will reappear, when you hit ESC.

While you are inserting new text, you can use the characters you normally use at the system command level (usually ^H or #) to backspace over the last character which you typed, and the character which you use to kill input lines (usually @, ^X, or ^U) to erase the input you have typed on the current line.† The character ^W will erase a whole word and leave you after the space after the previous word; it is useful for quickly backing up in an insert.

Notice that when you backspace during an insertion the characters you backspace over are not erased; the cursor moves backwards, and the characters remain on the display. This is often useful if you are planning to type in something similar.
In any case the characters disappear when you hit ESC; if you want to get rid of them immediately, hit an ESC and then a again.

Notice also that you can’t erase characters which you didn’t insert, and that you can’t backspace around the end of a line. If you need to back up to the previous line to make a correction, just hit ESC and move the cursor back to the previous line. After making the correction you can return to where you were and use the insert or append command again.

3.2. Making small corrections

You can make small corrections in existing text quite easily. Find a single character which is wrong or just pick any character. Use the arrow keys to find the character, or get near the character with the word motion keys and then either backspace (hit the BS key or ^H or even just h) or SPACE (using the space bar) until the cursor is on the character which is wrong. If the character is not needed then hit the x key; this deletes the character from the file. It is analogous to the way you x out characters when you make mistakes on a typewriter (except it’s not as messy).

If the character is incorrect, you can replace it with the correct character by giving the command rc, where c is replaced by the correct character. Finally if the character which is incorrect should be replaced by more than one character, give the command s which substitutes a string of characters, ending with ESC, for it. If there are a small number of characters which are wrong you can precede s with a count of the number of characters to be replaced. Counts are also useful with x to specify the number of characters to be deleted.

3.3. More corrections: operators

You already know almost enough to make changes at a higher level. All you need to know now is that the d key acts as a delete operator. Try the command dw to delete a word. Try hitting . a few times. Notice that this repeats the effect of the dw. The command . repeats the last command which made a change.
You can remember it by analogy with an ellipsis ‘...’.

† In fact, the character ^H (backspace) always works to erase the last input character here, regardless of what your erase character is.

Now try db. This deletes a word backwards, namely the preceding word. Try dSPACE. This deletes a single character, and is equivalent to the x command.

Another very useful operator is c or change. The command cw thus changes the text of a single word. You follow it by the replacement text ending with an ESC. Find a word which you can change to another, and try this now. Notice that the end of the text to be changed was marked with the character ‘$’ so that you can see this as you are typing in the new material.

3.4. Operating on lines

It is often the case that you want to operate on lines. Find a line which you want to delete, and type dd, the d operator twice. This will delete the line. If you are on a dumb terminal, the editor may just erase the line on the screen, replacing it with a line with only an @ on it. This line does not correspond to any line in your file, but only acts as a place holder. It helps to avoid a lengthy redraw of the rest of the screen which would be necessary to close up the hole created by the deletion on a terminal without a delete line capability.

Try repeating the c operator twice; this will change a whole line, erasing its previous contents and replacing them with text you type up to an ESC.†

You can delete or change more than one line by preceding the dd or cc with a count, i.e. 5dd deletes 5 lines. You can also give a command like dL to delete all the lines up to and including the last line on the screen, or d3L to delete through the third from the bottom line. Try some commands like this now.* Notice that the editor lets you know when you change a large number of lines so that you can see the extent of the change.
The editor will also always tell you when a change you make affects text which you cannot see.

† The command S is a convenient synonym for cc, by analogy with s. Think of S as a substitute on lines, while s is a substitute on characters.

* One subtle point here involves using the / search after a d. This will normally delete characters from the current position to the point of the match. If what is desired is to delete whole lines including the two points, give the pattern as /pat/+0, a line address.

3.5. Undoing

Now suppose that the last change which you made was incorrect; you could use the insert, delete and append commands to put the correct material back. However, since it is often the case that we regret a change or make a change incorrectly, the editor provides a u (undo) command to reverse the last change which you made. Try this a few times, and give it twice in a row to notice that a u also undoes a u.

The undo command lets you reverse only a single change. After you make a number of changes to a line, you may decide that you would rather have the original state of the line back. The U command restores the current line to the state before you started changing it.

You can recover text which you delete, even if undo will not bring it back: see the section on recovering lost text below.

3.6. Summary

SPACE   advance the cursor one position
^H      backspace the cursor
^W      erase a word during an insert
erase   your erase (usually ^H or #), erases a character during an insert
kill    your kill (usually @, ^X, or ^U), kills the insert on this line
.       repeats the changing command
O       opens and inputs new lines, above the current
U       undoes the changes you made to the current line
a       appends text after the cursor
c       changes the object you specify to the following text
d       deletes the object you specify
i       inserts text before the cursor
o       opens and inputs new lines, below the current
u       undoes the last change

4. Moving about; rearranging and duplicating text

4.1. Low level character motions

Now move the cursor to a line where there is a punctuation or a bracketing character such as a parenthesis or a comma or period. Try the command fx where x is this character. This command finds the next x character to the right of the cursor in the current line. Try then hitting a ;, which finds the next instance of the same character. By using the f command and then a sequence of ;’s you can often get to a particular place in a line much faster than with a sequence of word motions or SPACEs. There is also an F command, which is like f, but searches backward. The ; command repeats F also.

When you are operating on the text in a line it is often desirable to deal with the characters up to, but not including, the first instance of a character. Try dfx for some x now and notice that the x character is deleted. Undo this with u and then try dtx; the t here stands for to, i.e. delete up to the next x, but not the x. The command T is the reverse of t.

When working with the text of a single line, a ^ moves the cursor to the first non-white position on the line, and a $ moves it to the end of the line. Thus $a will append new text at the end of the current line.

Your file may have tab (^I) characters in it. These characters are represented as a number of spaces expanding to a tab stop, where tab stops are every 8 positions.* When the cursor is at a tab, it sits on the last of the several spaces which represent that tab. Try moving the cursor back and forth over tabs so you understand how this works.

On rare occasions, your file may have nonprinting characters in it. These characters are displayed in the same way they are represented in this document, that is with a two character code, the first character of which is ‘^’.
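The display conventions just described (tabs shown as spaces up to the next tab stop, other control characters shown as a caret followed by a printable character) can be illustrated with a short Python sketch. This is only a model of the behaviour described in the text, not the editor's own code; the function name render_line and its tabstop parameter are invented for the example.

```python
def render_line(text, tabstop=8):
    """Return an approximation of how one line of a file appears on screen."""
    out = []
    col = 0  # current screen column, counted from 0
    for ch in text:
        if ch == "\t":
            # A tab is shown as spaces reaching to the next tab stop.
            width = tabstop - (col % tabstop)
            out.append(" " * width)
            col += width
        elif ord(ch) < 32 or ord(ch) == 127:
            # Any other control character is shown as '^' plus a printable
            # character, e.g. control-A (code 1) appears as ^A.
            out.append("^" + chr(ord(ch) ^ 0x40))
            col += 2
        else:
            out.append(ch)
            col += 1
    return "".join(out)


if __name__ == "__main__":
    print(repr(render_line("a\tb")))     # tab padded out to column 8
    print(repr(render_line("x\x01y")))   # control-A shown in caret form
```

Passing tabstop=4 mimics the effect of the ts option mentioned in the footnote on tab stops.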
On the screen non-printing characters resemble a ‘^’ character adjacent to another, but spacing or backspacing over the character will reveal that the two characters are, like the spaces representing a tab character, a single character.

* This is settable by a command of the form :se ts=xCR, where x is 4 to set tabstops every four columns. This has effect on the screen representation within the editor.

The editor sometimes discards control characters, depending on the character and the setting of the beautify option, if you attempt to insert them in your file. You can get a control character in the file by beginning an insert and then typing a ^V before the control character. The ^V quotes the following character, causing it to be inserted directly into the file.

4.2. Higher level text objects

In working with a document it is often advantageous to work in terms of sentences, paragraphs, and sections. The operations ( and ) move to the beginning of the previous and next sentences respectively. Thus the command d) will delete the rest of the current sentence; likewise d( will delete the previous sentence if you are at the beginning of the current sentence, or the current sentence up to where you are if you are not at the beginning of the current sentence.

A sentence is defined to end at a ‘.’, ‘!’ or ‘?’ which is followed by either the end of a line, or by two spaces. Any number of closing ‘)’, ‘]’, ‘"’ and ‘'’ characters may appear after the ‘.’, ‘!’ or ‘?’ before the spaces or end of line.

The operations { and } move over paragraphs and the operations [[ and ]] move over sections.†

† The [[ and ]] operations require the operation character to be doubled because they can move the cursor far from where it currently is. While it is easy to get back with the command ``, these commands would still be frustrating if they were easy to hit accidentally.

A paragraph begins after each empty line, and also at each of a set of paragraph macros, specified by the pairs of characters in the definition of the string valued option paragraphs. The default setting for this option defines the paragraph macros of the -ms and -mm macro packages, i.e. the ‘.IP’, ‘.LP’, ‘.PP’ and ‘.QP’, ‘.P’ and ‘.LI’ macros.‡ Each paragraph boundary is also a sentence boundary. The sentence and paragraph commands can be given counts to operate over groups of sentences and paragraphs.

‡ You can easily change or extend this set of macros by assigning a different string to the paragraphs option in your EXINIT. See section 6.2 for details. The ‘.bp’ directive is also considered to start a paragraph.

Sections in the editor begin after each macro in the sections option, normally ‘.NH’, ‘.SH’, ‘.H’ and ‘.HU’, and each line with a formfeed ^L in the first column. Section boundaries are always line and paragraph boundaries also.

Try experimenting with the sentence and paragraph commands until you are sure how they work. If you have a large document, try looking through it using the section commands. The section commands interpret a preceding count as a different window size in which to redraw the screen at the new location, and this window size is the base size for newly drawn windows until another size is specified. This is very useful if you are on a slow terminal and are looking for a particular section. You can give the first section command a small count to then see each successive section heading in a small window.

4.3. Rearranging and duplicating text

The editor has a single unnamed buffer where the last deleted or changed away text is saved, and a set of named buffers a-z which you can use to save copies of text and to move text around in your file and between files.

The operator y yanks a copy of the object which follows into the unnamed buffer. If preceded by a buffer name, "xy, where x here is replaced by a letter a-z, it places the text in the named buffer. The text can then be put back in the file with the commands p and P; p puts the text after or below the cursor, while P puts the text before or above the cursor.

If the text which you yank forms a part of a line, or is an object such as a sentence which partially spans more than one line, then when you put the text back, it will be placed after the cursor (or before if you use P). If the yanked text forms whole lines, they will be put back as whole lines, without changing the current line. In this case, the put acts much like an o or O command.

Try the command YP. This makes a copy of the current line and leaves you on this copy, which is placed before the current line. The command Y is a convenient abbreviation for yy. The command Yp will also make a copy of the current line, and place it after the current line. You can give Y a count of lines to yank, and thus duplicate several lines; try 3YP.

To move text within the buffer, you need to delete it in one place, and put it back in another. You can precede a delete operation by the name of a buffer in which the text is to be stored as in "a5dd deleting 5 lines into the named buffer a. You can then move the cursor to the eventual resting place of these lines and do a "ap or "aP to put them back. In fact, you can switch and edit another file before you put the lines back, by giving a command of the form :e nameCR where name is the name of the other file you want to edit. You will have to write back the contents of the current editor buffer (or discard them) if you have made changes before the editor will let you switch to the other file. An ordinary delete command saves the text in the unnamed buffer, so that an ordinary put can move it elsewhere. However, the unnamed buffer is lost when you change files, so to move text from one file to another you should use a named buffer.

4.4. Summary

^       first non-white on line
$       end of line
)       forward sentence
}       forward paragraph
]]      forward section
(       backward sentence
{       backward paragraph
[[      backward section
fx      find x forward in line
p       put text back, after cursor or below current line
y       yank operator, for copies and moves
tx      up to x forward, for operators
Fx      f backward in line
P       put text back, before cursor or above current line
Tx      t backward in line

5. High level commands

5.1. Writing, quitting, editing new files

So far we have seen how to enter vi and to write out our file using either ZZ or :wCR. The first exits from the editor (writing if changes were made), the second writes and stays in the editor.

If you have changed the editor’s copy of the file but do not wish to save your changes, either because you messed up the file or decided that the changes are not an improvement to the file, then you can give the command :q!CR to quit from the editor without writing the changes. You can also reedit the same file (starting over) by giving the command :e!CR. These commands should be used only rarely, and with caution, as it is not possible to recover the changes you have made after you discard them in this manner.

You can edit a different file without leaving the editor by giving the command :e nameCR. If you have not written out your file before you try to do this, then the editor will tell you this, and delay editing the other file. You can then give the command :wCR to save your work and then the :e nameCR command again, or carefully give the command :e! nameCR, which edits the other file discarding the changes you have made to the current file. To have the editor automatically save changes, include set autowrite in your EXINIT, and use :n instead of :e.

5.2. Escaping to a shell

You can get to a shell to execute a single command by giving a vi command of the form :!cmdCR.
The system will run the single command cmd and when the command finishes, the editor will ask you to hit a RETURN to continue. When you have finished looking at the output on the screen, you should hit RETURN and the editor will clear the screen and redraw it. You can then continue editing. You can also give another : command when it asks you for a RETURN; in this case the screen will not be redrawn.

If you wish to execute more than one command in the shell, then you can give the command :shCR. This will give you a new shell, and when you finish with the shell, ending it by typing a ^D, the editor will clear the screen and continue.

On systems which support it, ^Z will suspend the editor and return to the (top level) shell. When the editor is resumed, the screen will be redrawn.

5.3. Marking and returning

The command `` returns to the previous place after a motion of the cursor by a command such as /, ? or G. You can also mark lines in the file with single letter tags and return to these marks later by naming the tags. Try marking the current line with the command mx, where you should pick some letter for x, say ‘a’. Then move the cursor to a different line (any way you like) and hit `a. The cursor will return to the place which you marked. Marks last only until you edit another file.

When using operators such as d and referring to marked lines, it is often desirable to delete whole lines rather than deleting to the exact position in the line marked by m. In this case you can use the form 'x rather than `x. Used without an operator, 'x will move to the first non-white character of the marked line; similarly '' moves to the first non-white character of the line containing the previous context mark ``.

5.4. Adjusting the screen

If the screen image is messed up because of a transmission error to your terminal, or because some program other than the editor wrote output to your terminal, you can hit a ^L, the ASCII form-feed character, to cause the screen to be refreshed.

On a dumb terminal, if there are @ lines in the middle of the screen as a result of line deletion, you may get rid of these lines by typing ^R to cause the editor to retype the screen, closing up these holes.

Finally, if you wish to place a certain line on the screen at the top, middle or bottom of the screen, you can position the cursor to that line, and then give a z command. You should follow the z command with a RETURN if you want the line to appear at the top of the window, a . if you want it at the center, or a - if you want it at the bottom. (z., z-, and z+ are not available on all v2 editors.)

6. Special topics

6.1. Editing on slow terminals

When you are on a slow terminal, it is important to limit the amount of output which is generated to your screen so that you will not suffer long delays, waiting for the screen to be refreshed. We have already pointed out how the editor optimizes the updating of the screen during insertions on dumb terminals to limit the delays, and how the editor erases lines to @ when they are deleted on dumb terminals.

The use of the slow terminal insertion mode is controlled by the slowopen option. You can force the editor to use this mode even on faster terminals by giving the command :se slowCR. If your system is sluggish this helps lessen the amount of output coming to your terminal. You can disable this option by :se noslowCR.

The editor can simulate an intelligent terminal on a dumb one. Try giving the command :se redrawCR. This simulation generates a great deal of output and is generally tolerable only on lightly loaded systems and fast terminals. You can disable this by giving the command :se noredrawCR.
The editor also makes editing more pleasant at low speed by starting editing in a small window, and letting the window expand as you edit. This works particularly well on intelligent terminals. The editor can expand the window easily when you insert in the middle of the screen on these terminals. If possible, try the editor on an intelligent terminal to see how this works.

You can control the size of the window which is redrawn each time the screen is cleared by giving window sizes as argument to the commands which cause large screen motions:

    : / ? [[ ]] ` '

Thus if you are searching for a particular instance of a common string in a file you can precede the first search command by a small number, say 3, and the editor will draw three line windows around each instance of the string which it locates.

You can easily expand or contract the window, placing the current line as you choose, by giving a number on a z command, after the z and before the following RETURN, . or -. Thus the command z5. redraws the screen with the current line in the center of a five line window.‡

‡ Note that the command 5z. has an entirely different effect, placing line 5 in the center of a new window.

If the editor is redrawing or otherwise updating large portions of the display, you can interrupt this updating by hitting a DEL or RUB as usual. If you do this you may partially confuse the editor about what is displayed on the screen. You can still edit the text on the screen if you wish; clear up the confusion by hitting a ^L, or move or search again, ignoring the current state of the display.

See section 7.8 on open mode for another way to use the vi command set on slow terminals.

6.2. Options, set, and editor startup files

The editor has a set of options, some of which have been mentioned above. The most useful options are given in the following table.

Name         Default               Description
autoindent   noai                  Supply indentation automatically
autowrite    noaw                  Automatic write before :n, :ta, ^^, !
ignorecase   noic                  Ignore case in searching
lisp         nolisp                ( { ) } commands deal with S-expressions
list         nolist                Tabs print as ^I; end of lines marked with $
magic        nomagic               The characters . [ and * are special in scans
number       nonu                  Lines are displayed prefixed with line numbers
paragraphs   para=IPLPPPQPbpP LI   Macro names which start paragraphs
redraw       nore                  Simulate a smart terminal on a dumb one
sections     sect=NHSHH HU         Macro names which start new sections
shiftwidth   sw=8                  Shift distance for <, > and input ^D and ^T
showmatch    nosm                  Show matching ( or { as ) or } is typed
slowopen     slow                  Postpone display updates during inserts
term         dumb                  The kind of terminal you are using.

The options are of three kinds: numeric options, string options, and toggle options. You can set numeric and string options by a statement of the form

    set opt=val

and toggle options can be set or unset by statements of one of the forms

    set opt
    set noopt

These statements can be placed in your EXINIT in your environment, or given while you are running vi by preceding them with a : and following them with a CR.

You can get a list of all options which you have changed by the command :setCR, or the value of a single option by the command :set opt?CR. A list of all possible options and their values is generated by :set allCR. Set can be abbreviated se. Multiple options can be placed on one line, e.g. :se ai aw nuCR.

Options set by the set command only last while you stay in the editor. It is common to want to have certain options set whenever you use the editor. This can be accomplished by creating a list of ex commands† which are to be run every time you start up ex, edit, or vi.

† All commands which start with : are ex commands.
A typical list includes a set command, and possibly a few map commands (on v3 editors). Since it is advisable to get these commands on one line, they can be separated with the | character, for example:

    set ai aw terse|map @ dd|map # x

which sets the options autoindent, autowrite, terse (the set command), makes @ delete a line (the first map), and makes # delete a character (the second map). (See section 6.9 for a description of the map command, which only works in version 3.) This string should be placed in the variable EXINIT in your environment. If you use csh, put this line in the file .login in your home directory:

    setenv EXINIT 'set ai aw terse|map @ dd|map # x'

If you use the standard v7 shell, put these lines in the file .profile in your home directory:

    EXINIT='set ai aw terse|map @ dd|map # x'
    export EXINIT

On a version 6 system, the concept of environments is not present. In this case, put the line in the file .exrc in your home directory.

    set ai aw terse|map @ dd|map # x

Of course, the particulars of the line would depend on which options you wanted to set.

6.3. Recovering lost lines

You might have a serious problem if you delete a number of lines and then regret that they were deleted. Despair not, the editor saves the last 9 deleted blocks of text in a set of numbered registers 1-9. You can get the n’th previous deleted text back in your file by the command "np. The " here says that a buffer name is to follow, n is the number of the buffer you wish to try (use the number 1 for now), and p is the put command, which puts text in the buffer after the cursor. If this doesn’t bring back the text you wanted, hit u to undo this and then . (period) to repeat the put command. In general the . command will repeat the last change you made. As a special case, when the last command refers to a numbered text buffer, the . command increments the number of the buffer before repeating the command.
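As a rough illustration of the numbered buffers described above, the following Python sketch models them as a list of at most nine saved deletions, with the newest in buffer 1. It is a toy model under that assumption, not the editor's implementation; the names DeleteRegisters and walk_buffers are invented for the example.

```python
class DeleteRegisters:
    """Toy model of the nine numbered delete buffers."""

    def __init__(self):
        self.slots = []  # slots[0] is buffer 1, the most recent deletion

    def delete(self, text):
        # The newest deletion becomes buffer 1; older ones shift up by one,
        # and anything that would land beyond buffer 9 is discarded.
        self.slots.insert(0, text)
        del self.slots[9:]

    def get(self, n):
        # Return the text saved in buffer n (1..9), or None if it is empty.
        if 1 <= n <= len(self.slots):
            return self.slots[n - 1]
        return None


def walk_buffers(regs):
    """Visit buffers 1, 2, 3, ... in turn, the way a put from buffer 1
    followed by repeated undo-and-repeat steps would."""
    n = 1
    while n <= 9 and regs.get(n) is not None:
        yield n, regs.get(n)
        n += 1  # repeating the put moves on to the next numbered buffer


if __name__ == "__main__":
    regs = DeleteRegisters()
    for i in range(1, 12):
        regs.delete("block %d" % i)
    for number, text in walk_buffers(regs):
        print(number, text)
```

After eleven deletions only the nine most recent survive, which is why text deleted long ago cannot be recovered this way.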
Thus a sequence of the form

    "1pu.u.u.

will, if repeated long enough, show you all the deleted text which has been saved for you. You can omit the u commands here to gather up all this text in the buffer, or stop after any . command to keep just the then recovered text. The command P can also be used rather than p to put the recovered text before rather than after the cursor.

6.4. Recovering lost files

If the system crashes, you can recover the work you were doing to within a few changes. You will normally receive mail when you next login giving you the name of the file which has been saved for you. You should then change to the directory where you were when the system crashed and give a command of the form:

    % vi -r name

replacing name with the name of the file which you were editing. This will recover your work to a point near where you left off.†

† In rare cases, some of the lines of the file may be lost. The editor will give you the numbers of these lines and the text of the lines will be replaced by the string ‘LOST’. These lines will almost always be among the last few which you changed. You can either choose to discard the changes which you made (if they are easy to remake) or to replace the few lost lines by hand.

You can get a listing of the files which are saved for you by giving the command:

    % vi -r

If there is more than one instance of a particular file saved, the editor gives you the newest instance each time you recover it. You can thus get an older saved copy back by first recovering the newer copies.

For this feature to work, vi must be correctly installed by a super user on your system, and the mail program must exist to receive mail. The invocation ‘vi -r’ will not always list all saved files, but they can be recovered even if they are not listed.

6.5. Continuous text input

When you are typing in large amounts of text it is convenient to have lines broken near the right margin automatically.
You can cause this to happen by giving the command :se wm=10CR. This causes all lines to be broken at a space at least 10 columns from the right hand edge of the screen.*

If the editor breaks an input line and you wish to put it back together you can tell it to join the lines with J. You can give J a count of the number of lines to be joined as in 3J to join 3 lines. The editor supplies white space, if appropriate, at the juncture of the joined lines, and leaves the cursor at this white space. You can kill the white space with x if you don’t want it.

6.6. Features for editing programs

The editor has a number of commands for editing programs. The thing that most distinguishes editing of programs from editing of text is the desirability of maintaining an indented structure to the body of the program. The editor has an autoindent facility for helping you generate correctly indented programs.

To enable this facility you can give the command :se aiCR. Now try opening a new line with o and type some characters on the line after a few tabs. If you now start another line, notice that the editor supplies white space at the beginning of the line to line it up with the previous line. You cannot backspace over this indentation, but you can use the ^D key to backtab over the supplied indentation.

Each time you type ^D you back up one position, normally to an 8 column boundary. This amount is settable; the editor has an option called shiftwidth which you can set to change this value. Try giving the command :se sw=4CR and then experimenting with autoindent again.

For shifting lines in the program left and right, there are operators < and >. These shift the lines you specify right or left by one shiftwidth. Try << and >> which shift one line left or right, and <L and >L shifting the rest of the display left and right.

If you have a complicated expression and wish to see how the parentheses match, put the cursor at a left or right parenthesis and hit %.
This will show you the matching parenthesis. This works also for braces { and }, and brackets [ and ].

If you are editing C programs, you can use the [[ and ]] keys to advance or retreat to a line starting with a {, i.e. a function declaration at a time. When ]] is used with an operator it stops after a line which starts with }; this is sometimes useful with y]].

* This feature is not available on some v2 editors. In v2 editors where it is available, the break can only occur to the right of the specified boundary instead of to the left.

6.7. Filtering portions of the buffer

You can run system commands over portions of the buffer using the operator !. You can use this to sort lines in the buffer, or to reformat portions of the buffer with a pretty-printer. Try typing in a list of random words, one per line and ending them with a blank line. Back up to the beginning of the list, and then give the command !}sortCR. This says to sort the next paragraph of material, and the blank line ends a paragraph.

6.8. Commands for editing LISP†

If you are editing a LISP program you should set the option lisp by doing :se lispCR. This changes the ( and ) commands to move backward and forward over s-expressions. The { and } commands are like ( and ) but don’t stop at atoms. These can be used to skip to the next list, or through a comment quickly.

The autoindent option works differently for LISP, supplying indent to align at the first argument to the last open list. If there is no such argument then the indent is two spaces more than the last level.

There is another option which is useful for typing in LISP, the showmatch option. Try setting it with :se smCR and then try typing a ‘(’, some words and then a ‘)’. Notice that the cursor shows the position of the ‘(’ which matches the ‘)’ briefly. This happens only if the matching ‘(’ is on the screen, and the cursor stays there for at most one second.
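The matching step behind the showmatch behaviour described above can be sketched in a few lines of Python. This is a simplified model: it only counts parentheses, ignores strings and comments, and the function name find_match is invented for the example rather than taken from the editor.

```python
def find_match(text, pos):
    """Return the index of the '(' that matches the ')' at text[pos],
    or None if there is no matching '('."""
    if text[pos] != ")":
        raise ValueError("pos must point at a ')'")
    depth = 0
    # Scan backwards, counting how many lists are still open.
    for i in range(pos, -1, -1):
        if text[i] == ")":
            depth += 1
        elif text[i] == "(":
            depth -= 1
            if depth == 0:
                return i
    return None


if __name__ == "__main__":
    expr = "(car (cdr x))"
    print(find_match(expr, 11))  # position of the inner '('
    print(find_match(expr, 12))  # position of the outer '('
```

A real editor would also restrict the search to the text currently on the screen, as the paragraph above notes.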
The editor also has an operator to realign existing lines as though they had been typed in with lisp and autoindent set. This is the = operator. Try the command =% at the beginning of a function. This will realign all the lines of the function declaration.

When you are editing LISP, the [[ and ]] advance and retreat to lines beginning with a (, and are useful for dealing with entire function definitions.

† The Lisp features are not available on some v2 editors due to memory constraints.

6.9. Macros‡

Vi has a parameterless macro facility, which lets you set it up so that when you hit a single keystroke, the editor will act as though you had hit some longer sequence of keys. You can set this up if you find yourself typing the same sequence of commands repeatedly.

Briefly, there are two flavors of macros:

a) Ones where you put the macro body in a buffer register, say x. You can then type @x to invoke the macro. The @ may be followed by another @ to repeat the last macro.

b) You can use the map command from vi (typically in your EXINIT) with a command of the form:

    :map lhs rhsCR

mapping lhs into rhs. There are restrictions: lhs should be one keystroke (either 1 character or one function key) since it must be entered within one second (unless notimeout is set, in which case you can type it as slowly as you wish, and vi will wait for you to finish it before it echoes anything). The lhs can be no longer than 10 characters, the rhs no longer than 100. To get a space, tab or newline into lhs or rhs you should escape them with a ^V. (It may be necessary to double the ^V if the map command is given inside vi, rather than in ex.) Spaces and tabs inside the rhs need not be escaped.

Thus to make the q key write and exit the editor, you can give the command

    :map q :wq^V^VCR CR

which means that whenever you type q, it will be as though you had typed the four characters :wqCR. A ^V is needed because without it the CR would end the : command, rather than becoming part of the map definition. There are two ^V's because from within vi, two ^V's must be typed to get one. The first CR is part of the rhs, the second terminates the : command.

Macros can be deleted with

    unmap lhs

If the lhs of a macro is "#0" through "#9", this maps the particular function key instead of the 2 character "#" sequence. So that terminals without function keys can access such definitions, the form "#x" will mean function key x on all terminals (and need not be typed within one second.) The character "#" can be changed by using a macro in the usual way:

    :map ^V^V^I #

to use tab, for example. (This won't affect the map command, which still uses #, but just the invocation from visual mode.)

The undo command reverses an entire macro call as a unit, if it made any changes.

Placing a '!' after the word map causes the mapping to apply to input mode, rather than command mode. Thus, to arrange for ^T to be the same as 4 spaces in input mode, you can type:

    :map! ^T ^Vbbbb

where each b stands for a blank. The ^V is necessary to prevent the blanks from being taken as white space between the lhs and rhs.

‡ The macro feature is available only in version 3 editors.

7. Word Abbreviations‡‡

A feature similar to macros in input mode is word abbreviation. This allows you to type a short word and have it expanded into a longer word or words. The commands are :abbreviate and :unabbreviate (:ab and :una) and have the same syntax as :map. For example:

    :ab eecs Electrical Engineering and Computer Sciences

causes the word 'eecs' to always be changed into the phrase 'Electrical Engineering and Computer Sciences'. Word abbreviation is different from macros in that only whole words are affected. If 'eecs' were typed as part of a larger word, it would be left alone. Also, the partial word is echoed as it is typed. There is no need for an abbreviation to be a single keystroke, as it should be with a macro.

‡‡ Version 3 only.

7.1. Abbreviations

The editor has a number of short commands which abbreviate longer commands which we have introduced here. You can find these commands easily on the quick reference card. They often save a bit of typing and you can learn them as convenient.

8. Nitty-gritty details

8.1. Line representation in the display

The editor folds long logical lines onto many physical lines in the display. Commands which advance lines advance logical lines and will skip over all the segments of a line in one motion. The command | moves the cursor to a specific column, and may be useful for getting near the middle of a long line to split it in half. Try 80| on a line which is more than 80 columns long.†

The editor only puts full lines on the display; if there is not enough room on the display to fit a logical line, the editor leaves the physical line empty, placing only an @ on the line as a place holder. When you delete lines on a dumb terminal, the editor will often just clear the lines to @ to save time (rather than rewriting the rest of the screen.) You can always maximize the information on the screen by giving the ^R command.

If you wish, you can have the editor place line numbers before each line on the display. Give the command :se nuCR to enable this, and the command :se nonuCR to turn it off. You can have tabs represented as ^I and the ends of lines indicated with '$' by giving the command :se listCR; :se nolistCR turns this off.

Finally, lines consisting of only the character '~' are displayed when the last line in the file is in the middle of the screen. These represent physical lines which are past the logical end of file.

† You can make long lines very easily by using J to join together short lines.

8.2. Counts

Most vi commands will use a preceding count to affect their behavior in some way.
The following table gives the common ways in which the counts are used:

    new window size       : / ? [[ ]] ` '
    scroll amount         ^D ^U
    line/column number    z G |
    repeat effect         most of the rest

The editor maintains a notion of the current default window size. On terminals which run at speeds greater than 1200 baud the editor uses the full terminal screen. On terminals which are slower than 1200 baud (most dialup lines are in this group) the editor uses 8 lines as the default window size. At 1200 baud the default is 16 lines.

This size is the size used when the editor clears and refills the screen after a search or other motion moves far from the edge of the current window. The commands which take a new window size as count all often cause the screen to be redrawn. If you anticipate this, but do not need as large a window as you are currently using, you may wish to change the screen size by specifying the new size before these commands. In any case, the number of lines used on the screen will expand if you move off the top with a - or similar command or off the bottom with a command such as RETURN or ^D. The window will revert to the last specified size the next time it is cleared and refilled.†

The scroll commands ^D and ^U likewise remember the amount of scroll last specified, using half the basic window size initially. The simple insert commands use a count to specify a repetition of the inserted text. Thus 10a+----ESC will insert a grid-like string of text. A few commands also use a preceding count as a line or column number.

Except for a few commands which ignore any counts (such as ^R), the rest of the editor commands use a count to indicate a simple repetition of their effect. Thus 5w advances five words on the current line, while 5RETURN advances five lines. A very useful instance of a count as a repetition is a count given to the . command, which repeats the last changing command. If you do dw and then 3., you will delete first one and then three words. You can then delete two more words with 2..

† But not by a ^L which just redraws the screen as it is.

8.3. More file manipulation commands

The following table lists the file manipulation commands which you can use when you are in vi. All of these commands are followed by a CR or ESC. The most basic commands are :w and :e. A normal editing session on a single file will end with a ZZ command. If you are editing for a long period of time you can give :w commands occasionally after major amounts of editing, and then finish with a ZZ. When you edit more than one file, you can finish with one with a :w and start editing a new file by giving a :e command, or set autowrite and use :n <file>.

    :w             write back changes
    :wq            write and quit
    :x             write (if necessary) and quit (same as ZZ)
    :e name        edit file name
    :e!            reedit, discarding changes
    :e + name      edit, starting at end
    :e +n          edit, starting at line n
    :e #           edit alternate file
    :w name        write file name
    :w! name       overwrite file name
    :x,yw name     write lines x through y to name
    :r name        read file name into buffer
    :r !cmd        read output of cmd into buffer
    :n             edit next file in argument list
    :n!            edit next file, discarding changes to current
    :n args        specify new argument list
    :ta tag        edit file containing tag tag, at tag

If you make changes to the editor's copy of a file, but do not wish to write them back, then you must give an ! after the command you would otherwise use; this forces the editor to discard any changes you have made. Use this carefully.

The :e command can be given a + argument to start at the end of the file, or a +n argument to start at line n. In actuality, n may be any editor command not containing a space, usefully a scan like +/pat or +?pat. In forming new names to the e command, you can use the character % which is replaced by the current file name, or the character # which is replaced by the alternate file name.
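The replacement of % and # in a file name argument can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the editor's code: the function name expand_file_name and the sample file names are hypothetical, and the sketch ignores any quoting rules the editor may apply.

```python
def expand_file_name(arg, current, alternate):
    """Replace % with the current file name and # with the alternate one."""
    pieces = []
    for ch in arg:
        if ch == "%":
            pieces.append(current)    # % stands for the current file name
        elif ch == "#":
            pieces.append(alternate)  # # stands for the alternate file name
        else:
            pieces.append(ch)
    return "".join(pieces)


if __name__ == "__main__":
    # With main.c as the current file and util.c as the alternate file:
    print(expand_file_name("%.bak", "main.c", "util.c"))
    print(expand_file_name("#", "main.c", "util.c"))
```

Under these assumptions, an argument such as %.bak names a backup copy of the file being edited, and a bare # names the alternate file.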
The alternate file name is generally the last name you typed other than the current file. Thus if you try to do a :e and get a diagnostic that you haven't written the file, you can give a :w command and then a :e # command to redo the previous :e.

You can write part of the buffer to a file by finding out the lines that bound the range to be written using ^G, and giving these numbers after the : and before the w, separated by ,'s. You can also mark these lines with m and then use an address of the form 'x,'y on the w command here.

You can read another file into the buffer after the current line by using the :r command. You can similarly read in the output from a command, just use !cmd instead of a file name.

If you wish to edit a set of files in succession, you can give all the names on the command line, and then edit each one in turn using the command :n. It is also possible to respecify the list of files to be edited by giving the :n command a list of file names, or a pattern to be expanded as you would have given it on the initial vi command.

If you are editing large programs, you will find the :ta command very useful. It utilizes a data base of function names and their locations, which can be created by programs such as ctags, to quickly find a function whose name you give. If the :ta command will require the editor to switch files, then you must :w or abandon any changes before switching. You can repeat the :ta command without any arguments to look for the same tag again. (The tag feature is not available in some v2 editors.)

8.4. More about searching for strings

When you are searching for strings in the file with / and ?, the editor normally places you at the next or previous occurrence of the string. If you are using an operator such as d, c or y, then you may well wish to affect lines up to the line before the line containing the pattern.
You can give a search of the form /pat/-n to refer to the n'th line before the next line containing pat, or you can use + instead of - to refer to the lines after the one containing pat. If you don't give a line offset, then the editor will affect characters up to the match place, rather than whole lines; thus use '+0' to affect to the line which matches.

You can have the editor ignore the case of words in the searches it does by giving the command :se icCR. The command :se noicCR turns this off.

Strings given to searches may actually be regular expressions. If you do not want or need this facility, you should set nomagic in your EXINIT. In this case, only the characters ↑ and $ are special in patterns. The character \ is also then special (as it is most everywhere in the system), and may be used to get at the extended pattern matching facility. It is also necessary to use a \ before a / in a forward scan or a ? in a backward scan, in any case. The following table gives the extended forms when magic is set.

    ↑         at beginning of pattern, matches beginning of line
    $         at end of pattern, matches end of line
    .         matches any character
    \<        matches the beginning of a word
    \>        matches the end of a word
    [str]     matches any single character in str
    [↑str]    matches any single character not in str
    [x-y]     matches any character between x and y
    *         matches any number of the preceding pattern

If you use nomagic mode, then the . [ and * primitives are given with a preceding \.

8.5. More about input mode

There are a number of characters which you can use to make corrections during input mode. These are summarized in the following table.

    ^H        deletes the last input character
    ^W        deletes the last input word, defined as by b
    erase     your erase character, same as ^H
    kill      your kill character, deletes the input on this line
    \         escapes a following ^H and your erase and kill
    ESC       ends an insertion
    DEL       interrupts an insertion, terminating it abnormally
    CR        starts a new line
    ^D        backtabs over autoindent
    0^D       kills all the autoindent
    ↑^D       same as 0^D, but restores indent next line
    ^V        quotes the next non-printing character into the file

The most usual way of making corrections to input is by typing ^H to correct a single character, or by typing one or more ^W's to back over incorrect words. If you use # as your erase character in the normal system, it will work like ^H.

Your system kill character, normally @, ^X or ^U, will erase all the input you have given on the current line. In general, you can neither erase input back around a line boundary nor can you erase characters which you did not insert with this insertion command. To make corrections on the previous line after a new line has been started you can hit ESC to end the insertion, move over and make the correction, and then return to where you were to continue. The command A which appends at the end of the current line is often useful for continuing.

If you wish to type in your erase or kill character (say # or @) then you must precede it with a \, just as you would do at the normal system command level. A more general way of typing non-printing characters into the file is to precede them with a ^V. The ^V echoes as a ↑ character on which the cursor rests. This indicates that the editor expects you to type a control character. In fact you may type any character and it will be inserted into the file at that point.*

If you are using autoindent you can backtab over the indent which it supplies by typing a ^D. This backs up to a shiftwidth boundary. This only works immediately after the supplied autoindent.
When you are using autoindent you may wish to place a label at the left margin of a line. The way to do this easily is to type ↑ and then ^D. The editor will move the cursor to the left margin for one line, and restore the previous indent on the next. You can also type a 0 followed immediately by a ^D if you wish to kill all the indent and not have it come back on the next line.

* This is not quite true. The implementation of the editor does not allow the NULL (^@) character to appear in files. Also the LF (linefeed or ^J) character is used by the editor to separate lines in the file, so it cannot appear in the middle of a line. You can insert any other character, however, if you wait for the editor to echo the ↑ before you type the character. In fact, the editor will treat a following letter as a request for the corresponding control character. This is the only way to type ^S or ^Q, since the system normally uses them to suspend and resume output and never gives them to the editor to process.

8.6. Upper case only terminals

If your terminal has only upper case, you can still use vi by using the normal system convention for typing on such a terminal. Characters which you normally type are converted to lower case, and you can type upper case letters by preceding them with a \. The characters { ~ } | ` are not available on such terminals, but you can escape them as \( \↑ \) \! \'. These characters are represented on the display in the same way they are typed.‡

‡ The \ character you give will not echo until you type another key.

8.7. Vi and ex

Vi is actually one mode of editing within the editor ex. When you are running vi you can escape to the line oriented editor of ex by giving the command Q. All of the : commands which were introduced above are available in ex. Likewise, most ex commands can be invoked from vi using :. Just give them without the : and follow them with a CR.

In rare instances, an internal error may occur in vi. In this case you will get a diagnostic and be left in the command mode of ex. You can then save your work and quit if you wish by giving a command x after the : which ex prompts you with, or you can reenter vi by giving ex a vi command.

There are a number of things which you can do more easily in ex than in vi. Systematic changes in line oriented material are particularly easy. You can read the advanced editing documents for the editor ed to find out a lot more about this style of editing. Experienced users often mix their use of ex command mode and vi command mode to speed the work they are doing.

8.8. Open mode: vi on hardcopy terminals and "glass tty's"‡

If you are on a hardcopy terminal or a terminal which does not have a cursor which can move off the bottom line, you can still use the command set of vi, but in a different mode. When you give a vi command, the editor will tell you that it is using open mode. This name comes from the open command in ex, which is used to get into the same mode.

‡ Not available in all v2 editors due to memory constraints.

The only difference between visual mode and open mode is the way in which the text is displayed.

In open mode the editor uses a single line window into the file, and moving backward and forward in the file causes new lines to be displayed, always below the current line. Two commands of vi work differently in open: z and ^R. The z command does not take parameters, but rather draws a window of context around the current line and then returns you to the current line.

If you are on a hardcopy terminal, the ^R command will retype the current line. On such terminals, the editor normally uses two lines to represent the current line. The first line is a copy of the line as you started to edit it, and you work on the line below this line.
When you delete characters, the editor types a number of \'s to show you the characters which are deleted. The editor also reprints the current line soon after such changes so that you can see what the line looks like again.

It is sometimes useful to use this mode on very slow terminals which can support vi in the full screen mode. You can do this by entering ex and using an open command.

Acknowledgements

Bruce Englar encouraged the early development of this display editor. Peter Kessler helped bring sanity to version 2's command layout. Bill Joy wrote versions 1 and 2.0 through 2.7, and created the framework that users see in the present editor. Mark Horton added macros and other features and made the editor work on a large number of terminals and Unix systems.

Appendix: character functions

This appendix gives the uses the editor makes of each character. The characters are presented in their order in the ASCII character set: control characters come first, then most special characters, then the digits, upper and then lower case characters.

For each character we tell a meaning it has as a command and any meaning it has during an insert. If it has only meaning as a command, then only this is discussed. Section numbers in parentheses indicate where the character is discussed; a 'f' after the section number means that the character is mentioned in a footnote.

^@        Not a command character. If typed as the first character of an insertion it is replaced with the last text inserted, and the insert terminates. Only 128 characters are saved from the last insert; if more characters were inserted the mechanism is not available. A ^@ cannot be part of the file due to the editor implementation (7.5f).

^A        Unused.

^B        Backward window. A count specifies repetition. Two lines of continuity are kept if possible (2.1, 6.1, 7.2).

^C        Unused.

^D        As a command, scrolls down a half-window of text.
A count gives the number of (logical) lines to scroll, and is remembered for future ^D and ^U commands (2.1, 7.2). During an insert, backtabs over autoindent white space at the beginning of a line (6.6, 7.5); this white space cannot be backspaced over.

^E        Exposes one more line below the current screen in the file, leaving the cursor where it is if possible. (Version 3 only.)

^F        Forward window. A count specifies repetition. Two lines of continuity are kept if possible (2.1, 6.1, 7.2).

^G        Equivalent to :fCR, printing the current file, whether it has been modified, the current line number and the number of lines in the file, and the percentage of the way through the file that you are.

^H (BS)   Same as left arrow. (See h). During an insert, eliminates the last input character, backing over it but not erasing it; it remains so you can see what you typed if you wish to type something only slightly different (3.1, 7.5).

^I (TAB)  Not a command character. When inserted it prints as some number of spaces. When the cursor is at a tab character it rests at the last of the spaces which represent the tab. The spacing of tabstops is controlled by the tabstop option (4.1, 6.6).

^J (LF)   Same as down arrow (see j).

^K        Unused.

^L        The ASCII formfeed character, this causes the screen to be cleared and redrawn. This is useful after a transmission error, if characters typed by a program other than the editor scramble the screen, or after output is stopped by an interrupt (5.4, 7.2).

^M (CR)   A carriage return advances to the next line, at the first non-white position in the line. Given a count, it advances that many lines (2.3). During an insert, a CR causes the insert to continue onto another line (3.1).

^N        Same as down arrow (see j).

^O        Unused.

^P        Same as up arrow (see k).

^Q        Not a command character. In input mode, ^Q quotes the next character, the same as ^V, except that some teletype drivers will eat the ^Q so that the editor never sees it.
^R        Redraws the current screen, eliminating logical lines not corresponding to physical lines (lines with only a single @ character on them). On hardcopy terminals in open mode, retypes the current line (5.4, 7.2, 7.8).

^S        Unused. Some teletype drivers use ^S to suspend output until ^Q is pressed.

^T        Not a command character. During an insert, with autoindent set and at the beginning of the line, inserts shiftwidth white space.

^U        Scrolls the screen up, inverting ^D which scrolls down. Counts work as they do for ^D, and the previous scroll amount is common to both. On a dumb terminal, ^U will often necessitate clearing and redrawing the screen further back in the file (2.1, 7.2).

^V        Not a command character. In input mode, quotes the next character so that it is possible to insert non-printing and special characters into the file (4.2, 7.5).

^W        Not a command character. During an insert, backs up as b would in command mode; the deleted characters remain on the display (see ^H) (7.5).

^X        Unused.

^Y        Exposes one more line above the current screen, leaving the cursor where it is if possible. (No mnemonic value for this key; however, it is next to ^U which scrolls up a bunch.) (Version 3 only.)

^Z        If supported by the Unix system, stops the editor, exiting to the top level shell. Same as :stopCR. Otherwise, unused.

^[ (ESC)  Cancels a partially formed command, such as a z when no following character has yet been given; terminates inputs on the last line (read by commands such as : / and ?); ends insertions of new text into the buffer. If an ESC is given when quiescent in command state, the editor rings the bell or flashes the screen. You can thus hit ESC if you don't know what is happening till the editor rings the bell. If you don't know if you are in insert mode you can type ESCa, and then material to be input; the material will be inserted correctly whether or not you were in insert mode when you started (1.5, 3.1, 7.5).

^\        Unused.

^]        Searches for the word which is after the cursor as a tag.
Equivalent to typing :ta, this word, and then a CR. Mnemonically, this command is "go right to" (7.3).

^↑        Equivalent to :e #CR, returning to the previous position in the last edited file, or editing a file which you specified if you got a 'No write since last change' diagnostic and do not want to have to type the file name again (7.3). (You have to do a :w before ^↑ will work in this case. If you do not wish to write the file you should do :e! #CR instead.)

^_        Unused. Reserved as the command character for the Tektronix 4025 and 4027 terminal.

SPACE     Same as right arrow (see l).

!         An operator, which processes lines from the buffer with reformatting commands. Follow ! with the object to be processed, and then the command name terminated by CR. Doubling ! and preceding it by a count causes count lines to be filtered; otherwise the count is passed on to the object after the !. Thus 2!}fmtCR reformats the next two paragraphs by running them through the program fmt. If you are working on LISP, the command !%grindCR,* given at the beginning of a function, will run the text of the function through the LISP grinder (6.7, 7.3). To read a file or the output of a command into the buffer use :r (7.3). To simply execute a command use :! (7.3).

"         Precedes a named buffer specification. There are named buffers 1-9 used for saving deleted text and named buffers a-z into which you can place text (4.3, 6.3).

#         The macro character which, when followed by a number, will substitute for a function key on terminals without function keys (6.9). In input mode, if this is your erase character, it will delete the last character you typed in input mode, and must be preceded with a \ to insert it, since it normally backs over the last input character you gave.

$         Moves to the end of the current line. If you :se listCR, then the end of each line will be shown by printing a $ after the end of the displayed text in the line.
Given a count, advances to the count'th following end of line; thus 2$ advances to the end of the following line.

%         Moves to the parenthesis or brace { } which balances the parenthesis or brace at the current cursor position.

&         A synonym for :&CR, by analogy with the ex & command.

'         When followed by a ' returns to the previous context at the beginning of a line. The previous context is set whenever the current line is moved in a non-relative way. When followed by a letter a-z, returns to the line which was marked with this letter with a m command, at the first non-white character in the line (2.2, 5.3). When used with an operator such as d, the operation takes place over complete lines; if you use `, the operation takes place from the exact marked place to the current cursor position within the line.

(         Retreats to the beginning of a sentence, or to the beginning of a LISP s-expression if the lisp option is set. A sentence ends at a . ! or ? which is followed by either the end of a line or by two spaces. Any number of closing ) ] " and ' characters may appear after the . ! or ?, and before the spaces or end of line. Sentences also begin at paragraph and section boundaries (see { and [[ below). A count advances that many sentences (4.2, 6.8).

)         Advances to the beginning of a sentence. A count repeats the effect. See ( above for the definition of a sentence (4.2, 6.8).

*         Unused.

+         Same as CR when used as a command.

,         Reverse of the last f F t or T command, looking the other way in the current line. Especially useful after hitting too many ; characters. A count repeats the search.

-         Retreats to the previous line at the first non-white character. This is the inverse of + and RETURN. If the line moved to is not on the screen, the screen is scrolled, or cleared and redrawn if this is not possible. If a large amount of scrolling would be required the screen is also cleared and redrawn, with the current line at the center (2.3).
.         Repeats the last command which changed the buffer. Especially useful when deleting words or lines; you can delete some words/lines and then hit . to delete more and more words/lines. Given a count, it passes it on to the command being repeated. Thus after a 2dw, 3. deletes three words (3.3, 6.3, 7.2, 7.4).

/         Reads a string from the last line on the screen, and scans forward for the next occurrence of this string. The normal input editing sequences may be used during the input on the bottom line; an returns to command state without ever searching. The search begins when you hit CR to terminate the pattern; the cursor moves to the beginning of the last line to indicate that the search is in progress; the search may then be terminated with a DEL or RUB, or by backspacing when at the beginning of the bottom line, returning the cursor to its initial position. Searches normally wrap end-around to find a string anywhere in the buffer.

          When used with an operator the enclosed region is normally affected. By mentioning an offset from the line matched by the pattern you can force whole lines to be affected. To do this give a pattern with a closing / and then an offset +n or -n.

          To include the character / in the search string, you must escape it with a preceding \. A ↑ at the beginning of the pattern forces the match to occur at the beginning of a line only; this speeds the search. A $ at the end of the pattern forces the match to occur at the end of a line only. More extended pattern matching is available, see section 7.4; unless you set nomagic in your .exrc file you will have to precede the characters . [ * and ~ in the search pattern with a \ to get them to work as you would naively expect (1.5, 2.2, 6.1, 7.2, 7.4).

0         Moves to the first character on the current line. Also used, in forming numbers, after an initial 1-9.

1-9       Used to form numeric arguments to commands (2.3, 7.2).

:         A prefix to a set of commands for file and option manipulation and escapes to the system. Input is given on the bottom line and terminated with a CR, and the command then executed. You can return to where you were by hitting DEL or RUB if you hit : accidentally (see primarily 6.2 and 7.3).

;         Repeats the last single character find which used f F t or T. A count iterates the basic scan (4.1).

<         An operator which shifts lines left one shiftwidth, normally 8 spaces. Like all operators, affects lines when repeated, as in <<. Counts are passed through to the basic object, thus 3<< shifts three lines (6.6, 7.2).

=         Reindents line for LISP, as though they were typed in with lisp and autoindent set (6.8).

>         An operator which shifts lines right one shiftwidth, normally 8 spaces. Affects lines when repeated as in >>. Counts repeat the basic object (6.6, 7.2).

?         Scans backwards, the opposite of /. See the / description above for details on scanning (2.2, 6.1, 7.4).

@         A macro character (6.9). If this is your kill character, you must escape it with a \ to type it in during input mode, as it normally backs over the input you have given on the current line (3.1, 3.4, 7.5).

A         Appends at the end of line, a synonym for $a (7.2).

B         Backs up a word, where words are composed of non-blank sequences, placing the cursor at the beginning of the word. A count repeats the effect (2.4).

C         Changes the rest of the text on the current line; a synonym for c$.

D         Deletes the rest of the text on the current line; a synonym for d$.

E         Moves forward to the end of a word, defined as blanks and non-blanks, like B and W. A count repeats the effect.

F         Finds a single following character, backwards in the current line. A count repeats this search that many times (4.1).

G         Goes to the line number given as preceding argument, or the end of the file if no preceding count is given.
The screen is redrawn with the new current line in the center if necessary (7.2).

H         Home arrow. Homes the cursor to the top line on the screen. If a count is given, then the cursor is moved to the count'th line on the screen. In any case the cursor is moved to the first non-white character on the line. If used as the target of an operator, full lines are affected (2.3, 3.2).

I         Inserts at the beginning of a line; a synonym for ↑i.

J         Joins together lines, supplying appropriate white space: one space between words, two spaces after a ., and no spaces at all if the first character of the joined on line is ). A count causes that many lines to be joined rather than the default two (6.5, 7.1f).

K         Unused.

L         Moves the cursor to the first non-white character of the last line on the screen. With a count, to the first non-white of the count'th line from the bottom. Operators affect whole lines when used with L (2.3).

M         Moves the cursor to the middle line on the screen, at the first non-white position on the line (2.3).

N         Scans for the next match of the last pattern given to / or ?, but in the reverse direction; this is the reverse of n.

O         Opens a new line above the current line and inputs text there up to an ESC. A count can be used on dumb terminals to specify a number of lines to be opened; this is generally obsolete, as the slowopen option works better (3.1).

P         Puts the last deleted text back before/above the cursor. The text goes back as whole lines above the cursor if it was deleted as whole lines. Otherwise the text is inserted between the characters before and at the cursor. May be preceded by a named buffer specification "x to retrieve the contents of the buffer; buffers 1-9 contain deleted material, buffers a-z are available for general use (6.3).

Q         Quits from vi to ex command mode. In this mode, whole lines form commands, ending with a RETURN. You can give all the : commands; the editor supplies the : as a prompt (7.7).
R  Replaces characters on the screen with characters you type (overlay fashion). Terminates with an ESC.
S  Changes whole lines, a synonym for cc. A count substitutes for that many lines. The lines are saved in the numeric buffers, and erased on the screen before the substitution begins.
T  Takes a single following character, locates the character before the cursor in the current line, and places the cursor just after that character. A count repeats the effect. Most useful with operators such as d (4.1).
U  Restores the current line to its state before you started changing it (3.3).
V  Unused.
W  Moves forward to the beginning of a word in the current line, where words are defined as sequences of blank/non-blank characters. A count repeats the effect (2.4).
X  Deletes the character before the cursor. A count repeats the effect, but only characters on the current line are deleted.
Y  Yanks a copy of the current line into the unnamed buffer, to be put back by a later p or P; a very useful synonym for yy. A count yanks that many lines. May be preceded by a buffer name to put lines in that buffer (7.4).
ZZ  Exits the editor. (Same as :xCR.) If any changes have been made, the buffer is written out to the current file. Then the editor quits.
[[  Backs up to the previous section boundary. A section begins at each macro in the sections option, normally a '.NH' or '.SH', and also at lines which start with a formfeed ^L. Lines beginning with { also stop [[; this makes it useful for looking backwards, a function at a time, in C programs. If the option lisp is set, stops at each ( at the beginning of a line, and is thus useful for moving backwards at the top level LISP objects (4.2, 6.1, 6.6, 7.2).
\  Unused.
]]  Forward to a section boundary, see [[ for a definition (4.2, 6.1, 6.6, 7.2).
^  Moves to the first non-white position on the current line (4.4).
_  Unused.
`  When followed by a ` returns to the previous context.
The previous context is set whenever the current line is moved in a non-relative way. When followed by a letter a-z, returns to the position which was marked with this letter with an m command. When used with an operator such as d, the operation takes place from the exact marked place to the current position within the line; if you use ', the operation takes place over complete lines (2.2, 5.3).
a  Appends arbitrary text after the current cursor position; the insert can continue onto multiple lines by using RETURN within the insert. A count causes the inserted text to be replicated, but only if the inserted text is all on one line. The insertion terminates with an ESC (3.1, 7.2).
b  Backs up to the beginning of a word in the current line. A word is a sequence of alphanumerics, or a sequence of special characters. A count repeats the effect (2.4).
c  An operator which changes the following object, replacing it with the following input text up to an ESC. If more than part of a single line is affected, the text which is changed away is saved in the numeric named buffers. If only part of the current line is affected, then the last character to be changed away is marked with a $. A count causes that many objects to be affected, thus both 3c) and c3) change the following three sentences (7.4).
d  An operator which deletes the following object. If more than part of a line is affected, the text is saved in the numeric buffers. A count causes that many objects to be affected; thus 3dw is the same as d3w (3.3, 3.4, 4.1, 7.4).
e  Advances to the end of the next word, defined as for b and w. A count repeats the effect (2.4, 3.1).
f  Finds the first instance of the next character following the cursor on the current line. A count repeats the find (4.1).
g  Unused.
Arrow keys h, j, k, l, and H.
h  Left arrow. Moves the cursor one character to the left. Like the other arrow keys, either h, the left arrow key, or one of the synonyms (^H) has the same effect.
On v2 editors, arrow keys on certain kinds of terminals (those which send escape sequences, such as vt52, c100, or hp) cannot be used. A count repeats the effect (3.1, 7.5).
i  Inserts text before the cursor, otherwise like a (7.2).
j  Down arrow. Moves the cursor one line down in the same column. If the position does not exist, vi comes as close as possible to the same column. Synonyms include ^J (linefeed) and ^N.
k  Up arrow. Moves the cursor one line up. ^P is a synonym.
l  Right arrow. Moves the cursor one character to the right. SPACE is a synonym.
m  Marks the current position of the cursor in the mark register which is specified by the next character a-z. Return to this position or use with an operator using ` or ' (5.3).
n  Repeats the last / or ? scanning commands (2.2).
o  Opens new lines below the current line; otherwise like O (3.1).
p  Puts text after/below the cursor; otherwise like P (6.3).
q  Unused.
r  Replaces the single character at the cursor with a single character you type. The new character may be a RETURN; this is the easiest way to split lines. A count replaces each of the following count characters with the single character given; see R above, which is the more usually useful iteration of r (3.2).
s  Changes the single character under the cursor to the text which follows up to an ESC; given a count, that many characters from the current line are changed. The last character to be changed is marked with $ as in c (3.2).
t  Advances the cursor up to the character before the next character typed. Most useful with operators such as d and c to delete the characters up to a following character. You can use . to delete more if this doesn't delete enough the first time (4.1).
u  Undoes the last change made to the current buffer. If repeated, will alternate between these two states, thus is its own inverse. When used after an insert which inserted text on more than one line, the lines are saved in the numeric named buffers (3.5).
v  Unused.
w  Advances to the beginning of the next word, as defined by b (2.4).
x  Deletes the single character under the cursor. With a count deletes that many characters forward from the cursor position, but only on the current line (6.5).
y  An operator, yanks the following object into the unnamed temporary buffer. If preceded by a named buffer specification, "x, the text is placed in that buffer also. Text can be recovered by a later p or P (7.4).
z  Redraws the screen with the current line placed as specified by the following character: RETURN specifies the top of the screen, . the center of the screen, and - at the bottom of the screen. A count may be given after the z and before the following character to specify the new screen size for the redraw. A count before the z gives the number of the line to place in the center of the screen instead of the default current line (5.4).
{  Retreats to the beginning of the preceding paragraph. A paragraph begins at each macro in the paragraphs option, normally '.IP', '.LP', '.PP', '.QP' and '.bp'. A paragraph also begins after a completely empty line, and at each section boundary (see [[ above) (4.2, 6.8, 7.6).
|  Places the cursor on the character in the column specified by the count (7.1, 7.2).
}  Advances to the beginning of the next paragraph. See { for the definition of paragraph (4.2, 6.8, 7.6).
~  Unused.
^? (DEL)  Interrupts the editor, returning it to command accepting state (1.5, 7.5).

Ex Reference Manual
Version 3.5/2.13 - September, 1980
William Joy
Revised for versions 3.5/2.13 by Mark Horton
Computer Science Division
Department of Electrical Engineering and Computer Science
University of California, Berkeley
Berkeley, Ca. 94720

1. Starting ex
Each instance of the editor has a set of options, which can be set to tailor it to your liking.
The command edit invokes a version of ex designed for more casual or beginning users by changing the default settings of some of these options. To simplify the description which follows we assume the default settings of the options.
When invoked, ex determines the terminal type from the TERM variable in the environment. If there is a TERMCAP variable in the environment, and the type of the terminal described there matches the TERM variable, then that description is used. Also if the TERMCAP variable contains a pathname (beginning with a /) then the editor will seek the description of the terminal in that file (rather than the default /etc/termcap). If there is a variable EXINIT in the environment, then the editor will execute the commands in that variable; otherwise if there is a file .exrc in your HOME directory ex reads commands from that file, simulating a source command. Option setting commands placed in EXINIT or .exrc will be executed before each editor session.
A command to enter ex has the following prototype:†
    ex [ - ] [ -v ] [ -t tag ] [ -r ] [ -l ] [ -wn ] [ -x ] [ -R ] [ +command ] name ...
The most common case edits a single file with no options, i.e.:
    ex name
The - command line option suppresses all interactive-user feedback and is useful in processing editor scripts in command files. The -v option is equivalent to using vi rather than ex. The -t option is equivalent to an initial tag command, editing the file containing the tag and positioning the editor at its definition. The -r option is used in recovering after an editor or system crash, retrieving the last saved version of the named file or, if no file is specified, typing a list of saved files. The -l option sets up for editing LISP, setting the showmatch and lisp options. The -w option sets the default window size to n, and is useful on dialups to start in small windows.
The -x option causes ex to prompt for a key, which is used to encrypt and decrypt the contents of the file, which should already be encrypted using the same key, see crypt(1). The -R option sets the readonly option at the start.‡ Name arguments indicate files to be edited. An argument of the form +command indicates that the editor should begin by executing the specified command. If command is omitted, then it defaults to "$", positioning the editor at the last line of the first file initially. Other useful commands here are scanning patterns of the form "/pat" or line numbers, e.g. "+100" starting at line 100.

The financial support of an IBM Graduate Fellowship and the National Science Foundation under grants MCS74-07644-A03 and MCS78-07291 is gratefully acknowledged.
† Brackets '[' ']' surround optional parameters here.

2. File manipulation
2.1. Current file
Ex is normally editing the contents of a single file, whose name is recorded in the current file name. Ex performs all editing actions in a buffer (actually a temporary file) into which the text of the file is initially read. Changes made to the buffer have no effect on the file being edited unless and until the buffer contents are written out to the file with a write command. After the buffer contents are written, the previous contents of the written file are no longer accessible. When a file is edited, its name becomes the current file name, and its contents are read into the buffer.
The current file is almost always considered to be edited. This means that the contents of the buffer are logically connected with the current file name, so that writing the current buffer contents onto that file, even if it exists, is a reasonable action. If the current file is not edited then ex will not normally write on it if it already exists.*
2.2. Alternate file
Each time a new value is given to the current file name, the previous current file name is saved as the alternate file name.
Similarly if a file is mentioned but does not become the current file, it is saved as the alternate file name.
2.3. Filename expansion
Filenames within the editor may be specified using the normal shell expansion conventions. In addition, the character '%' in filenames is replaced by the current file name and the character '#' by the alternate file name.†
2.4. Multiple files and named buffers
If more than one file is given on the command line, then the first file is edited as described above. The remaining arguments are placed with the first file in the argument list. The current argument list may be displayed with the args command. The next file in the argument list may be edited with the next command. The argument list may also be respecified by specifying a list of names to the next command. These names are expanded, the resulting list of names becomes the new argument list, and ex edits the first file on the list.
For saving blocks of text while editing, and especially when editing more than one file, ex has a group of named buffers. These are similar to the normal buffer, except that only a limited number of operations are available on them. The buffers have names a through z.‡

‡ Not available in all v2 editors due to memory constraints.
* The file command will say "[Not edited]" if the current file is not considered edited.
† This makes it easy to deal alternately with two files and eliminates the need for retyping the name supplied on an edit command after a "No write since last change" diagnostic is received.
‡ It is also possible to refer to A through Z; the upper case buffers are the same as the lower but commands append to named buffers rather than replacing if upper case names are used.

2.5. Read only
It is possible to use ex in read only mode to look at files that you have no intention of modifying. This mode protects you from accidentally overwriting the file. Read only mode is on when the readonly option is set.
It can be turned on with the -R command line option, by the view command line invocation, or by setting the readonly option. It can be cleared by setting noreadonly. It is possible to write, even while in read only mode, by indicating that you really know what you are doing. You can write to a different file, or can use the ! form of write, even while in read only mode.
3. Exceptional Conditions
3.1. Errors and interrupts
When errors occur ex (optionally) rings the terminal bell and, in any case, prints an error diagnostic. If the primary input is from a file, editor processing will terminate. If an interrupt signal is received, ex prints "Interrupt" and returns to its command level. If the primary input is a file, then ex will exit when this occurs.
3.2. Recovering from hangups and crashes
If a hangup signal is received and the buffer has been modified since it was last written out, or if the system crashes, either the editor (in the first case) or the system (after it reboots in the second) will attempt to preserve the buffer. The next time you log in you should be able to recover the work you were doing, losing at most a few lines of changes from the last point before the hangup or editor crash. To recover a file you can use the -r option. If you were editing the file resume, then you should change to the directory where you were when the crash occurred, giving the command
    ex -r resume
After checking that the retrieved file is indeed ok, you can write it over the previous contents of that file.
You will normally get mail from the system telling you when a file has been saved after a crash. The command
    ex -r
will print a list of the files which have been saved for you. (In the case of a hangup, the file will not appear in the list, although it can be recovered.)
4. Editing modes
Ex has five distinct modes. The primary mode is command mode. Commands are entered in command mode when a ':' prompt is present, and are executed each time a complete line is sent.
In text input mode ex gathers input lines and places them in the file. The append, insert, and change commands use text input mode. No prompt is printed when you are in text input mode. This mode is left by typing a '.' alone at the beginning of a line, and command mode resumes.
The last three modes are open and visual modes, entered by the commands of the same name, and, within open and visual modes, text insertion mode. Open and visual modes allow local editing operations to be performed on the text in the file. The open command displays one line at a time on any terminal while visual works on CRT terminals with random positioning cursors, using the screen as a (single) window for file editing changes. These modes are described (only) in An Introduction to Display Editing with Vi.
5. Command structure
Most command names are English words, and initial prefixes of the words are acceptable abbreviations. The ambiguity of abbreviations is resolved in favor of the more commonly used commands.*
5.1. Command parameters
Most commands accept prefix addresses specifying the lines in the file upon which they are to have effect. The forms of these addresses will be discussed below. A number of commands also may take a trailing count specifying the number of lines to be involved in the command.† Thus the command "10p" will print the tenth line in the buffer while "delete 5" will delete five lines from the buffer, starting with the current line.
Some commands take other information or parameters, this information always being given after the command name.‡
5.2. Command variants
A number of commands have two distinct variants. The variant form of the command is invoked by placing an '!' immediately after the command name. Some of the default variants may be controlled by options; in this case, the '!' serves to toggle the default.
5.3.
Flags after commands
The characters '#', 'p' and 'l' may be placed after many commands.** In this case, the command abbreviated by these characters is executed after the command completes. Since ex normally prints the new current line after each change, 'p' is rarely necessary. Any number of '+' or '-' characters may also be given with these flags. If they appear, the specified offset is applied to the current line value before the printing command is executed.
5.4. Comments
It is possible to give editor commands which are ignored. This is useful when making complex editor scripts for which comments are desired. The comment character is the double quote: ". Any command line beginning with " is ignored. Comments beginning with " may also be placed at the ends of commands, except in cases where they could be confused as part of text (shell escapes and the substitute and map commands).
5.5. Multiple commands per line
More than one command may be placed on a line by separating each pair of commands by a '|' character. However the global commands, comments, and the shell escape '!' must be the last command on a line, as they are not terminated by a '|'.
5.6. Reporting large changes
Most commands which change the contents of the editor buffer give feedback if the scope of the change exceeds a threshold given by the report option. This feedback helps to detect undesirably large changes so that they may be quickly and easily reversed with an undo. After commands with more global effect such as global or visual, you will be informed if the net change in the number of lines in the buffer during this command exceeds this threshold.

* As an example, the command substitute can be abbreviated 's' while the shortest available abbreviation for the set command is 'se'.
† Counts are rounded down if necessary.
‡ Examples would be option names in a set command, i.e. "set number", a file name in an edit command, a regular expression in a substitute command, or a target address for a copy command, i.e. "1,5 copy 25".
** A 'p' or 'l' must be preceded by a blank or tab except in the single special case 'dp'.

6. Command addressing
6.1. Addressing primitives
.  The current line. Most commands leave the current line as the last line which they affect. The default address for most commands is the current line, thus '.' is rarely used alone as an address.
n  The nth line in the editor's buffer, lines being numbered sequentially from 1.
$  The last line in the buffer.
%  An abbreviation for "1,$", the entire buffer.
+n -n  An offset relative to the current buffer line.†
/pat/ ?pat?  Scan forward and backward respectively for a line containing pat, a regular expression (as defined below). The scans normally wrap around the end of the buffer. If all that is desired is to print the next line containing pat, then the trailing / or ? may be omitted. If pat is omitted or explicitly empty, then the last regular expression specified is located.‡
'' 'x  Before each non-relative motion of the current line '.', the previous current line is marked with a tag, subsequently referred to as "''". This makes it easy to refer or return to this previous context. Marks may also be established by the mark command, using single lower case letters x and the marked lines referred to as "'x".
6.2. Combining addressing primitives
Addresses to commands consist of a series of addressing primitives, separated by ',' or ';'. Such address lists are evaluated left-to-right. When addresses are separated by ';' the current line '.' is set to the value of the previous addressing expression before the next address is interpreted. If more addresses are given than the command requires, then all but the last one or two are ignored.
If the command takes two addresses, the first addressed line must precede the second in the buffer.†
7. Command descriptions
The following form is a prototype for all ex commands:
    address command ! parameters count flags
All parts are optional; the degenerate case is the empty command which prints the next line in the file. For sanity with use from within visual mode, ex ignores a ":" preceding any command.
In the following command descriptions, the default addresses are shown in parentheses, which are not, however, part of the command.

abbreviate word rhs    abbr: ab
Add the named abbreviation to the current list. When in input mode in visual, if word is typed as a complete word, it will be changed to rhs.

† The forms '.+3', '+3' and '+++' are all equivalent; if the current line is line 100 they all address line 103.
‡ The forms \/ and \? scan using the last regular expression used in a scan; after a substitute // and ?? would scan using the substitute's regular expression.
† Null address specifications are permitted in a list of addresses; the default in this case is the current line '.', thus ',100' is equivalent to '.,100'. It is an error to give a prefix address to a command which expects none.

(.) append    abbr: a
text
Reads the input text and places it after the specified line. After the command, '.' addresses the last line input or the specified line if no lines were input. If address '0' is given, text is placed at the beginning of the buffer.

a!
text
The variant flag to append toggles the setting for the autoindent option during the input of text.

args
The members of the argument list are printed, with the current argument delimited by '[' and ']'.

(.,.) change count    abbr: c
text
Replaces the specified lines with the input text. The current line becomes the last line input; if no lines were input it is left as for a delete.

c!
text
The variant toggles autoindent during the change.
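The default addresses shown in parentheses in these descriptions follow the rules of sections 6.1 and 6.2. As a rough illustration only, the sketch below evaluates a small subset of address lists in Python: the primitives '.', '$', n, +n and -n, with ',' and ';' as separators. The function names and the `needed` parameter are invented for this example and are not part of ex; pattern scans and marks are left out.

```python
import re

def eval_term(term, dot, last):
    """Evaluate one primitive such as '.', '$', '10', '.+3', '+3' or '+++'."""
    m = re.fullmatch(r'(\.|\$|\d+)?((?:[+-]\d*)*)', term)
    if m is None:
        raise ValueError("unsupported address: " + term)
    base, offsets = m.group(1), m.group(2)
    if base is None or base == '.':
        line = dot                      # a null or '.' base means the current line
    elif base == '$':
        line = last
    else:
        line = int(base)
    for sign, digits in re.findall(r'([+-])(\d*)', offsets):
        step = int(digits) if digits else 1   # a bare '+' or '-' counts as 1
        line += step if sign == '+' else -step
    return line

def eval_addresses(spec, dot, last, needed=2):
    """Evaluate an address list left to right and keep the last `needed` values."""
    parts = re.split(r'([,;])', spec)
    terms, seps = parts[0::2], parts[1::2]
    results = []
    for i, term in enumerate(terms):
        line = eval_term(term.strip(), dot, last)
        if not 0 <= line <= last:
            raise ValueError("address out of range")
        results.append(line)
        if i < len(seps) and seps[i] == ';':
            dot = line                  # ';' resets the current line for the next address
    results = results[-needed:]
    if len(results) == 2 and results[0] > results[1]:
        raise ValueError("first address must precede the second")
    return results
```

With the current line at 10 in a 200-line buffer, `eval_addresses('2,+1', 10, 200)` yields lines 2 and 11, while `eval_addresses('2;+1', 10, 200)` yields lines 2 and 3, because the ';' moves the current line to 2 before '+1' is interpreted.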
(.,.) copy addr flags    abbr: co
A copy of the specified lines is placed after addr, which may be '0'. The current line '.' addresses the last line of the copy. The command t is a synonym for copy.

(.,.) delete buffer count flags    abbr: d
Removes the specified lines from the buffer. The line after the last line deleted becomes the current line; if the lines deleted were originally at the end, the new last line becomes the current line. If a named buffer is specified by giving a letter, then the specified lines are saved in that buffer, or appended to it if an upper case letter is used.

edit file    abbr: e
ex file
Used to begin an editing session on a new file. The editor first checks to see if the buffer has been modified since the last write command was issued. If it has been, a warning is issued and the command is aborted. The command otherwise deletes the entire contents of the editor buffer, makes the named file the current file and prints the new filename. After ensuring that this file is sensible† the editor reads the file into its buffer.
If the read of the file completes without error, the number of lines and characters read is typed. If there were any non-ASCII characters in the file they are stripped of their non-ASCII high bits, and any null characters in the file are discarded. If none of these errors occurred, the file is considered edited. If the last line of the input file is missing the trailing newline character, it will be supplied and a complaint will be issued. This command leaves the current line '.' at the last line read.‡

† I.e., that it is not a binary file such as a directory, a block or character special file other than /dev/tty, a terminal, or a binary or executable file (as indicated by the first word).

e! file
The variant form suppresses the complaint about modifications having been made and not written from the editor buffer, thus discarding all changes which have been made before editing the new file.
e +n file
Causes the editor to begin at line n rather than at the last line; n may also be an editor command containing no spaces, e.g.: "+/pat".

file    abbr: f
Prints the current file name, whether it has been '[Modified]' since the last write command, whether it is read only, the current line, the number of lines in the buffer, and the percentage of the way through the buffer of the current line.*

file file
The current file name is changed to file which is considered '[Not edited]'.

(1,$) global /pat/ cmds    abbr: g
First marks each line among those specified which matches the given regular expression. Then the given command list is executed with '.' initially set to each marked line.
The command list consists of the remaining commands on the current input line and may continue to multiple lines by ending all but the last such line with a '\'. If cmds (and possibly the trailing / delimiter) is omitted, each line matching pat is printed. Append, insert, and change commands and associated input are permitted; the '.' terminating input may be omitted if it would be on the last line of the command list. Open and visual commands are permitted in the command list and take input from the terminal.
The global command itself may not appear in cmds. The undo command is also not permitted there, as undo instead can be used to reverse the entire global command. The options autoprint and autoindent are inhibited during a global, and the value of the report option is temporarily infinite, in deference to a report for the entire global. Finally, the context mark "''" is set to the value of '.' before the global command begins and is not changed during a global command, except perhaps by an open or visual within the global.

g! /pat/ cmds    abbr: v
The variant form of global runs cmds at each line not matching pat.

(.) insert    abbr: i
text
Places the given text before the specified line.
The current line is left at the last line input; if there were none input it is left at the line before the addressed line. This command differs from append only in the placement of text.

‡ If executed from within open or visual, the current line is initially the first line of the file.
* In the rare case that the current file is '[Not edited]' this is noted also; in this case you have to use the form w! to write to the file, since the editor is not sure that a write will not destroy a file unrelated to the current contents of the buffer.

i!
text
The variant toggles autoindent during the insert.

(.,.+1) join count flags    abbr: j
Places the text from a specified range of lines together on one line. White space is adjusted at each junction to provide at least one blank character, two if there was a '.' at the end of the line, or none if the first following character is a ')'. If there is already white space at the end of the line, then the white space at the start of the next line will be discarded.

j!
The variant causes a simpler join with no white space processing; the characters in the lines are simply concatenated.

(.) k x
The k command is a synonym for mark. It does not require a blank or tab before the following letter.

(.,.) list count flags
Prints the specified lines in a more unambiguous way: tabs are printed as '^I' and the end of each line is marked with a trailing '$'. The current line is left at the last line printed.

map lhs rhs
The map command is used to define macros for use in visual mode. Lhs should be a single character, or the sequence "#n", for n a digit, referring to function key n. When this character or function key is typed in visual mode, it will be as though the corresponding rhs had been typed. On terminals without function keys, you can type "#n". See section 6.9 of the "Introduction to Display Editing with Vi" for more details.

(.) mark x
Gives the specified line mark x, a single lower case letter.
The x must be preceded by a blank or a tab. The addressing form "'x" then addresses this line. The current line is not affected by this command.

(.,.) move addr    abbr: m
The move command repositions the specified lines to be after addr. The first of the moved lines becomes the current line.

next    abbr: n
The next file from the command line argument list is edited.

n!
The variant suppresses warnings about the modifications to the buffer not having been written out, discarding (irretrievably) any changes which may have been made.

n filelist
n +command filelist
The specified filelist is expanded and the resulting list replaces the current argument list; the first file in the new list is then edited. If command is given (it must contain no spaces), then it is executed after editing the first such file.

(.,.) number count flags    abbr: # or nu
Prints each specified line preceded by its buffer line number. The current line is left at the last line printed.

(.) open flags    abbr: o
(.) open /pat/ flags
Enters intraline editing open mode at each addressed line. If pat is given, then the cursor will be placed initially at the beginning of the string matched by the pattern. To exit this mode use Q. See An Introduction to Display Editing with Vi for more details.†

preserve
The current editor buffer is saved as though the system had just crashed. This command is for use only in emergencies when a write command has resulted in an error and you don't know how to save your work. After a preserve you should seek help.

(.,.) print count    abbr: p or P
Prints the specified lines with non-printing characters printed as control characters '^x'; delete (octal 177) is represented as '^?'. The current line is left at the last line printed.

(.) put buffer    abbr: pu
Puts back previously deleted or yanked lines. Normally used with delete to effect movement of lines, or with yank to effect duplication of lines.
If no buffer is specified, then the last deleted or yanked text is restored.* By using a named buffer, text may be restored that was saved there at any previous time.

quit    abbr: q
Causes ex to terminate. No automatic write of the editor buffer to a file is performed. However, ex issues a warning message if the file has changed since the last write command was issued, and does not quit.‡ Normally, you will wish to save your changes, and you should give a write command; if you wish to discard them, use the q! command variant.

q!
Quits from the editor, discarding changes to the buffer without complaint.

(.) read file    abbr: r
Places a copy of the text of the given file in the editing buffer after the specified line. If no file is given the current file name is used. The current file name is not changed unless there is none, in which case file becomes the current name. The sensibility restrictions for the edit command apply here also. If the file buffer is empty and there is no current name then ex treats this as an edit command.
Address '0' is legal for this command and causes the file to be read at the beginning of the buffer. Statistics are given as for the edit command when the read successfully terminates. After a read the current line is the last line read.‡

† Not available in all v2 editors due to memory constraints.
* But no modifying commands may intervene between the delete or yank and the put, nor may lines be moved between files without using a named buffer.
‡ Ex will also issue a diagnostic if there are more files in the argument list.

(.) read !command
Reads the output of the command command into the buffer after the specified line. This is not a variant form of the command, rather a read specifying a command rather than a filename; a blank or tab before the ! is mandatory.

recover file
Recovers file from the system save area.
Used after an accidental hangup of the phone, a system crash, or a preserve command. Except when you use preserve you will be notified by mail when a file is saved.

rewind    abbr: rew
The argument list is rewound, and the first file in the list is edited.

rew!
Rewinds the argument list discarding any changes made to the current buffer.

set parameter
With no arguments, prints those options whose values have been changed from their defaults; with parameter all it prints all of the option values.

Giving an option name followed by a ‘?’ causes the current value of that option to be printed. The ‘?’ is unnecessary unless the option is Boolean valued. Boolean options are given values either by the form ‘set option’ to turn them on or ‘set nooption’ to turn them off; string and numeric options are assigned via the form ‘set option=value’.

More than one parameter may be given to set; they are interpreted left-to-right.

shell    abbr: sh
A new shell is created. When it terminates, editing resumes.

source file    abbr: so
Reads and executes commands from the specified file. Source commands may be nested.

( . , . ) substitute /pat/repl/ options count flags    abbr: s
On each specified line, the first instance of pattern pat is replaced by replacement pattern repl. If the global indicator option character ‘g’ appears, then all instances are substituted; if the confirm indication character ‘c’ appears, then before each substitution the line to be substituted is typed with the string to be substituted marked with ‘^’ characters. By typing a ‘y’ one can cause the substitution to be performed; any other input causes no change to take place. After a substitute the current line is the last line substituted.

Lines may be split by substituting new-line characters into them. The newline in repl must be escaped by preceding it with a ‘\’. Other metacharacters available in pat and repl are described below.
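The first-instance rule, the ‘g’ option, and line splitting can be tried outside the editor. The following is only a sketch: it uses the s function of sed (described later in this volume) as a non-interactive stand-in for the ex substitute command, on the assumption that the two agree for these simple cases. The sample text and variable names are invented for illustration. Inside ex the corresponding commands would be s/the/a/ and s/the/a/g.

```shell
# Hypothetical sample line; not from the manual.
line='the cat and the hat'

# Without g, only the first instance on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/the/a/')

# With g, every instance on the line is replaced.
every=$(printf '%s\n' "$line" | sed 's/the/a/g')

# Splitting a line: the newline in the replacement is escaped
# by the backslash that precedes it.
split=$(printf '%s\n' "$line" | sed 's/ and /\
/')

printf '%s\n' "$first_only" "$every" "$split"
```

The three results are "a cat and the hat", "a cat and a hat", and the two lines "the cat" and "the hat".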
‡ Within open and visual the current line is set to the first line read rather than the last.

stop
Suspends the editor, returning control to the top level shell. If autowrite is set and there are unsaved changes, a write is done first unless the form stop! is used. This command is only available where supported by the teletype driver and operating system.

( . , . ) substitute options count flags    abbr: s
If pat and repl are omitted, then the last substitution is repeated. This is a synonym for the & command.

( . , . ) t addr flags
The t command is a synonym for copy.

ta tag
The focus of editing switches to the location of tag, switching to a different line in the current file where it is defined, or if necessary to another file.†

The tags file is normally created by a program such as ctags, and consists of a number of lines with three fields separated by blanks or tabs. The first field gives the name of the tag, the second the name of the file where the tag resides, and the third gives an addressing form which can be used by the editor to find the tag; this field is usually a contextual scan using ‘/pat/’ to be immune to minor changes in the file. Such scans are always performed as if nomagic was set.

The tag names in the tags file must be sorted alphabetically.‡

unabbreviate word    abbr: una
Delete word from the list of abbreviations.

undo    abbr: u
Reverses the changes made in the buffer by the last buffer editing command. Note that global commands are considered a single command for the purpose of undo (as are open and visual). Also, the commands write and edit which interact with the file system cannot be undone. Undo is its own inverse.

Undo always marks the previous value of the current line ‘.’ as ‘''’. After an undo the current line is the first line restored or the line before the first line deleted if no lines were restored.
For commands with more global effect such as global and visual the current line regains its pre-command value after an undo.

unmap lhs
The macro expansion associated by map for lhs is removed.

( 1 , $ ) v /pat/ cmds
A synonym for the global command variant g!, running the specified cmds on each line which does not match pat.

† If you have modified the current file before giving a tag command, you must write it out; giving another tag command, specifying no tag, will reuse the previous tag.
‡ Not available in all v2 editors due to memory constraints.

version    abbr: ve
Prints the current version number of the editor as well as the date the editor was last changed.

( . ) visual type count flags    abbr: vi
Enters visual mode at the specified line. Type is optional and may be ‘-’, ‘^’ or ‘.’ as in the z command to specify the placement of the specified line on the screen. By default, if type is omitted, the specified line is placed as the first on the screen. A count specifies an initial window size; the default is the value of the option window. See the document An Introduction to Display Editing with Vi for more details. To exit this mode, type Q.

visual file
visual +n file
From visual mode, this command is the same as edit.

( 1 , $ ) write file    abbr: w
Writes changes made back to file, printing the number of lines and characters written. Normally file is omitted and the text goes back where it came from. If a file is specified, then text will be written to that file.* If the file does not exist it is created. The current file name is changed only if there is no current file name; the current line is never changed.

If an error occurs while writing the current and edited file, the editor considers that there has been “No write since last change” even if the buffer had not previously been modified.

( 1 , $ ) write>> file    abbr: w>>
Writes the buffer contents at the end of an existing file.

w! name
Overrides the checking of the normal write command, and will write to any file which the system permits.

( 1 , $ ) w !command
Writes the specified lines into command. Note the difference between w! which overrides checks and w ! which writes to a command.

wq name
Like a write and then a quit command.

wq! name
The variant overrides checking on the sensibility of the write command, as w! does.

xit name
If any changes have been made and not written, writes the buffer out. Then, in any case, quits.

* The editor writes to a file only if it is the current file and is edited, if the file does not exist, or if the file is actually a teletype, /dev/tty, /dev/null. Otherwise, you must give the variant form w! to force the write.

( . , . ) yank buffer count    abbr: ya
Places the specified lines in the named buffer, for later retrieval via put. If no buffer name is specified, the lines go to a more volatile place; see the put command description.

( .+1 ) z count
Print the next count lines, default window.

( . ) z type count
Prints a window of text with the specified line at the top. If type is ‘-’ the line is placed at the bottom; a ‘.’ causes the line to be placed in the center.* A count gives the number of lines to be displayed rather than double the number specified by the scroll option. On a CRT the screen is cleared before display begins unless a count which is less than the screen size is given. The current line is left at the last line printed.

! command
The remainder of the line after the ‘!’ character is sent to a shell to be executed. Within the text of command the characters ‘%’ and ‘#’ are expanded as in filenames and the character ‘!’ is replaced with the text of the previous command. Thus, in particular, ‘!!’ repeats the last such shell escape. If any such expansion is performed, the expanded line will be echoed. The current line is unchanged by this command.
If there has been “[No write]” of the buffer contents since the last change to the editing buffer, then a diagnostic will be printed before the command is executed as a warning. A single ‘!’ is printed when the command completes.

( addr , addr ) ! command
Takes the specified address range and supplies it as standard input to command; the resulting output then replaces the input lines.

( $ ) =
Prints the line number of the addressed line. The current line is unchanged.

( . , . ) > count flags
( . , . ) < count flags
Perform intelligent shifting on the specified lines; < shifts left and > shifts right. The quantity of shift is determined by the shiftwidth option and the repetition of the specification character. Only white space (blanks and tabs) is shifted; no non-white characters are discarded in a left-shift. The current line becomes the last line which changed due to the shifting.

^D
An end-of-file from a terminal input scrolls through the file. The scroll option specifies the size of the scroll, normally a half screen of text.

( .+1 , .+1 )
( .+1 , .+1 ) |
An address alone causes the addressed lines to be printed. A blank line prints the next line in the file.

* Forms ‘z=’ and ‘z^’ also exist; ‘z=’ places the current line in the center, surrounds it with lines of ‘-’ characters and leaves the current line at this line. The form ‘z^’ prints the window before ‘z-’ would. The characters ‘+’, ‘^’ and ‘-’ may be repeated for cumulative effect. On some v2 editors, no type may be given.

( . , . ) & options count flags
Repeats the previous substitute command.

( . , . ) ~ options count flags
Replaces the previous regular expression with the previous replacement pattern from a substitution.

8. Regular expressions and substitute replacement patterns

8.1. Regular expressions

A regular expression specifies a set of strings of characters. A member of this set of strings is said to be matched by the regular expression. Ex remembers two previous regular expressions: the previous regular expression used in a substitute command and the previous regular expression used elsewhere (referred to as the previous scanning regular expression). The previous regular expression can always be referred to by a null re, e.g. ‘//’ or ‘??’.

8.2. Magic and nomagic

The regular expressions allowed by ex are constructed in one of two ways depending on the setting of the magic option. The ex and vi default setting of magic gives quick access to a powerful set of regular expression metacharacters. The disadvantage of magic is that the user must remember that these metacharacters are magic and precede them with the character ‘\’ to use them as “ordinary” characters. With nomagic, the default for edit, regular expressions are much simpler, there being only two metacharacters. The power of the other metacharacters is still available by preceding the (now) ordinary character with a ‘\’. Note that ‘\’ is thus always a metacharacter.

The remainder of the discussion of regular expressions assumes that the setting of this option is magic.†

8.3. Basic regular expression summary

The following basic constructs are used to construct magic mode regular expressions.

char    An ordinary character matches itself. The characters ‘^’ at the beginning of a line, ‘$’ at the end of line, ‘*’ as any character other than the first, ‘.’, ‘\’, ‘[’, and ‘~’ are not ordinary characters and must be escaped (preceded) by ‘\’ to be treated as such.

^    At the beginning of a pattern forces the match to succeed only at the beginning of a line.

$    At the end of a regular expression forces the match to succeed only at the end of the line.

.    Matches any single character except the new-line character.

\<    Forces the match to occur only at the beginning of a “variable” or “word”; that is, either at the beginning of a line, or just before a letter, digit, or underline and after a character not one of these.

\>    Similar to ‘\<’, but matching the end of a “variable” or “word”, i.e. either the end of the line or before a character which is neither a letter, nor a digit, nor the underline character.

[string]    Matches any (single) character in the class defined by string. Most characters in string define themselves. A pair of characters separated by ‘-’ in string defines the set of characters collating between the specified lower and upper bounds, thus ‘[a-z]’ as a regular expression matches any (single) lower-case letter. If the first character of string is a ‘^’ then the construct matches those characters which it otherwise would not; thus ‘[^a-z]’ matches anything but a lower-case letter (and of course a newline). To place any of the characters ‘^’, ‘[’, or ‘-’ in string you must escape them with a preceding ‘\’.

† To discern what is true with nomagic it suffices to remember that the only special characters in this case will be ‘^’ at the beginning of a regular expression, ‘$’ at the end of a regular expression, and ‘\’. With nomagic the characters ‘~’ and ‘&’ also lose their special meanings related to the replacement pattern of a substitute.

8.4. Combining regular expression primitives

The concatenation of two regular expressions matches the leftmost and then longest string which can be divided with the first piece matching the first regular expression and the second piece matching the second. Any of the (single character matching) regular expressions mentioned above may be followed by the character ‘*’ to form a regular expression which matches any number of adjacent occurrences (including 0) of characters matched by the regular expression it follows.

The character ‘~’ may be used in a regular expression, and matches the text which defined the replacement part of the last substitute command. A regular expression may be enclosed between the sequences ‘\(’ and ‘\)’ with side effects in the substitute replacement patterns.

8.5. Substitute replacement patterns

The basic metacharacters for the replacement pattern are ‘&’ and ‘~’; these are given as ‘\&’ and ‘\~’ when nomagic is set. Each instance of ‘&’ is replaced by the characters which the regular expression matched. The metacharacter ‘~’ stands, in the replacement pattern, for the defining text of the previous replacement pattern.

Other metasequences possible in the replacement pattern are always introduced by the escaping character ‘\’. The sequence ‘\n’ is replaced by the text matched by the n-th regular subexpression enclosed between ‘\(’ and ‘\)’.† The sequences ‘\u’ and ‘\l’ cause the immediately following character in the replacement to be converted to upper- or lower-case respectively if this character is a letter. The sequences ‘\U’ and ‘\L’ turn such conversion on, either until ‘\E’ or ‘\e’ is encountered, or until the end of the replacement pattern.

† When nested, parenthesized subexpressions are present, n is determined by counting occurrences of \( starting from the left.

9. Option descriptions

autoindent, ai    default: noai
Can be used to ease the preparation of structured program text. At the beginning of each append, change or insert command or when a new line is opened or created by an append, change, insert, or substitute operation within open or visual mode, ex looks at the line being appended after, the first line changed or the line inserted before and calculates the amount of white space at the start of the line. It then aligns the cursor at the level of indentation so determined.

If the user then types lines of text in, they will continue to be justified at the displayed indenting level. If more white space is typed at the beginning of a line, the following line will start aligned with the first non-white character of the previous line. To back the cursor up to the preceding tab stop one can hit ^D. The tab stops going backwards are defined at multiples of the shiftwidth option. You cannot backspace over the indent, except by sending an end-of-file with a ^D.

Specially processed in this mode is a line with no characters added to it, which turns into a completely blank line (the white space provided for the autoindent is discarded). Also specially processed in this mode are lines beginning with a ‘^’ and immediately followed by a ^D. This causes the input to be repositioned at the beginning of the line, but retaining the previous indent for the next line. Similarly, a ‘0’ followed by a ^D repositions at the beginning but without retaining the previous indent.

Autoindent doesn’t happen in global commands or when the input is not a terminal.

autoprint, ap    default: ap
Causes the current line to be printed after each delete, copy, join, move, substitute, t, undo or shift command. This has the same effect as supplying a trailing ‘p’ to each such command. Autoprint is suppressed in globals, and only applies to the last of many commands on a line.

autowrite, aw    default: noaw
Causes the contents of the buffer to be written to the current file if you have modified it and give a next, rewind, stop, tag, or ! command, or a ^^ (switch files) or ^] (tag goto) command in visual. Note that the edit and ex commands do not autowrite. In each case, there is an equivalent way of switching when autowrite is set to avoid the autowrite (edit for next, rewind! for rewind, stop! for stop, tag! for tag, shell for !, and :e # and a :ta! command from within visual).

beautify, bf    default: nobeautify
Causes all control characters except tab, newline and form-feed to be discarded from the input. A complaint is registered the first time a backspace character is discarded. Beautify does not apply to command input.

directory, dir    default: dir=/tmp
Specifies the directory in which ex places its buffer file. If this directory is not writable, then the editor will exit abruptly when it fails to be able to create its buffer there.
edcompatible    default: noedcompatible
Causes the presence or absence of g and c suffixes on substitute commands to be remembered, and to be toggled by repeating the suffixes. The suffix r makes the substitution be as in the ~ command, instead of like &.‡‡

errorbells, eb    default: noeb
Error messages are preceded by a bell.* If possible the editor always places the error message in a standout mode of the terminal (such as inverse video) instead of ringing the bell.

hardtabs, ht    default: ht=8
Gives the boundaries on which terminal hardware tabs are set (or on which the system expands tabs).

ignorecase, ic    default: noic
All upper case characters in the text are mapped to lower case in regular expression matching. In addition, all upper case characters in regular expressions are mapped to lower case except in character class specifications.

‡‡ Version 3 only.
* Bell ringing in open and visual on errors is not suppressed by setting noeb.

lisp    default: nolisp
Autoindent indents appropriately for lisp code, and the ( ) { } [[ and ]] commands in open and visual are modified to have meaning for lisp.

list    default: nolist
All printed lines will be displayed (more) unambiguously, showing tabs and end-of-lines as in the list command.

magic    default: magic for ex and vi†
If nomagic is set, the number of regular expression metacharacters is greatly reduced, with only ‘^’ and ‘$’ having special effects. In addition the metacharacters ‘~’ and ‘&’ of the replacement pattern are treated as normal characters. All the normal metacharacters may be made magic when nomagic is set by preceding them with a ‘\’.

mesg    default: mesg
Causes write permission to be turned off to the terminal while you are in visual mode, if nomesg is set.‡‡

number, nu    default: nonumber
Causes all output lines to be printed with their line numbers. In addition each input line will be prompted for by supplying the line number it will have.
open    default: open
If noopen, the commands open and visual are not permitted. This is set for edit to prevent confusion resulting from accidental entry to open or visual mode.

optimize, opt    default: optimize
Throughput of text is expedited by setting the terminal to not do automatic carriage returns when printing more than one (logical) line of output, greatly speeding output on terminals without addressable cursors when text with leading white space is printed.

paragraphs, para    default: para=IPLPPPQPP LIbp
Specifies the paragraphs for the { and } operations in open and visual. The pairs of characters in the option’s value are the names of the macros which start paragraphs.

prompt    default: prompt
Command mode input is prompted for with a ‘:’.

redraw    default: noredraw
The editor simulates (using great amounts of output) an intelligent terminal on a dumb terminal (e.g. during insertions in visual the characters to the right of the cursor position are refreshed as each input character is typed). Useful only at very high speed.

remap    default: remap
If on, macros are repeatedly tried until they are unchanged.‡‡ For example, if o is mapped to O, and O is mapped to I, then if remap is set, o will map to I, but if noremap is set, it will map to O.

† Nomagic for edit.
‡‡ Version 3 only.

report    default: report=5†
Specifies a threshold for feedback from commands. Any command which modifies more than the specified number of lines will provide feedback as to the scope of its changes. For commands such as global, open, undo, and visual which have potentially more far reaching scope, the net change in the number of lines in the buffer is presented at the end of the command, subject to this same threshold. Thus notification is suppressed during a global command on the individual commands performed.
scroll    default: scroll=½ window
Determines the number of logical lines scrolled when an end-of-file is received from a terminal input in command mode, and the number of lines printed by a command mode z command (double the value of scroll).

sections    default: sections=SHNHH HU
Specifies the section macros for the [[ and ]] operations in open and visual. The pairs of characters in the option’s value are the names of the macros which start paragraphs.

shell, sh    default: sh=/bin/sh
Gives the path name of the shell forked for the shell escape command ‘!’, and by the shell command. The default is taken from SHELL in the environment, if present.

shiftwidth, sw    default: sw=8
Gives the width of a software tab stop, used in reverse tabbing with ^D when using autoindent to append text, and by the shift commands.

showmatch, sm    default: nosm
In open and visual mode, when a ) or } is typed, move the cursor to the matching ( or { for one second if this matching character is on the screen. Extremely useful with lisp.

slowopen, slow    terminal dependent
Affects the display algorithm used in visual mode, holding off display updating during input of new text to improve throughput when the terminal in use is both slow and unintelligent. See An Introduction to Display Editing with Vi for more details.

tabstop, ts    default: ts=8
The editor expands tabs in the input file to be on tabstop boundaries for the purposes of display.

taglength, tl    default: tl=0
Tags are not significant beyond this many characters. A value of zero (the default) means that all characters are significant.

tags    default: tags=tags /usr/lib/tags
A path of files to be used as tag files for the tag command.‡‡ A requested tag is searched for in the specified files, sequentially. By default (even in version 2) files called tags are searched for in the current directory and in /usr/lib (a master file for the entire system).

† 2 for edit.
‡‡ Version 3 only.

term    from environment TERM
The terminal type of the output device.

terse    default: noterse
Shorter error diagnostics are produced for the experienced user.

warn    default: warn
Warn if there has been ‘[No write since last change]’ before a ‘!’ command escape.

window    default: window=speed dependent
The number of lines in a text window in the visual command. The default is 8 at slow speeds (600 baud or less), 16 at medium speed (1200 baud), and the full screen (minus one line) at higher speeds.

w300, w1200, w9600
These are not true options but set window only if the speed is slow (300), medium (1200), or high (9600), respectively. They are suitable for an EXINIT and make it easy to change the 8/16/full screen rule.

wrapscan, ws    default: ws
Searches using the regular expressions in addressing will wrap around past the end of the file.

wrapmargin, wm    default: wm=0
Defines a margin for automatic wrapover of text during input in open and visual modes. See An Introduction to Text Editing with Vi for details.

writeany, wa    default: nowa
Inhibit the checks normally made before write commands, allowing a write to any file which the system protection mechanism will allow.

10. Limitations

Editor limits that the user is likely to encounter are as follows: 1024 characters per line, 256 characters per global command list, 128 characters per file name, 128 characters in the previous inserted and deleted text in open or visual, 100 characters in a shell escape command, 63 characters in a string valued option, and 30 characters in a tag name. A limit of 250000 lines in the file is silently enforced.

The visual implementation limits the number of macros defined with map to 32, and the total number of characters in macros to be less than 512.

Acknowledgments. Chuck Haley contributed greatly to the early development of ex. Bruce Englar encouraged the redesign which led to ex version 1.
Bill Joy wrote versions 1 and 2.0 through 2.7, and created the framework that users see in the present editor. Mark Horton added macros and other features and made the editor work on a large number of terminals and Unix systems.

Ex changes - Version 3.1 to 3.5

This update describes the new features and changes which have been made in converting from version 3.1 to 3.5 of ex. Each change is marked with the first version where it appeared.

Update to Ex Reference Manual

Command line options

3.4  A new command called view has been created. View is just like vi but it sets readonly.

3.4  The encryption code from the v7 editor is now part of ex. You can invoke ex with the -x option and it will ask for a key, as ed. The ed x command (to enter encryption mode from within the editor) is not available. This feature may not be available in all instances of ex due to memory limitations.

Commands

3.4  Provisions to handle the new process stopping features of the Berkeley TTY driver have been added. A new command, stop, takes you out of the editor cleanly and efficiently, returning you to the shell. Resuming the editor puts you back in command or visual mode, as appropriate. If autowrite is set and there are outstanding changes, a write is done first unless you say “stop!”.

3.4  A :vi <file> command from visual mode is now treated the same as a :edit <file> or :ex <file> command. The meaning of the vi command from ex command mode is not affected.

3.3  A new command mode command xit (abbreviated x) has been added. This is the same as wq but will not bother to write if there have been no changes to the file.

Options

3.4  A read only mode now lets you guarantee you won’t clobber your file by accident. You can set the on/off option readonly (ro), and writes will fail unless you use an ! after the write. Commands such as x, ZZ, the autowrite option, and in general anything that writes is affected.
This option is turned on if you invoke ex with the -R flag.

3.4  The wrapmargin option is now usable. The way it works has been completely revamped. Now if you go past the margin (even in the middle of a word) the entire word is erased and rewritten on the next line. This changes the semantics of the number given to wrapmargin. 0 still means off. Any other number is still a distance from the right edge of the screen, but this location is now the right edge of the area where wraps can take place, instead of the left edge. Wrapmargin now behaves much like fill/nojustify mode in nroff.

3.3  The options w300, w1200, and w9600 can be set. They are synonyms for window, but only apply at 300, 1200, or 9600 baud, respectively. Thus you can specify you want a 12 line window at 300 baud and a 23 line window at 1200 baud in your EXINIT with

    :set w300=12 w1200=23

3.3  The new option timeout (default on) causes macros to time out after one second. Turn it off and they will wait forever. This is useful if you want multi character macros, but if your terminal sends escape sequences for arrow keys, it will be necessary to hit escape twice to get a beep.

3.3  The new option remap (default on) causes the editor to attempt to map the result of a macro mapping again until the mapping fails. This makes it possible, say, to map q to # and #1 to something else and get q1 mapped to something else. Turning it off makes it possible to map ^L to l and map ^R to ^L without having ^R map to l.

3.3  The new (string) valued option tags allows you to specify a list of tag files, similar to the “path” variable of csh. The files are separated by spaces (which are entered preceded by a backslash) and are searched left to right. The default value is “tags /usr/lib/tags”, which has the same effect as before. It is recommended that “tags” always be the first entry.
On Ernie CoVax, /usr/lib/tags contains entries for the system defined library procedures from section 3 of the manual.

Environment enquiries

3.4  The editor now adopts the convention that a null string in the environment is the same as not being set. This applies to TERM, TERMCAP, and EXINIT.

Vi Tutorial Update

Deleted features

3.3  The “q” command from visual no longer works at all. You must use “Q” to get to ex command mode. The “q” command was deleted because of user complaints about hitting it by accident too often.

3.5  The provisions for changing the window size with a numeric prefix argument to certain visual commands have been deleted. The correct way to change the window size is to use the z command, for example z5<cr> to change the window to 5 lines.

3.3  The option “mapinput” is dead. It has been replaced by a much more powerful mechanism: “:map!”.

Change in default option settings

3.3  The default window sizes have been changed. At 300 baud the window is now 8 lines (it was 1/2 the screen size). At 1200 baud the window is now 16 lines (it was 2/3 the screen size, which was usually also 16 for a typical 24 line CRT). At 9600 baud the window is still the full screen size. Any baud rate less than 1200 behaves like 300, any over 1200 like 9600. This change makes vi more usable on a large screen at slow speeds.

Vi commands

3.3  The command “ZZ” from vi is the same as “:x<cr>”. This is the recommended way to leave the editor. Z must be typed twice to avoid hitting it accidentally.

3.4  The command ^Z is the same as “:stop<cr>”. Note that if you have an arrow key that sends ^Z the stop function will take priority over the arrow function. If you have your “susp” character set to something besides ^Z, that key will be honored as well.

3.3  It is now possible from visual to string several search expressions together separated by semicolons the same as command mode. For example, you can say /foo/;/bar from visual and it will move to the first “bar” after the next “foo”.
This also works within one line.

3.3  ^R is now the same as ^L on terminals where the right arrow key sends ^L. (This includes the Televideo 912/920 and the ADM 31 terminals.)

3.4  The visual page motion commands ^F and ^B now treat any preceding counts as number of pages to move, instead of changes to the window size. That is, 2^F moves forward 2 pages.

Macros

3.3  The “mapinput” mechanism of version 3.1 has been replaced by a more powerful mechanism. An “!” can follow the word “map” in the map command. Map!ed macros only apply during input mode, while map’ed macros only apply during command mode. Using “map” or “map!” by itself produces a listing of macros in the corresponding mode.

3.4  A word abbreviation mode is now available. You can define abbreviations with the abbreviate command

    :abbr foo find outer otter

which maps “foo” to “find outer otter”. Abbreviations can be turned off with the unabbreviate command. The syntax of these commands is identical to the map and unmap commands, except that the ! forms do not exist. Abbreviations are considered when in visual input mode only, and only affect whole words typed in, using the conservative definition. (Thus “foobar” will not be mapped as it would using “map!”.) Abbreviate and unabbreviate can be abbreviated to “ab” and “una”, respectively.

SED - A Non-Interactive Text Editor

Lee E. McMahon
Bell Laboratories
Murray Hill, New Jersey 07974

Introduction

Sed is a non-interactive context editor designed to be especially useful in three cases:

1) To edit files too large for comfortable interactive editing;
2) To edit any size file when the sequence of editing commands is too complicated to be comfortably typed in interactive mode;
3) To perform multiple ‘global’ editing functions efficiently in one pass through the input.
Since only a few lines of the input reside in core at one time, and no temporary files are used, the effective size of file that can be edited is limited only by the requirement that the input and output fit simultaneously into available secondary storage.

Complicated editing scripts can be created separately and given to sed as a command file. For complex edits, this saves considerable typing, and its attendant errors. Sed running from a command file is much more efficient than any interactive editor known to the author, even if that editor can be driven by a pre-written script.

The principal loss of functions compared to an interactive editor are lack of relative addressing (because of the line-at-a-time operation), and lack of immediate verification that a command has done what was intended.

Sed is a lineal descendant of the UNIX editor, ed. Because of the differences between interactive and non-interactive operation, considerable changes have been made between ed and sed; even confirmed users of ed will frequently be surprised (and probably chagrined), if they rashly use sed without reading Sections 2 and 3 of this document. The most striking family resemblance between the two editors is in the class of patterns (‘regular expressions’) they recognize; the code for matching patterns is copied almost verbatim from the code for ed, and the description of regular expressions in Section 2 is copied almost verbatim from the UNIX Programmer’s Manual [1]. (Both code and description were written by Dennis M. Ritchie.)

1. Overall Operation

Sed by default copies the standard input to the standard output, perhaps performing one or more editing commands on each line before writing it to the output. This behavior may be modified by flags on the command line; see Section 1.1 below.

The general format of an editing command is:

    [address1,address2][function][arguments]

One or both addresses may be omitted; the format of addresses is given in Section 2.
Any number of blanks or tabs may separate the addresses from the function. The function must be present; the available commands are discussed in Section 3. The arguments may be required or optional, according to which function is given; again, they are discussed in Section 3 under each individual function.

Tab characters and spaces at the beginning of lines are ignored.

1.1. Command-line Flags

Three flags are recognized on the command line:

-n: tells sed not to copy all lines, but only those specified by p functions or p flags after s functions (see Section 3.3);
-e: tells sed to take the next argument as an editing command;
-f: tells sed to take the next argument as a file name; the file should contain editing commands, one to a line.

1.2. Order of Application of Editing Commands

Before any editing is done (in fact, before any input file is even opened), all the editing commands are compiled into a form which will be moderately efficient during the execution phase (when the commands are actually applied to lines of the input file). The commands are compiled in the order in which they are encountered; this is generally the order in which they will be attempted at execution time. The commands are applied one at a time; the input to each command is the output of all preceding commands.

The default linear order of application of editing commands can be changed by the flow-of-control commands, t and b (see Section 3). Even when the order of application is changed by these commands, it is still true that the input line to any command is the output of any previously applied command.

1.3. Pattern-space

The range of pattern matches is called the pattern space. Ordinarily, the pattern space is one line of the input text, but more than one line can be read into the pattern space by using the N command (Section 3.4).

1.4. Examples

Examples are scattered throughout the text.
Except where otherwise noted, the examples all assume the following input text:

    In Xanadu did Kubla Khan
    A stately pleasure dome decree:
    Where Alph, the sacred river, ran
    Through caverns measureless to man
    Down to a sunless sea.

(In no case is the output of the sed commands to be considered an improvement on Coleridge.)

Example:

The command

    2q

will quit after copying the first two lines of the input. The output will be:

    In Xanadu did Kubla Khan
    A stately pleasure dome decree:

2. ADDRESSES: Selecting lines for editing

Lines in the input file(s) to which editing commands are to be applied can be selected by addresses. Addresses may be either line numbers or context addresses.

The application of a group of commands can be controlled by one address (or address-pair) by grouping the commands with curly braces (‘{ }’) (Sec. 3.6).

2.1. Line-number Addresses

A line number is a decimal integer. As each line is read from the input, a line-number counter is incremented; a line-number address matches (selects) the input line which causes the internal counter to equal the address line-number. The counter runs cumulatively through multiple input files; it is not reset when a new input file is opened.

As a special case, the character $ matches the last line of the last input file.

2.2. Context Addresses

A context address is a pattern (‘regular expression’) enclosed in slashes (‘/’). The regular expressions recognized by sed are constructed as follows:

1) An ordinary character (not one of those discussed below) is a regular expression, and matches that character.
2) A circumflex ‘^’ at the beginning of a regular expression matches the null character at the beginning of a line.
3) A dollar-sign ‘$’ at the end of a regular expression matches the null character at the end of a line.
4) The characters ‘\n’ match an imbedded newline character, but not the newline at the end of the pattern space.
5) A period ‘.’ matches any character except the terminal newline of the pattern space.
6) A regular expression followed by an asterisk ‘*’ matches any number (including 0) of adjacent occurrences of the regular expression it follows.
7) A string of characters in square brackets ‘[ ]’ matches any character in the string, and no others. If, however, the first character of the string is circumflex ‘^’, the regular expression matches any character except the characters in the string and the terminal newline of the pattern space.
8) A concatenation of regular expressions is a regular expression which matches the concatenation of strings matched by the components of the regular expression.
9) A regular expression between the sequences ‘\(’ and ‘\)’ is identical in effect to the unadorned regular expression, but has side-effects which are described under the s command below and specification 10) immediately below.
10) The expression ‘\d’ means the same string of characters matched by an expression enclosed in ‘\(’ and ‘\)’ earlier in the same pattern. Here d is a single digit; the string specified is that beginning with the dth occurrence of ‘\(’ counting from the left. For example, the expression ‘^\(.*\)\1’ matches a line beginning with two repeated occurrences of the same string.
11) The null regular expression standing alone (e.g., ‘//’) is equivalent to the last regular expression compiled.

To use one of the special characters (^ $ . * [ ] \ /) as a literal (to match an occurrence of itself in the input), precede the special character by a backslash ‘\’.

For a context address to ‘match’ the input requires that the whole pattern within the address match some portion of the pattern space.

2.3. Number of Addresses

The commands in the next section can have 0, 1, or 2 addresses. Under each command the maximum number of allowed addresses is given. For a command to have more addresses than the maximum allowed is considered an error.

If a command has no addresses, it is applied to every line in the input.

If a command has one address, it is applied to all lines which match that address.

If a command has two addresses, it is applied to the first line which matches the first address, and to all subsequent lines until (and including) the first subsequent line which matches the second address. Then an attempt is made on subsequent lines to again match the first address, and the process is repeated.

Two addresses are separated by a comma.

Examples:

    /an/            matches lines 1, 3, 4 in our sample text
    /an.*an/        matches line 1
    /^an/           matches no lines
    /./             matches all lines
    /\./            matches line 5
    /r*an/          matches lines 1, 3, 4 (number = zero!)
    /\(an\).*\1/    matches line 1

3. FUNCTIONS

All functions are named by a single character. In the following summary, the maximum number of allowable addresses is given enclosed in parentheses, then the single character function name, possible arguments enclosed in angles (< >), an expanded English translation of the single-character name, and finally a description of what each function does. The angles around the arguments are not part of the argument, and should not be typed in actual editing commands.

3.1. Whole-line Oriented Functions

(2)d -- delete lines
The d function deletes from the file (does not write to the output) all those lines matched by its address(es). It also has the side effect that no further commands are attempted on the corpse of a deleted line; as soon as the d function is executed, a new line is read from the input, and the list of editing commands is re-started from the beginning on the new line.

(2)n -- next line
The n function reads the next line from the input, replacing the current line. The current line is written to the output if it should be. The list of editing commands is continued following the n command.
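The address rules of Section 2.3 can be tried against the sample text with the d function. The following is a minimal sketch, assuming a POSIX shell and a present-day sed that still behaves as described here; the variable names (text, none, one, range) are illustrative and not part of sed.

```shell
#!/bin/sh
# Sample text from Section 1.4 of this article.
text='In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.'

# No address: d is applied to every line, so nothing reaches the output.
none=$(printf '%s\n' "$text" | sed 'd')

# One address: only line 2 is deleted.
one=$(printf '%s\n' "$text" | sed '2d')

# Two addresses: delete from the first line matching /pleasure/ up to and
# including the next line matching /man/ (lines 2 through 4 here).
range=$(printf '%s\n' "$text" | sed '/pleasure/,/man/d')

printf 'no address:\n%s\n' "$none"
printf 'one address:\n%s\n' "$one"
printf 'two addresses:\n%s\n' "$range"
```

With two addresses only the first and last lines of the sample survive, because the second address is first satisfied by ‘man’ on line 4.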
(1)a\
<text> -- append lines
The a function causes the argument <text> to be written to the output after the line matched by its address. The a command is inherently multi-line; a must appear at the end of a line, and <text> may contain any number of lines. To preserve the one-command-to-a-line fiction, the interior newlines must be hidden by a backslash character (‘\’) immediately preceding the newline. The <text> argument is terminated by the first unhidden newline (the first one not immediately preceded by backslash).

Once an a function is successfully executed, <text> will be written to the output regardless of what later commands do to the line which triggered it. The triggering line may be deleted entirely; <text> will still be written to the output.

The <text> is not scanned for address matches, and no editing commands are attempted on it. It does not cause any change in the line-number counter.

(1)i\
<text> -- insert lines
The i function behaves identically to the a function, except that <text> is written to the output before the matched line. All other comments about the a function apply to the i function as well.

(2)c\
<text> -- change lines
The c function deletes the lines selected by its address(es), and replaces them with the lines in <text>. Like a and i, c must be followed by a newline hidden by a backslash; and interior newlines in <text> must be hidden by backslashes.

The c command may have two addresses, and therefore select a range of lines. If it does, all the lines in the range are deleted, but only one copy of <text> is written to the output, not one copy per line deleted. As with a and i, <text> is not scanned for address matches, and no editing commands are attempted on it. It does not change the line-number counter.

After a line has been deleted by a c function, no further commands are attempted on the corpse.
If text is appended after a line by a or r functions, and the line is subsequently changed, the text inserted by the c function will be placed before the text of the a or r functions. (The r function is described in Section 3.3.)

Note: Within the text put in the output by these functions, leading blanks and tabs will disappear, as always in sed commands. To get leading blanks and tabs into the output, precede the first desired blank or tab by a backslash; the backslash will not appear in the output.

Example:

The list of editing commands:

    n
    a\
    XXXX
    d

applied to our standard input, produces:

    In Xanadu did Kubla Khan
    XXXX
    Where Alph, the sacred river, ran
    XXXX
    Down to a sunless sea.

In this particular case, the same effect would be produced by either of the two following command lists:

    n
    i\
    XXXX
    d

    n
    c\
    XXXX

3.2. Substitute Function

One very important function changes parts of lines selected by a context search within the line.

(2)s<pattern><replacement><flags> -- substitute
The s function replaces part of a line (selected by <pattern>) with <replacement>. It can best be read:

    Substitute for <pattern>, <replacement>

The <pattern> argument contains a pattern, exactly like the patterns in addresses (see 2.2 above). The only difference between <pattern> and a context address is that the context address must be delimited by slash (‘/’) characters; <pattern> may be delimited by any character other than space or newline.

By default, only the first string matched by <pattern> is replaced, but see the g flag below.

The <replacement> argument begins immediately after the second delimiting character of <pattern>, and must be followed immediately by another instance of the delimiting character. (Thus there are exactly three instances of the delimiting character.)

The <replacement> is not a pattern, and the characters which are special in patterns do not have special meaning in <replacement>.
Instead, other characters are special:

&   is replaced by the string matched by <pattern>

\d  (where d is a single digit) is replaced by the dth substring matched by parts of <pattern> enclosed in ‘\(’ and ‘\)’. If nested substrings occur in <pattern>, the dth is determined by counting opening delimiters (‘\(’).

As in patterns, special characters may be made literal by preceding them with backslash (‘\’).

The <flags> argument may contain the following flags:

g -- substitute <replacement> for all (non-overlapping) instances of <pattern> in the line. After a successful substitution, the scan for the next instance of <pattern> begins just after the end of the inserted characters; characters put into the line from <replacement> are not rescanned.

p -- print the line if a successful replacement was done. The p flag causes the line to be written to the output if and only if a substitution was actually made by the s function. Notice that if several s functions, each followed by a p flag, successfully substitute in the same input line, multiple copies of the line will be written to the output: one for each successful substitution.

w <filename> -- write the line to a file if a successful replacement was done. The w flag causes lines which are actually substituted by the s function to be written to a file named by <filename>. If <filename> exists before sed is run, it is overwritten; if not, it is created. A single space must separate w and <filename>. The possibilities of multiple, somewhat different copies of one input line being written are the same as for p. A maximum of 10 different file names may be mentioned after w flags and w functions (see below), combined.

Examples:

The following command, applied to our standard input,

    s/to/by/w changes

produces, on the standard output:

    In Xanadu did Kubla Khan
    A stately pleasure dome decree:
    Where Alph, the sacred river, ran
    Through caverns measureless by man
    Down by a sunless sea.
and, on the file ‘changes’:

    Through caverns measureless by man
    Down by a sunless sea.

If the nocopy option (the -n flag) is in effect, the command:

    s/[.,;?:]/*P&*/gp

produces:

    A stately pleasure dome decree*P:*
    Where Alph*P,* the sacred river*P,* ran
    Down to a sunless sea*P.*

Finally, to illustrate the effect of the g flag, the command:

    /X/s/an/AN/p

produces (assuming nocopy mode):

    In XANadu did Kubla Khan

and the command:

    /X/s/an/AN/gp

produces:

    In XANadu did Kubla KhAN

3.3. Input-output Functions

(2)p -- print
The print function writes the addressed lines to the standard output file. They are written at the time the p function is encountered, regardless of what succeeding editing commands may do to the lines.

(2)w <filename> -- write on <filename>
The write function writes the addressed lines to the file named by <filename>. If the file previously existed, it is overwritten; if not, it is created. The lines are written exactly as they exist when the write function is encountered for each line, regardless of what subsequent editing commands may do to them.

Exactly one space must separate the w and <filename>.

A maximum of ten different files may be mentioned in write functions and w flags after s functions, combined.

(1)r <filename> -- read the contents of a file
The read function reads the contents of <filename>, and appends them after the line matched by the address. The file is read and appended regardless of what subsequent editing commands do to the line which matched its address. If r and a functions are executed on the same line, the text from the a functions and the r functions is written to the output in the order that the functions are executed.

Exactly one space must separate the r and <filename>. If a file mentioned by a r function cannot be opened, it is considered a null file, not an error, and no diagnostic is given.
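The p, g and w flags of the s function (Section 3.2) can be compared side by side on the sample text. This is a minimal sketch, assuming a POSIX shell, a present-day sed, and the mktemp utility for a scratch directory (mktemp is an assumption of this sketch and is not mentioned in the article); the variable names are illustrative.

```shell
#!/bin/sh
# Sample text from Section 1.4 of this article.
text='In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.'

# -n suppresses the default copy; the p flag prints only substituted lines.
# Without g, only the first "an" on the addressed line is replaced.
first=$(printf '%s\n' "$text" | sed -n '/X/s/an/AN/p')

# With g, every non-overlapping "an" on the line is replaced.
all=$(printf '%s\n' "$text" | sed -n '/X/s/an/AN/gp')

# The w flag copies each substituted line to a file; the rest of the command
# line after "w " is taken as the file name.
tmp=$(mktemp -d) || exit 1
printf '%s\n' "$text" | sed "s/to/by/w $tmp/changes" >/dev/null
changed=$(cat "$tmp/changes")
rm -rf "$tmp"

printf '%s\n' "$first" "$all" "$changed"
```

The file named after w receives only the two lines on which ‘to’ was actually replaced, while the standard output (discarded here) still carries all five lines.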
NOTE: Since there is a limit to the number of files that can be opened simultaneously, care should be taken that no more than ten files be mentioned in w functions or flags; that number is reduced by one if any r functions are present. (Only one read file is open at one time.)

Examples

Assume that the file ‘note1’ has the following contents:

    Note: Kubla Khan (more properly Kublai Khan; 1216-1294) was the grandson and most eminent successor of Genghiz (Chingiz) Khan, and founder of the Mongol dynasty in China.

Then the following command:

    /Kubla/r note1

produces:

    In Xanadu did Kubla Khan
    Note: Kubla Khan (more properly Kublai Khan; 1216-1294) was the grandson and most eminent successor of Genghiz (Chingiz) Khan, and founder of the Mongol dynasty in China.
    A stately pleasure dome decree:
    Where Alph, the sacred river, ran
    Through caverns measureless to man
    Down to a sunless sea.

3.4. Multiple Input-line Functions

Three functions, all spelled with capital letters, deal specially with pattern spaces containing imbedded newlines; they are intended principally to provide pattern matches across lines in the input.

(2)N -- Next line
The next input line is appended to the current line in the pattern space; the two input lines are separated by an imbedded newline. Pattern matches may extend across the imbedded newline(s).

(2)D -- Delete first part of the pattern space
Delete up to and including the first newline character in the current pattern space. If the pattern space becomes empty (the only newline was the terminal newline), read another line from the input. In any case, begin the list of editing commands again from its beginning.

(2)P -- Print first part of the pattern space
Print up to and including the first newline in the pattern space.

The P and D functions are equivalent to their lower-case counterparts if there are no imbedded newlines in the pattern space.

3.5. Hold and Get Functions

Four functions save and retrieve part of the input for possible later use.

(2)h -- hold pattern space
The h function copies the contents of the pattern space into a hold area (destroying the previous contents of the hold area).

(2)H -- Hold pattern space
The H function appends the contents of the pattern space to the contents of the hold area; the former and new contents are separated by a newline.

(2)g -- get contents of hold area
The g function copies the contents of the hold area into the pattern space (destroying the previous contents of the pattern space).

(2)G -- Get contents of hold area
The G function appends the contents of the hold area to the contents of the pattern space; the former and new contents are separated by a newline.

(2)x -- exchange
The exchange command interchanges the contents of the pattern space and the hold area.

Example

The commands

    1h
    1s/ did.*//
    1x
    G
    s/\n/ :/

applied to our standard example, produce:

    In Xanadu did Kubla Khan :In Xanadu
    A stately pleasure dome decree: :In Xanadu
    Where Alph, the sacred river, ran :In Xanadu
    Through caverns measureless to man :In Xanadu
    Down to a sunless sea. :In Xanadu

3.6. Flow-of-Control Functions

These functions do no editing on the input lines, but control the application of functions to the lines selected by the address part.

(2)! -- Don't
The Don’t command causes the next command (written on the same line), to be applied to all and only those input lines not selected by the address part.

(2){ -- Grouping
The grouping command ‘{’ causes the next set of commands to be applied (or not applied) as a block to the input lines selected by the addresses of the grouping command. The first of the commands under control of the grouping may appear on the same line as the ‘{’ or on the following line.

The group of commands is terminated by a matching ‘}’ standing on a line by itself.

Groups can be nested.

(0):<label> -- place a label
The label function marks a place in the list of editing commands which may be referred to by b and t functions.
The <label> may be any sequence of eight or fewer characters; if two different colon functions have identical labels, a compile time diagnostic will be generated, and no execution attempted.

(2)b<label> -- branch to label
The branch function causes the sequence of editing commands being applied to the current input line to be restarted immediately after the place where a colon function with the same <label> was encountered. If no colon function with the same label can be found after all the editing commands have been compiled, a compile time diagnostic is produced, and no execution is attempted.

A b function with no <label> is taken to be a branch to the end of the list of editing commands; whatever should be done with the current input line is done, and another input line is read; the list of editing commands is restarted from the beginning on the new line.

(2)t<label> -- test substitutions
The t function tests whether any successful substitutions have been made on the current input line; if so, it branches to <label>; if not, it does nothing. The flag which indicates that a successful substitution has been executed is reset by:

1) reading a new input line, or
2) executing a t function.

3.7. Miscellaneous Functions

(1)= -- equals
The = function writes to the standard output the line number of the line matched by its address.

(1)q -- quit
The q function causes the current line to be written to the output (if it should be), any appended or read text to be written, and execution to be terminated.

Reference

[1] Ken Thompson and Dennis M. Ritchie, The UNIX Programmer’s Manual. Bell Laboratories, 1978.

PART 4: COMMAND INTERPRETERS

A shell is a command interpreter, an interface between a user and the operating system. The ULTRIX-32 system provides two shells: the Bourne Shell (the UNIX System 7 shell) and the C Shell (the Berkeley shell).
Each shell allows users to communicate with the ULTRIX-32 system to call editors, compilers, and other utilities, and to manipulate files. Figure 1-1 shows how the shells relate to the ULTRIX-32 system utilities.

[Figure 1-1 Shells in the ULTRIX-32 System. Labels in the figure: Bourne Shell, C Shell, Utilities, Program Development Tools, File Manipulation Tools, Communication Tools, System Administration Tools, Text Formatters, Compilers, Editors, Mail.]

When you use a shell interactively, it serves as a command language; when you write and execute a sequence of shell commands, the shell serves as a programming language. Both shells offer features for flow control, parameter substitution, shell variables, fault trapping, and debugging.

The Bourne Shell was written first. The C Shell was developed to provide additional interactive features. It is called the C Shell because its command language, syntax, and control flow are similar to the C programming language. The two shells are, in general, not compatible; programs written for the Bourne Shell will not run on the C Shell without alteration. You can set up your login file to permanently establish one of these shells as your default shell.

This part includes an article describing each shell. If you choose to use the C Shell, you will find both articles useful. If you use the Bourne Shell, skip the “Introduction to the C Shell.”

The first article, “An Introduction to the UNIX Shell,” by S. R. Bourne, explains the Bourne Shell concepts, commands, and command formats, and it demonstrates all major features with examples and explanations. The two appendixes at the end of the article make a handy reference: “Grammar” and “Metacharacters and Reserved Words.”

The “Introduction to the C Shell,” by William Joy, is more expansive in its examples and explanations than the Bourne article, and it concentrates more on interactive use of the shell.
The article documents all features unique to the C Shell, including history, aliases, argument expansion, C language-type arithmetic operations, and job control. A handy glossary at the end of the article defines C Shell commands and concepts.

As you read these articles, refer to the ULTRIX-32 Programmers Manual, Binder 1. It gives detailed specifications for each command. The shell articles in this volume provide a background for those specifications. Bourne and Joy show how to coordinate the commands to produce useful results.

An Introduction to the UNIX Shell

S. R. Bourne
Bell Laboratories
Murray Hill, New Jersey 07974

1.0 Introduction

The shell is both a command language and a programming language that provides an interface to the UNIX operating system. This memorandum describes, with examples, the UNIX shell. The first section covers most of the everyday requirements of terminal users. Some familiarity with UNIX is an advantage when reading this section; see, for example, “UNIX for beginners” [1]. Section 2 describes those features of the shell primarily intended for use within shell procedures. These include the control-flow primitives and string-valued variables provided by the shell. A knowledge of a programming language would be a help when reading this section. The last section describes the more advanced features of the shell. References of the form “see pipe (2)” are to a section of the UNIX manual [2].

1.1 Simple commands

Simple commands consist of one or more words separated by blanks. The first word is the name of the command to be executed; any remaining words are passed as arguments to the command. For example,

    who

is a command that prints the names of users logged in. The command

    ls -l

prints a list of files in the current directory. The argument -l tells ls to print status information, size and the creation date for each file.
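The effect of an argument on a simple command can be seen by running ls with and without -l on a small directory. This is a minimal sketch, assuming a POSIX shell and the mktemp and touch utilities, neither of which is mentioned in this section; the directory and the file names alpha and beta are made up for the illustration.

```shell
#!/bin/sh
# Build a scratch directory holding two empty files.
dir=$(mktemp -d) || exit 1
touch "$dir/alpha" "$dir/beta"

# One word after the command name: the directory to list.
names=$(ls "$dir")

# Two words after the command name: -l asks ls for the long form.
long=$(ls -l "$dir")

printf 'ls:\n%s\n' "$names"
printf 'ls -l:\n%s\n' "$long"
rm -rf "$dir"
```

The exact layout of the long listing differs from system to system, so only the plain list of names is predictable here.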
1.2 Background commands

To execute a command the shell normally creates a new process and waits for it to finish. A command may be run without waiting for it to finish. For example,

    cc pgm.c &

calls the C compiler to compile the file pgm.c. The trailing & is an operator that instructs the shell not to wait for the command to finish. To help keep track of such a process the shell reports its process number following its creation. A list of currently active processes may be obtained using the ps command.

1.3 Input output redirection

Most commands produce output on the standard output that is initially connected to the terminal. This output may be sent to a file by writing, for example,

    ls -l >file

The notation >file is interpreted by the shell and is not passed as an argument to ls. If file does not exist then the shell creates it; otherwise the original contents of file are replaced with the output from ls. Output may be appended to a file using the notation

    ls -l >>file

In this case file is also created if it does not already exist.

The standard input of a command may be taken from a file instead of the terminal by writing, for example,

    wc <file

The command wc reads its standard input (in this case redirected from file) and prints the number of characters, words and lines found. If only the number of lines is required then

    wc -l <file

could be used.

1.4 Pipelines and filters

The standard output of one command may be connected to the standard input of another by writing the ‘pipe’ operator, indicated by |, as in,

    ls -l | wc

Two commands connected in this way constitute a pipeline and the overall effect is the same as

    ls -l >file; wc <file

except that no file is used. Instead the two processes are connected by a pipe (see pipe (2)) and are run in parallel.
Pipes are unidirectional and synchronization is achieved by halting wc when there is nothing to read and halting ls when the pipe is full.

A filter is a command that reads its standard input, transforms it in some way, and prints the result as output. One such filter, grep, selects from its input those lines that contain some specified string. For example,

    ls | grep old

prints those lines, if any, of the output from ls that contain the string old. Another useful filter is sort. For example,

    who | sort

will print an alphabetically sorted list of logged in users.

A pipeline may consist of more than two commands, for example,

    ls | grep old | wc -l

prints the number of file names in the current directory containing the string old.

1.5 File name generation

Many commands accept arguments which are file names. For example,

    ls -l main.c

prints information relating to the file main.c.

The shell provides a mechanism for generating a list of file names that match a pattern. For example,

    ls -l *.c

generates, as arguments to ls, all file names in the current directory that end in .c. The character * is a pattern that will match any string including the null string. In general patterns are specified as follows.

    *       Matches any string of characters including the null string.
    ?       Matches any single character.
    [...]   Matches any one of the characters enclosed. A pair of characters separated by a minus will match any character lexically between the pair.

For example,

    [a-z]*

matches all names in the current directory beginning with one of the letters a through z.

    /usr/fred/test/?

matches all names in the directory /usr/fred/test that consist of a single character. If no file name is found that matches the pattern then the pattern is passed, unchanged, as an argument.

This mechanism is useful both to save typing and to select names according to some pattern. It may also be used to find files.
For example,

    echo /usr/fred/*/core

finds and prints the names of all core files in sub-directories of /usr/fred. (echo is a standard UNIX command that prints its arguments, separated by blanks.) This last feature can be expensive, requiring a scan of all sub-directories of /usr/fred.

There is one exception to the general rules given for patterns. The character ‘.’ at the start of a file name must be explicitly matched.

    echo *

will therefore echo all file names in the current directory not beginning with ‘.’.

    echo .*

will echo all those file names that begin with ‘.’. This avoids inadvertent matching of the names ‘.’ and ‘..’ which mean ‘the current directory’ and ‘the parent directory’ respectively. (Notice that ls suppresses information for the files ‘.’ and ‘..’.)

1.6 Quoting

Characters that have a special meaning to the shell, such as < > * ? | &, are called metacharacters. A complete list of metacharacters is given in appendix B. Any character preceded by a \ is quoted and loses its special meaning, if any. The \ is elided so that

    echo \?

will echo a single ?, and

    echo \\

will echo a single \. To allow long strings to be continued over more than one line the sequence \newline is ignored.

\ is convenient for quoting single characters. When more than one character needs quoting the above mechanism is clumsy and error prone. A string of characters may be quoted by enclosing the string between single quotes. For example,

    echo xx'****'xx

will echo

    xx****xx

The quoted string may not contain a single quote but may contain newlines, which are preserved. This quoting mechanism is the most simple and is recommended for casual use.

A third quoting mechanism using double quotes is also available that prevents interpretation of some but not all metacharacters. Discussion of the details is deferred to section 3.4.

1.7 Prompting

When the shell is used from a terminal it will issue a prompt before reading a command.
By default this prompt is '$ '. It may be changed by saying, for example,

     PS1=yesdear

that sets the prompt to be the string yesdear. If a newline is typed and further input is needed then the shell will issue the prompt '> '. Sometimes this can be caused by mistyping a quote mark. If it is unexpected then an interrupt (DEL) will return the shell to read another command. This prompt may be changed by saying, for example,

     PS2=more

1.8 The shell and login

Following login (1) the shell is called to read and execute commands typed at the terminal. If the user's login directory contains the file .profile then it is assumed to contain commands and is read by the shell before reading any commands from the terminal.

1.9 Summary

     • ls
       Print the names of files in the current directory.
     • ls >file
       Put the output from ls into file.
     • ls | wc -l
       Print the number of files in the current directory.
     • ls | grep old
       Print those file names containing the string old.
     • ls | grep old | wc -l
       Print the number of files whose name contains the string old.
     • cc pgm.c &
       Run cc in the background.

2.0 Shell procedures

The shell may be used to read and execute commands contained in a file. For example,

     sh file [ args ... ]

calls the shell to read commands from file. Such a file is called a command procedure or shell procedure. Arguments may be supplied with the call and are referred to in file using the positional parameters $1, $2, .... For example, if the file wg contains

     who | grep $1

then

     sh wg fred

is equivalent to

     who | grep fred

UNIX files have three independent attributes, read, write and execute. The UNIX command chmod (1) may be used to make a file executable. For example,

     chmod +x wg

will ensure that the file wg has execute status. Following this, the command

     wg fred

is equivalent to

     sh wg fred

This allows shell procedures and programs to be used interchangeably. In either case a new process is created to run the command.
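The two ways of calling a shell procedure can be tried with a small sketch. The procedure name greet and the scratch directory are hypothetical, echo is used in place of who so that the result does not depend on who is logged in, and mktemp is a later utility assumed to be available rather than part of the system described here.

```shell
# Sketch only: "greet" and the scratch directory are hypothetical names.
dir=`mktemp -d`

# The procedure refers to its first argument as $1.
cat >"$dir/greet" <<'EOF'
echo hello $1
EOF

# Run it by naming the file as an argument to sh ...
out1=`sh "$dir/greet" fred`

# ... or mark it executable and call it like any other command.
chmod +x "$dir/greet"
out2=`"$dir/greet" fred`

echo "$out1"
echo "$out2"
rm -r "$dir"
```

Both calls should produce the same line, since in either case a new shell process reads the commands in the file.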
As well as providing names for the positional parameters, the number of positional parameters in the call is available as $#. The name of the file being executed is available as $0.

A special shell parameter $* is used to substitute for all positional parameters except $0. A typical use of this is to provide some default arguments, as in,

     nroff -T450 -ms $*

which simply prepends some arguments to those already given.

2.1 Control flow - for

A frequent use of shell procedures is to loop through the arguments ($1, $2, ...) executing commands once for each argument. An example of such a procedure is tel that searches the file /usr/lib/telnos that contains lines of the form

     fred mh0123
     bert mh0789

The text of tel is

     for i
     do grep $i /usr/lib/telnos; done

The command

     tel fred

prints those lines in /usr/lib/telnos that contain the string fred.

     tel fred bert

prints those lines containing fred followed by those for bert.

The for loop notation is recognized by the shell and has the general form

     for name in w1 w2 ...
     do command-list
     done

A command-list is a sequence of one or more simple commands separated or terminated by a newline or semicolon. Furthermore, reserved words like do and done are only recognized following a newline or semicolon. name is a shell variable that is set to the words w1 w2 ... in turn each time the command-list following do is executed. If in w1 w2 ... is omitted then the loop is executed once for each positional parameter; that is, in $* is assumed.

Another example of the use of the for loop is the create command whose text is

     for i do >$i; done

The command

     create alpha beta

ensures that two files alpha and beta exist and are empty. The notation >file may be used on its own to create or clear the contents of a file. Notice also that a semicolon (or newline) is required before done.

2.2 Control flow - case

A multiple way branch is provided for by the case notation.
For example,

     case $# in
          1) cat >>$1 ;;
          2) cat >>$2 <$1 ;;
          *) echo 'usage: append [ from ] to' ;;
     esac

is an append command. When called with one argument as

     append file

$# is the string 1 and the standard input is copied onto the end of file using the cat command.

     append file1 file2

appends the contents of file1 onto file2. If the number of arguments supplied to append is other than 1 or 2 then a message is printed indicating proper usage.

The general form of the case command is

     case word in
          pattern) command-list ;;
          ...
     esac

The shell attempts to match word with each pattern, in the order in which the patterns appear. If a match is found the associated command-list is executed and execution of the case is complete. Since * is the pattern that matches any string it can be used for the default case.

A word of caution: no check is made to ensure that only one pattern matches the case argument. The first match found defines the set of commands to be executed. In the example below the commands following the second * will never be executed.

     case $# in
          *) ... ;;
          *) ... ;;
     esac

Another example of the use of the case construction is to distinguish between different forms of an argument. The following example is a fragment of a cc command.

     for i
     do case $i in
             -[ocs]) ... ;;
             -*) echo "unknown flag $i" ;;
             *.c) /lib/c0 $i ... ;;
             *) echo "unexpected argument $i" ;;
        esac
     done

To allow the same commands to be associated with more than one pattern the case command provides for alternative patterns separated by a |. For example,

     case $i in
          -x|-y) ...
     esac

is equivalent to

     case $i in
          -[xy]) ...
     esac

The usual quoting conventions apply so that

     case $i in
          \?) ...

will match the character ?.

2.3 Here documents

The shell procedure tel in section 2.1 uses the file /usr/lib/telnos to supply the data for grep. An alternative is to include this data within the shell procedure as a here document, as in,

     for i
     do grep $i <<!
     fred mh0123
     bert mh0789
     !
     done

In this example the shell takes the lines between <<! and ! as the standard input for grep. The string ! is arbitrary, the document being terminated by a line that consists of the string following <<.

Parameters are substituted in the document before it is made available to grep as illustrated by the following procedure called edg.

     ed $3 <<%
     g/$1/s//$2/g
     w
     %

The call

     edg string1 string2 file

is then equivalent to the command

     ed file <<%
     g/string1/s//string2/g
     w
     %

and changes all occurrences of string1 in file to string2. Substitution can be prevented using \ to quote the special character $ as in

     ed $3 <<+
     1,\$s/$1/$2/g
     w
     +

(This version of edg is equivalent to the first except that ed will print a ? if there are no occurrences of the string $1.) Substitution within a here document may be prevented entirely by quoting the terminating string, for example,

     grep $i <<\#
     ...
     #

The document is presented without modification to grep. If parameter substitution is not required in a here document this latter form is more efficient.

2.4 Shell variables

The shell provides string-valued variables. Variable names begin with a letter and consist of letters, digits and underscores. Variables may be given values by writing, for example,

     user=fred box=m000 acct=mh0000

which assigns values to the variables user, box and acct. A variable may be set to the null string by saying, for example,

     null=

The value of a variable is substituted by preceding its name with $; for example,

     echo $user

will echo fred.

Variables may be used interactively to provide abbreviations for frequently used strings. For example,

     b=/usr/fred/bin
     mv pgm $b

will move the file pgm from the current directory to the directory /usr/fred/bin.
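A short sketch of assignment and substitution, using the variable names from the text. The variable moved is an assumption added for illustration, and echo stands in for mv so that no file is actually moved.

```shell
# Sketch only: echo stands in for mv so nothing is actually moved.
user=fred box=m000 acct=mh0000   # several assignments on one line
null=                            # the null string
echo $user                       # the value of user is substituted

b=/usr/fred/bin
moved=`echo mv pgm $b`           # $b is replaced before echo runs
echo "$moved"
```

The last line shows the command as the shell would see it after substitution, which is a convenient way to check an abbreviation before relying on it.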
A more general notation is available for parameter (or variable) substitution, as in,

     echo ${user}

which is equivalent to

     echo $user

and is used when the parameter name is followed by a letter or digit. For example,

     tmp=/tmp/ps
     ps a >${tmp}a

will direct the output of ps to the file /tmp/psa, whereas,

     ps a >$tmpa

would cause the value of the variable tmpa to be substituted.

Except for $? the following are set initially by the shell. $? is set after executing each command.

     $?   The exit status (return code) of the last command executed as a decimal string. Most commands return a zero exit status if they complete successfully, otherwise a non-zero exit status is returned. Testing the value of return codes is dealt with later under if and while commands.

     $#   The number of positional parameters (in decimal). Used, for example, in the append command to check the number of parameters.

     $$   The process number of this shell (in decimal). Since process numbers are unique among all existing processes, this string is frequently used to generate unique temporary file names. For example,

               ps a >/tmp/ps$$
               ...
               rm /tmp/ps$$

     $!   The process number of the last process run in the background (in decimal).

     $-   The current shell flags, such as -x and -v.

Some variables have a special meaning to the shell and should be avoided for general use.

     $MAIL   When used interactively the shell looks at the file specified by this variable before it issues a prompt. If the specified file has been modified since it was last looked at the shell prints the message you have mail before prompting for the next command. This variable is typically set in the file .profile, in the user's login directory. For example,

               MAIL=/usr/mail/fred

     $HOME   The default argument for the cd command. The current directory is used to resolve file name references that do not begin with a /, and is changed using the cd command. For example,

               cd /usr/fred/bin

             makes the current directory /usr/fred/bin.
               cat wn

             will print on the terminal the file wn in this directory. The command cd with no argument is equivalent to

               cd $HOME

             This variable is also typically set in the user's login profile.

     $PATH   A list of directories that contain commands (the search path). Each time a command is executed by the shell a list of directories is searched for an executable file. If $PATH is not set then the current directory, /bin, and /usr/bin are searched by default. Otherwise $PATH consists of directory names separated by :. For example,

               PATH=:/usr/fred/bin:/bin:/usr/bin

             specifies that the current directory (the null string before the first :), /usr/fred/bin, /bin and /usr/bin are to be searched in that order. In this way individual users can have their own 'private' commands that are accessible independently of the current directory. If the command name contains a / then this directory search is not used; a single attempt is made to execute the command.

     $PS1    The primary shell prompt string, by default, '$ '.

     $PS2    The shell prompt when further input is needed, by default, '> '.

     $IFS    The set of characters used by blank interpretation (see section 3.4).

2.5 The test command

The test command, although not part of the shell, is intended for use by shell programs. For example,

     test -f file

returns zero exit status if file exists and non-zero exit status otherwise. In general test evaluates a predicate and returns the result as its exit status. Some of the more frequently used test arguments are given here, see test (1) for a complete specification.

     test s          true if the argument s is not the null string
     test -f file    true if file exists
     test -r file    true if file is readable
     test -w file    true if file is writable
     test -d file    true if file is a directory

2.6 Control flow - while

The actions of the for loop and the case branch are determined by data available to the shell.
A while or until loop and an if then else branch are also provided whose actions are determined by the exit status returned by commands. A while loop has the general form

     while command-list1
     do command-list2
     done

The value tested by the while command is the exit status of the last simple command following while. Each time round the loop command-list1 is executed; if a zero exit status is returned then command-list2 is executed; otherwise, the loop terminates. For example,

     while test $1
     do ...
        shift
     done

is equivalent to

     for i
     do ...
     done

shift is a shell command that renames the positional parameters $2, $3, ... as $1, $2, ... and loses $1.

Another kind of use for the while/until loop is to wait until some external event occurs and then run some commands. In an until loop the termination condition is reversed. For example,

     until test -f file
     do sleep 300; done
     commands

will loop until file exists. Each time round the loop it waits for 5 minutes before trying again. (Presumably another process will eventually create the file.)

2.7 Control flow - if

Also available is a general conditional branch of the form,

     if command-list
     then command-list
     else command-list
     fi

that tests the value returned by the last simple command following if.

The if command may be used in conjunction with the test command to test for the existence of a file as in

     if test -f file
     then process file
     else do something else
     fi

An example of the use of if, case and for constructions is given in section 2.10.

A multiple test if command of the form

     if ...
     then ...
     else if ...
          then ...
          else if ...
               ...
               fi
          fi
     fi

may be written using an extension of the if notation as,

     if ...
     then ...
     elif ...
     then ...
     elif ...
     ...
     fi

The following example is the touch command which changes the 'last modified' time for a list of files. The command may be used in conjunction with make (1) to force recompilation of a list of files.
     flag=
     for i
     do case $i in
             -c) flag=N ;;
             *)  if test -f $i
                 then ln $i junk$$; rm junk$$
                 elif test $flag
                 then echo file \'$i\' does not exist
                 else >$i
                 fi
        esac
     done

The -c flag is used in this command to force subsequent files to be created if they do not already exist. Otherwise, if the file does not exist, an error message is printed. The shell variable flag is set to some non-null string if the -c argument is encountered. The commands

     ln ...; rm ...

make a link to the file and then remove it thus causing the last modified date to be updated.

The sequence

     if command1
     then command2
     fi

may be written

     command1 && command2

Conversely,

     command1 || command2

executes command2 only if command1 fails. In each case the value returned is that of the last simple command executed.

2.8 Command grouping

Commands may be grouped in two ways,

     { command-list ; }

and

     ( command-list )

In the first command-list is simply executed. The second form executes command-list as a separate process. For example,

     (cd x; rm junk )

executes rm junk in the directory x without changing the current directory of the invoking shell.

The commands

     cd x; rm junk

have the same effect but leave the invoking shell in the directory x.

2.9 Debugging shell procedures

The shell provides two tracing mechanisms to help when debugging shell procedures. The first is invoked within the procedure as

     set -v

(v for verbose) and causes lines of the procedure to be printed as they are read. It is useful to help isolate syntax errors. It may be invoked without modifying the procedure by saying

     sh -v proc

where proc is the name of the shell procedure. This flag may be used in conjunction with the -n flag which prevents execution of subsequent commands. (Note that saying set -n at a terminal will render the terminal useless until an end-of-file is typed.)

The command

     set -x

will produce an execution trace.
Following parameter substitution each command is printed as it is executed. (Try these at the terminal to see what effect they have.) Both flags may be turned off by saying

     set -

and the current setting of the shell flags is available as $-.

2.10 The man command

The following is the man command which is used to print sections of the UNIX manual. It is called, for example, as

     man sh
     man -t ed
     man 2 fork

In the first the manual section for sh is printed. Since no section is specified, section 1 is used. The second example will typeset (-t option) the manual section for ed. The last prints the fork manual page from section 2.

     cd /usr/man

     : 'colon is the comment command'
     : 'default is nroff ($N), section 1 ($s)'
     N=n s=1

     for i
     do case $i in
             [1-9]*) s=$i ;;
             -t) N=t ;;
             -n) N=n ;;
             -*) echo unknown flag \'$i\' ;;
             *)  if test -f man$s/$i.$s
                 then ${N}roff man0/${N}aa man$s/$i.$s
                 else : 'look through all manual sections'
                      found=no
                      for j in 1 2 3 4 5 6 7 8 9
                      do if test -f man$j/$i.$j
                         then man $j $i
                              found=yes
                         fi
                      done
                      case $found in
                           no) echo "$i: manual page not found"
                      esac
                 fi
        esac
     done

     Figure 1. A version of the man command

3.0 Keyword parameters

Shell variables may be given values by assignment or when a shell procedure is invoked. An argument to a shell procedure of the form name=value that precedes the command name causes value to be assigned to name before execution of the procedure begins. The value of name in the invoking shell is not affected. For example,

     user=fred command

will execute command with user set to fred. The -k flag causes arguments of the form name=value to be interpreted in this way anywhere in the argument list. Such names are sometimes called keyword parameters. If any arguments remain they are available as positional parameters $1, $2, ....

The set command may also be used to set positional parameters from within a procedure.
For example,

     set - *

will set $1 to the first file name in the current directory, $2 to the next, and so on. Note that the first argument, -, ensures correct treatment when the first file name begins with a -.

3.1 Parameter transmission

When a shell procedure is invoked both positional and keyword parameters may be supplied with the call. Keyword parameters are also made available implicitly to a shell procedure by specifying in advance that such parameters are to be exported. For example,

     export user box

marks the variables user and box for export. When a shell procedure is invoked copies are made of all exportable variables for use within the invoked procedure. Modification of such variables within the procedure does not affect the values in the invoking shell. It is generally true of a shell procedure that it may not modify the state of its caller without explicit request on the part of the caller. (Shared file descriptors are an exception to this rule.)

Names whose value is intended to remain constant may be declared readonly. The form of this command is the same as that of the export command,

     readonly name ...

Subsequent attempts to set readonly variables are illegal.

3.2 Parameter substitution

If a shell parameter is not set then the null string is substituted for it. For example, if the variable d is not set

     echo $d

or

     echo ${d}

will echo nothing. A default string may be given as in

     echo ${d-.}

which will echo the value of the variable d if it is set and '.' otherwise. The default string is evaluated using the usual quoting conventions so that

     echo ${d-'*'}

will echo * if the variable d is not set. Similarly

     echo ${d-$1}

will echo the value of d if it is set and the value (if any) of $1 otherwise. A variable may be assigned a default value using the notation

     echo ${d=.}

which substitutes the same string as

     echo ${d-.}

and if d were not previously set then it will be set to the string '.'.
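The difference between the - and = forms can be seen in a small sketch. The variable names dflt and star are assumptions added for illustration, and unset is a later built-in assumed here only to guarantee that d starts out not set.

```shell
# Sketch only: dflt and star are hypothetical names.
unset d                  # make sure d starts out not set
dflt=${d-.}              # d not set: the default . is used, d stays unset
star=${d-'*'}            # the default is quoted, giving a literal *
: ${d=.}                 # d not set: . is substituted and also assigned to d
echo $dflt "$star" $d
```

After the line beginning with a colon, d holds the default, so later uses of $d no longer need the ${d-.} form.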
(The notation ${...=...} is not available for positional parameters.)

If there is no sensible default then the notation

     echo ${d?message}

will echo the value of the variable d if it has one, otherwise message is printed by the shell and execution of the shell procedure is abandoned. If message is absent then a standard message is printed. A shell procedure that requires some parameters to be set might start as follows.

     : ${user?} ${acct?} ${bin?}
     ...

Colon (:) is a command that is built in to the shell and does nothing once its arguments have been evaluated. If any of the variables user, acct or bin are not set then the shell will abandon execution of the procedure.

3.3 Command substitution

The standard output from a command can be substituted in a similar way to parameters. The command pwd prints on its standard output the name of the current directory. For example, if the current directory is /usr/fred/bin then the command

     d=`pwd`

is equivalent to

     d=/usr/fred/bin

The entire string between grave accents (`...`) is taken as the command to be executed and is replaced with the output from the command. The command is written using the usual quoting conventions except that a ` must be escaped using a \. For example,

     ls `echo "$1"`

is equivalent to

     ls $1

Command substitution occurs in all contexts where parameter substitution occurs (including here documents) and the treatment of the resulting text is the same in both cases. This mechanism allows string processing commands to be used within shell procedures. An example of such a command is basename which removes a specified suffix from a string. For example,

     basename main.c .c

will print the string main. Its use is illustrated by the following fragment from a cc command.

     case $A in
          ...
          *.c) B=`basename $A .c`
          ...
     esac

that sets B to the part of $A with the suffix .c stripped.

Here are some composite examples.

     for i in `ls -t`; do ...
The variable i is set to the names of files in time order, most recent first.

     set `date`; echo $6 $2 $3, $4

will print, e.g.,

     1977 Nov 1, 23:59:59

3.4 Evaluation and quoting

The shell is a macro processor that provides parameter substitution, command substitution and file name generation for the arguments to commands. This section discusses the order in which these evaluations occur and the effects of the various quoting mechanisms.

Commands are parsed initially according to the grammar given in appendix A. Before a command is executed the following substitutions occur.

     parameter substitution, e.g. $user
     command substitution, e.g. `pwd`
          Only one evaluation occurs so that if, for example, the value of the variable X is the string $y then

               echo $X

          will echo $y.

     blank interpretation
          Following the above substitutions the resulting characters are broken into non-blank words (blank interpretation). For this purpose 'blanks' are the characters of the string $IFS. By default, this string consists of blank, tab and newline. The null string is not regarded as a word unless it is quoted. For example,

               echo ''

          will pass on the null string as the first argument to echo, whereas

               echo $null

          will call echo with no arguments if the variable null is not set or set to the null string.

     file name generation
          Each word is then scanned for the file pattern characters *, ? and [...] and an alphabetical list of file names is generated to replace the word. Each such file name is a separate argument.

The evaluations just described also occur in the list of words associated with a for loop. Only substitution occurs in the word used for a case branch.

As well as the quoting mechanisms described earlier using \ and '...' a third quoting mechanism is provided using double quotes. Within double quotes parameter and command substitution occurs but file name generation and the interpretation of blanks does not.
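As a rough illustration of the last point, the sketch below counts the arguments produced by the same value with and without double quotes. The variable names unquoted and quoted are assumptions, and set -- is the later spelling of the set - form used in this document.

```shell
# Sketch only: counts arguments via $# after resetting the positional parameters.
x='a   b'              # a value containing several blanks
set -- $x              # unquoted: blank interpretation splits the value into words
unquoted=$#
set -- "$x"            # double quoted: substituted, but passed on as one argument
quoted=$#
echo $unquoted $quoted
```

The first count is 2 and the second is 1, since the blanks inside the double quotes are not used to separate words.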
The following characters have a special meaning within double quotes and may be quoted using \.

     $    parameter substitution
     `    command substitution
     "    ends the quoted string
     \    quotes the special characters $ ` " \

For example,

     echo "$x"

will pass the value of the variable x as a single argument to echo. Similarly,

     echo "$*"

will pass the positional parameters as a single argument and is equivalent to

     echo "$1 $2 ..."

The notation $@ is the same as $* except when it is quoted.

     echo "$@"

will pass the positional parameters, unevaluated, to echo and is equivalent to

     echo "$1" "$2" ...

The following table gives, for each quoting mechanism, the shell metacharacters that are evaluated.

                    metacharacter
               \    $    *    `    "    '
          '    n    n    n    n    n    t
          `    y    n    n    t    n    n
          "    y    y    n    y    t    n

          t    terminator
          y    interpreted
          n    not interpreted

     Figure 2. Quoting mechanisms

In cases where more than one evaluation of a string is required the built-in command eval may be used. For example, if the variable X has the value $y, and if y has the value pqr then

     eval echo $X

will echo the string pqr.

In general the eval command evaluates its arguments (as do all commands) and treats the result as input to the shell. The input is read and the resulting command(s) executed. For example,

     wg='eval who|grep'
     $wg fred

is equivalent to

     who|grep fred

In this example, eval is required since there is no interpretation of metacharacters, such as |, following substitution.

3.5 Error handling

The treatment of errors detected by the shell depends on the type of error and on whether the shell is being used interactively. An interactive shell is one whose input and output are connected to a terminal (as determined by gtty (2)). A shell invoked with the -i flag is also interactive.

Execution of a command (see also 3.7) may fail for any of the following reasons.

     • Input output redirection may fail. For example, if a file does not exist or cannot be created.
     • The command itself does not exist or cannot be executed.
     • The command terminates abnormally, for example, with a "bus error" or "memory fault". See Figure 3 below for a complete list of UNIX signals.
     • The command terminates normally but returns a non-zero exit status.

In all of these cases the shell will go on to execute the next command. Except for the last case an error message will be printed by the shell. All remaining errors cause the shell to exit from a command procedure. An interactive shell will return to read another command from the terminal. Such errors include the following.

     • Syntax errors. e.g., if ... then ... done
     • A signal such as interrupt. The shell waits for the current command, if any, to finish execution and then either exits or returns to the terminal.
     • Failure of any of the built-in commands such as cd.

The shell flag -e causes the shell to terminate if any error is detected.

      1    hangup
      2    interrupt
      3*   quit
      4*   illegal instruction
      5*   trace trap
      6*   IOT instruction
      7*   EMT instruction
      8*   floating point exception
      9    kill (cannot be caught or ignored)
     10*   bus error
     11*   segmentation violation
     12*   bad argument to system call
     13    write on a pipe with no one to read it
     14    alarm clock
     15    software termination (from kill (1))

     Figure 3. UNIX signals

Those signals marked with an asterisk produce a core dump if not caught. However, the shell itself ignores quit which is the only external signal that can cause a dump. The signals in this list of potential interest to shell programs are 1, 2, 3, 14 and 15.

3.6 Fault handling

Shell procedures normally terminate when an interrupt is received from the terminal. The trap command is used if some cleaning up is required, such as removing temporary files.
For example,

     trap 'rm /tmp/ps$$; exit' 2

sets a trap for signal 2 (terminal interrupt), and if this signal is received will execute the commands

     rm /tmp/ps$$; exit

exit is another built-in command that terminates execution of a shell procedure. The exit is required; otherwise, after the trap has been taken, the shell will resume executing the procedure at the place where it was interrupted.

UNIX signals can be handled in one of three ways. They can be ignored, in which case the signal is never sent to the process. They can be caught, in which case the process must decide what action to take when the signal is received. Lastly, they can be left to cause termination of the process without it having to take any further action. If a signal is being ignored on entry to the shell procedure, for example, by invoking it in the background (see 3.7) then trap commands (and the signal) are ignored.

The use of trap is illustrated by this modified version of the touch command (Figure 4). The cleanup action is to remove the file junk$$.

     flag=
     trap 'rm -f junk$$; exit' 1 2 3 15
     for i
     do case $i in
             -c) flag=N ;;
             *)  if test -f $i
                 then ln $i junk$$; rm junk$$
                 elif test $flag
                 then echo file \'$i\' does not exist
                 else >$i
                 fi
        esac
     done

     Figure 4. The touch command

The trap command appears before the creation of the temporary file; otherwise it would be possible for the process to die without removing the file.

Since there is no signal 0 in UNIX it is used by the shell to indicate the commands to be executed on exit from the shell procedure.

A procedure may, itself, elect to ignore signals by specifying the null string as the argument to trap. The following fragment is taken from the nohup command.

     trap '' 1 2 3 15

which causes hangup, interrupt, quit and kill to be ignored both by the procedure and by invoked commands.
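The cleanup idiom described above, including the use of signal 0, can be sketched as follows. The file name /tmp/trapdemo$$ is hypothetical, and the parentheses are used only so that the exit from the procedure can be observed from the enclosing shell.

```shell
# Sketch only: /tmp/trapdemo$$ is a hypothetical temporary file.
tmp=/tmp/trapdemo$$
(
     trap 'rm -f $tmp' 0         # signal 0: commands to run on exit from this shell
     trap 'exit 1' 1 2 3 15      # on these signals just exit; the trap for 0 then cleans up
     echo scratch >$tmp          # the traps are set before the file is created
     test -f $tmp && echo "inside: $tmp exists"
)
test -f $tmp || echo "after: $tmp has been removed"
```

Putting the removal in the trap for 0 means the same cleanup runs whether the procedure finishes normally or is interrupted.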
Traps may be reset by saying

     trap 2 3

which resets the traps for signals 2 and 3 to their default values.

The procedure scan (Figure 5) is an example of the use of trap where there is no exit in the trap command. scan takes each directory in the current directory, prompts with its name, and then executes commands typed at the terminal until an end of file or an interrupt is received. Interrupts are ignored while executing the requested commands but cause termination when scan is waiting for input.

     d=`pwd`
     for i in *
     do if test -d $d/$i
        then cd $d/$i
             while echo "$i:"
                   trap exit 2
                   read x
             do trap : 2; eval $x; done
        fi
     done

     Figure 5. The scan command

read x is a built-in command that reads one line from the standard input and places the result in the variable x. It returns a non-zero exit status if either an end-of-file is read or an interrupt is received.

3.7 Command execution

To run a command (other than a built-in) the shell first creates a new process using the system call fork. The execution environment for the command includes input, output and the states of signals, and is established in the child process before the command is executed. The built-in command exec is used in the rare cases when no fork is required and simply replaces the shell with a new command. For example, a simple version of the nohup command looks like

     trap '' 1 2 3 15
     exec $*

The trap turns off the signals specified so that they are ignored by subsequently created commands and exec replaces the shell by the command specified.

Most forms of input output redirection have already been described. In the following word is only subject to parameter and command substitution. No file name generation or blank interpretation takes place so that, for example,

     echo ... >*.c

will write its output into a file whose name is *.c. Input output specifications are evaluated left to right as they appear in the command.

     > word     The standard output (file descriptor 1) is sent to the file word which is created if it does not already exist.
     >> word    The standard output is sent to file word. If the file exists then output is appended (by seeking to the end); otherwise the file is created.

     < word     The standard input (file descriptor 0) is taken from the file word.

     << word    The standard input is taken from the lines of shell input that follow up to but not including a line consisting only of word. If word is quoted then no interpretation of the document occurs. If word is not quoted then parameter and command substitution occur and \ is used to quote the characters \ $ ` and the first character of word. In the latter case \newline is ignored (c.f. quoted strings).

     >& digit   The file descriptor digit is duplicated using the system call dup (2) and the result is used as the standard output.

     <& digit   The standard input is duplicated from file descriptor digit.

     <&-        The standard input is closed.

     >&-        The standard output is closed.

Any of the above may be preceded by a digit in which case the file descriptor created is that specified by the digit instead of the default 0 or 1. For example,

     ... 2>file

runs a command with message output (file descriptor 2) directed to file.

     ... 2>&1

runs a command with its standard output and message output merged. (Strictly speaking file descriptor 2 is created by duplicating file descriptor 1 but the effect is usually to merge the two streams.)

The environment for a command run in the background such as

     list *.c | lpr &

is modified in two ways. Firstly, the default standard input for such a command is the empty file /dev/null. This prevents two processes (the shell and the command), which are running in parallel, from trying to read the same input. Chaos would ensue if this were not the case. For example,

     ed file &

would allow both the editor and the shell to read from the same input at the same time.
The other modification to the environment of a background command is to turn off the QUIT and INTERRUPT signals so that they are ignored by the command. This allows these signals to be used at the terminal without causing background commands to terminate. For this reason the UNIX convention for a signal is that if it is set to 1 (ignored) then it is never changed even for a short time. Note that the shell command trap has no effect for an ignored signal.

3.8 Invoking the shell

The following flags are interpreted by the shell when it is invoked. If the first character of argument zero is a minus, then commands are read from the file .profile.

    -c string
        If the -c flag is present then commands are read from string.

    -s
        If the -s flag is present or if no arguments remain then commands are read from the standard input. Shell output is written to file descriptor 2.

    -i
        If the -i flag is present or if the shell input and output are attached to a terminal (as told by gtty) then this shell is interactive. In this case TERMINATE is ignored (so that kill 0 does not kill an interactive shell) and INTERRUPT is caught and ignored (so that wait is interruptable). In all cases QUIT is ignored by the shell.

Acknowledgements

The design of the shell is based in part on the original UNIX shell [3] and the PWB/UNIX shell [4], some features having been taken from both. Similarities also exist with the command interpreters of the Cambridge Multiple Access System [5] and of CTSS [6].

I would like to thank Dennis Ritchie and John Mashey for many discussions during the design of the shell. I am also grateful to the members of the Computing Science Research Center and to Joe Maranzano for their comments on drafts of this document.

References

1. B. W. Kernighan, UNIX for Beginners, 1978.
2. K. Thompson and D. M. Ritchie, UNIX Programmer’s Manual, Bell Laboratories, 1978. Seventh Edition.
3. K. Thompson, “The UNIX Command Language,” in Structured Programming - Infotech State of the Art Report, pp. 375-384, Infotech International Ltd., Nicholson House, Maidenhead, Berkshire, England, March 1975.
4. J. R. Mashey, PWB/UNIX Shell Tutorial, September 30, 1977.
5. D. F. Hartley (Ed.), The Cambridge Multiple Access System - Users Reference Manual, University Mathematical Laboratory, Cambridge, England, 1968.
6. P. A. Crisman (Ed.), The Compatible Time-Sharing System, M.I.T. Press, Cambridge, Mass., 1965.

Appendix A - Grammar

    item:           word
                    input-output
                    name = value

    simple-command: item
                    simple-command item

    command:        simple-command
                    ( command-list )
                    { command-list }
                    for name do command-list done
                    for name in word ... do command-list done
                    while command-list do command-list done
                    until command-list do command-list done
                    case word in case-part ... esac
                    if command-list then command-list else-part fi

    pipeline:       command
                    pipeline | command

    andor:          pipeline
                    andor && pipeline
                    andor || pipeline

    command-list:   andor
                    command-list ;
                    command-list &
                    command-list ; andor
                    command-list & andor

    input-output:   > file
                    < file
                    >> word
                    << word

    file:           word
                    & digit
                    & -

    case-part:      pattern ) command-list ;;

    pattern:        word
                    pattern | word

    else-part:      elif command-list then command-list else-part
                    else command-list
                    empty

    empty:

    word:           a sequence of non-blank characters

    name:           a sequence of letters, digits or underscores starting with a letter

    digit:          0 1 2 3 4 5 6 7 8 9

Appendix B - Meta-characters and Reserved Words

a) syntactic

    |       pipe symbol
    &&      ‘andf’ symbol
    ||      ‘orf’ symbol
    ;       command separator
    ;;      case delimiter
    &       background commands
    ( )     command grouping
    <       input redirection
    <<      input from a here document
    >       output creation
    >>      output append

b) patterns

    *       match any character(s) including none
    ?       match any single character
    [...]   match any of the enclosed characters

c) substitution

    ${...}  substitute shell variable
    `...`   substitute command output

d) quoting

    \       quote the next character
    '...'   quote the enclosed characters except for '
    "..."   quote the enclosed characters except for $ ` \ "

e) reserved words

    if then else elif fi
    case in esac
    for while until do done
    { }

An introduction to the C shell

William Joy
Computer Science Division
Department of Electrical Engineering and Computer Science
University of California, Berkeley
Berkeley, California 94720

Introduction

A shell is a command language interpreter. Csh is the name of one particular command interpreter on UNIX. The primary purpose of csh is to translate command lines typed at a terminal into system actions, such as invocation of other programs. Csh is a user program just like any you might write. Hopefully, csh will be a very useful program for you in interacting with the UNIX system.

In addition to this document, you will want to refer to a copy of the UNIX programmer’s manual. The csh documentation in the manual provides a full description of all features of the shell and is a final reference for questions about the shell.

Many words in this document are shown in italics. These are important words: names of commands, and words which have special meaning in discussing the shell and UNIX. Many of the words are defined in a glossary at the end of this document. If you don’t know what is meant by a word, you should look for it in the glossary.

Acknowledgements

Numerous people have provided good input about previous versions of csh and aided in its debugging and in the debugging of its documentation. I would especially like to thank Michael Ubell who made the crucial observation that history commands could be done well over the word structure of input text, and implemented a prototype history mechanism in an older version of the shell.
Eric Allman has also provided a large number of useful comments on the shell, helping to unify those concepts which are present and to identify and eliminate useless and marginally useful features. Mike O’Brien suggested the pathname hashing mechanism which speeds command execution. Jim Kulp added the job control and directory stack primitives and added their documentation to this introduction.

1. Terminal usage of the shell

1.1. The basic notion of commands

A shell in UNIX acts mostly as a medium through which other programs are invoked. While it has a set of builtin functions which it performs directly, most commands cause execution of programs that are, in fact, external to the shell. The shell is thus distinguished from the command interpreters of other systems both by the fact that it is just a user program, and by the fact that it is used almost exclusively as a mechanism for invoking other programs.

Commands in the UNIX system consist of a list of strings or words interpreted as a command name followed by arguments. Thus the command

    mail bill

consists of two words. The first word mail names the command to be executed, in this case the mail program which sends messages to other users. The shell uses the name of the command in attempting to execute it for you. It will look in a number of directories for a file with the name mail which is expected to contain the mail program.

The rest of the words of the command are given as arguments to the command itself when it is executed. In this case we specified also the argument bill which is interpreted by the mail program to be the name of a user to whom mail is to be sent. In normal terminal usage we might use the mail command as follows.

    % mail bill
    I have a question about the csh documentation.
    My document seems to be missing page 5.
    Does a page five exist?
        Bill
    EOT
    %

Here we typed a message to send to bill and ended this message with a ^D which sent an end-of-file to the mail program. (Here and throughout this document, the notation ‘^x’ is to be read ‘control-x’ and represents the striking of the x key while the control key is held down.) The mail program then echoed the characters ‘EOT’ and transmitted our message. The characters ‘% ’ were printed before and after the mail command by the shell to indicate that input was needed.

After typing the ‘% ’ prompt the shell was reading command input from our terminal. We typed a complete command ‘mail bill’. The shell then executed the mail program with argument bill and went dormant waiting for it to complete. The mail program then read input from our terminal until we signalled an end-of-file via typing a ^D after which the shell noticed that mail had completed and signaled us that it was ready to read from the terminal again by printing another ‘% ’ prompt.

This is the essential pattern of all interaction with UNIX through the shell. A complete command is typed at the terminal, the shell executes the command and when this execution completes, it prompts for a new command. If you run the editor for an hour, the shell will patiently wait for you to finish editing and obediently prompt you again whenever you finish editing.

An example of a useful command you can execute now is the tset command, which sets the default erase and kill characters on your terminal: the erase character erases the last character you typed and the kill character erases the entire line you have entered so far. By default, the erase character is ‘#’ and the kill character is ‘@’. Most people who use CRT displays prefer to use the backspace (^H) character as their erase character since it is then easier to see what you have typed so far.
You can make this be true by typing

    tset -e

which tells the program tset to set the erase character, and its default setting for this character is a backspace.

1.2. Flag arguments

A useful notion in UNIX is that of a flag argument. While many arguments to commands specify file names or user names, some arguments rather specify an optional capability of the command which you wish to invoke. By convention, such arguments begin with the character ‘-’ (hyphen). Thus the command

    ls

will produce a list of the files in the current working directory. The option -s is the size option, and

    ls -s

causes ls to also give, for each file, the size of the file in blocks of 512 characters. The manual section for each command in the UNIX reference manual gives the available options for each command. The ls command has a large number of useful and interesting options. Most other commands have either no options or only one or two options. It is hard to remember options of commands which are not used very frequently, so most UNIX utilities perform only one or two functions rather than having a large number of hard to remember options.

1.3. Output to files

Commands that normally read input or write output on the terminal can also be executed with this input and/or output done to a file.

Thus suppose we wish to save the current date in a file called ‘now’. The command

    date

will print the current date on our terminal. This is because our terminal is the default standard output for the date command and the date command prints the date on its standard output. The shell lets us redirect the standard output of a command through a notation using the metacharacter ‘>’ and the name of the file where output is to be placed. Thus the command

    date > now

runs the date command such that its standard output is the file ‘now’ rather than the terminal. Thus this command places the current date and time into the file ‘now’.
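
The effect of this redirection can be checked with a small sketch. It is written for a POSIX-style shell, on the assumption that the ‘>’ notation behaves the same way there as in csh; the scratch directory is also an assumption, made only so that the sketch does not disturb existing files.

```shell
#!/bin/sh
# Sketch only: work in a scratch directory (an assumption, for tidiness).
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# Run date with its standard output redirected to the file 'now'.
# Capturing what would have gone to the terminal shows that nothing did.
shown=$(date > now)

# The file 'now' was created by the redirection and holds one line.
lines=$(wc -l < now | tr -d ' ')
cat now

cd / && rm -rf "$dir"
```

Here the variable shown ends up empty, while the file holds the single line that date would otherwise have printed.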
It is important to know that the date command was unaware that its output was going to a file rather than to the terminal. The shell performed this redirection before the command began executing.

One other thing to note here is that the file ‘now’ need not have existed before the date command was executed; the shell would have created the file if it did not exist. And if the file did exist? If it had existed previously these previous contents would have been discarded! A shell option noclobber exists to prevent this from happening accidentally; it is discussed in section 2.2.

The system normally keeps files which you create with ‘>’ and all other files. Thus the default is for files to be permanent. If you wish to create a file which will be removed automatically, you can begin its name with a ‘#’ character; this ‘scratch’ character denotes the fact that the file will be a scratch file.* The system will remove such files after a couple of days, or sooner if file space becomes very tight. Thus, in running the date command above, we don’t really want to save the output forever, so we would more likely do

    date > #now

*Note that if your erase character is a ‘#’, you will have to precede the ‘#’ with a ‘\’. The fact that the ‘#’ character is the old (pre-CRT) standard erase character means that it seldom appears in a file name, and allows this convention to be used for scratch files. If you are using a CRT, your erase character should be a ^H, as we demonstrated in section 1.1 how this could be set up.

1.4. Metacharacters in the shell

The shell has a large number of special characters (like ‘>’) which indicate special functions. We say that these notations have syntactic and semantic meaning to the shell. In general, most characters which are neither letters nor digits have special meaning to the shell. We shall shortly learn a means of quotation which allows us to use metacharacters without the shell treating them in any special way.
Metacharacters normally have effect only when the shell is reading our input. We need not worry about placing shell metacharacters in a letter we are sending via mail, or when we are typing in text or data to some other program. Note that the shell is only reading input when it has prompted with ‘% ’.

1.5. Input from files; pipelines

We learned above how to redirect the standard output of a command to a file. It is also possible to redirect the standard input of a command from a file. This is not often necessary since most commands will read from a file whose name is given as an argument. We can give the command

    sort < data

to run the sort command with standard input, where the command normally reads its input, from the file ‘data’. We would more likely say

    sort data

letting the sort command open the file ‘data’ for input itself since this is less to type.

We should note that if we just typed

    sort

then the sort program would sort lines from its standard input. Since we did not redirect the standard input, it would sort lines as we typed them on the terminal until we typed a ^D to indicate an end-of-file.

A most useful capability is the ability to combine the standard output of one command with the standard input of another, i.e. to run the commands in a sequence known as a pipeline. For instance the command

    ls -s

normally produces a list of the files in our directory with the size of each in blocks of 512 characters. If we are interested in learning which of our files is largest we may wish to have this sorted by size rather than by name, which is the default way in which ls sorts. We could look at the many options of ls to see if there was an option to do this but would eventually discover that there is not. Instead we can use a couple of simple options of the sort command, combining it with ls to get what we want.

The -n option of sort specifies a numeric sort rather than an alphabetic sort.
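
The difference between the two kinds of sort is easy to see on a few lines of made-up data. This sketch uses a POSIX-style shell and the standard printf, sort and paste utilities; the three sample sizes are invented, and LC_ALL=C is set only to make the alphabetic ordering predictable.

```shell
#!/bin/sh
# Sketch only: the three sample sizes are made-up data.
LC_ALL=C; export LC_ALL

# Without -n, sort compares the lines character by character.
alpha=$(printf '10\n9\n100\n' | sort | paste -s -d ' ' -)

# With -n, sort compares the numeric values of the lines.
numeric=$(printf '10\n9\n100\n' | sort -n | paste -s -d ' ' -)

echo "alphabetic: $alpha"      # alphabetic: 10 100 9
echo "numeric:    $numeric"    # numeric:    9 10 100
```

Each line of the sketch is itself a small pipeline: the output of printf is run into sort, and the output of sort into paste, which joins the lines for display.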
Thus

    ls -s | sort -n

specifies that the output of the ls command run with the option -s is to be piped to the command sort run with the numeric sort option. This would give us a sorted list of our files by size, but with the smallest first. We could then use the -r reverse sort option and the head command in combination with the previous command, doing

    ls -s | sort -n -r | head -5

Here we have taken a list of our files sorted alphabetically, each with the size in blocks. We have run this to the standard input of the sort command asking it to sort numerically in reverse order (largest first). This output has then been run into the command head which gives us the first few lines. In this case we have asked head for the first 5 lines. Thus this command gives us the names and sizes of our 5 largest files.

The notation introduced above is called the pipe mechanism. Commands separated by ‘|’ characters are connected together by the shell and the standard output of each is run into the standard input of the next. The leftmost command in a pipeline will normally take its standard input from the terminal and the rightmost will place its standard output on the terminal. Other examples of pipelines will be given later when we discuss the history mechanism; one important use of pipes which is illustrated there is in the routing of information to the line printer.

1.6. Filenames

Many commands to be executed will need the names of files as arguments. UNIX pathnames consist of a number of components separated by ‘/’. Each component except the last names a directory in which the next component resides, in effect specifying the path of directories to follow to reach the file. Thus the pathname

    /etc/motd

specifies a file in the directory ‘etc’ which is a subdirectory of the root directory ‘/’. Within this directory the file named is ‘motd’ which stands for ‘message of the day’.
A pathname that begins with a slash is said to be an absolute pathname since it is specified from the absolute top of the entire directory hierarchy of the system (the root). Pathnames which do not begin with ‘/’ are interpreted as starting in the current working directory, which is, by default, your home directory and can be changed dynamically by the cd change directory command. Such pathnames are said to be relative to the working directory since they are found by starting in the working directory and descending to lower levels of directories for each component of the pathname. If the pathname contains no slashes at all then the file is contained in the working directory itself and the pathname is merely the name of the file in this directory. Absolute pathnames have no relation to the working directory.

Most filenames consist of a number of alphanumeric characters and ‘.’s (periods). In fact, all printing characters except ‘/’ (slash) may appear in filenames. It is inconvenient to have most non-alphabetic characters in filenames because many of these have special meaning to the shell. The character ‘.’ (period) is not a shell metacharacter and is often used to separate the extension of a file name from the base of the name. Thus

    prog.c prog.o prog.errs prog.output

are four related files. They share a base portion of a name (a base portion being that part of the name that is left when a trailing ‘.’ and following characters which are not ‘.’ are stripped off). The file ‘prog.c’ might be the source for a C program, the file ‘prog.o’ the corresponding object file, the file ‘prog.errs’ the errors resulting from a compilation of the program and the file ‘prog.output’ the output of a run of the program.

If we wished to refer to all four of these files in a command, we could use the notation

    prog.*

This word is expanded by the shell, before the command to which it is an argument is executed, into a list of names which begin with ‘prog.’.
The character ‘*’ here matches any sequence (including the empty sequence) of characters in a file name. The names which match are alphabetically sorted and placed in the argument list of the command. Thus the command

    echo prog.*

will echo the names

    prog.c prog.errs prog.o prog.output

Note that the names are in sorted order here, and a different order than we listed them above. The echo command receives four words as arguments, even though we only typed one word as an argument directly. The four words were generated by filename expansion of the one input word.

Other notations for filename expansion are also available. The character ‘?’ matches any single character in a filename. Thus

    echo ? ?? ???

will echo a line of filenames; first those with one character names, then those with two character names, and finally those with three character names. The names of each length will be independently sorted.

Another mechanism consists of a sequence of characters between ‘[’ and ‘]’. This metasequence matches any single character from the enclosed set. Thus

    prog.[co]

will match

    prog.c prog.o

in the example above. We can also place two characters around a ‘-’ in this notation to denote a range. Thus

    chap.[1-5]

might match files

    chap.1 chap.2 chap.3 chap.4 chap.5

if they existed. This is shorthand for

    chap.[12345]

and otherwise equivalent.

An important point to note is that if a list of argument words to a command (an argument list) contains filename expansion syntax, and if this filename expansion syntax fails to match any existing file names, then the shell considers this to be an error and prints a diagnostic

    No match.

and does not execute the command.

Another very important point is that files with the character ‘.’ at the beginning are treated specially. Neither ‘*’ nor ‘?’ nor the ‘[’ ‘]’ mechanism will match it.
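
The expansions described above can be tried on a handful of empty files. This is a minimal sketch for a POSIX-style shell, where the same ‘*’, ‘?’ and ‘[’ ‘]’ patterns are available; the file names follow the examples in the text, while the scratch directory and the name .hidden are assumptions made for the demonstration.

```shell
#!/bin/sh
# Sketch only: the scratch directory and '.hidden' are assumptions.
LC_ALL=C; export LC_ALL
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch prog.c prog.o prog.errs prog.output chap.1 chap.2 chap.7 a bb .hidden

star=$(echo prog.*)        # any sequence of characters after 'prog.'
cset=$(echo prog.[co])     # exactly one character from the set c, o
range=$(echo chap.[1-5])   # one character in the range 1 through 5
short=$(echo ? ??)         # one-character names, then two-character names
all=$(echo *)              # every name that does not begin with '.'

echo "$star"; echo "$cset"; echo "$range"; echo "$short"; echo "$all"
cd / && rm -rf "$dir"
```

One difference is worth keeping in mind when trying this: a POSIX-style shell leaves a pattern that matches nothing exactly as typed, whereas csh reports ‘No match’ as described above. The last line of output lists every file except .hidden, since none of these patterns matches a leading ‘.’.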
This prevents accidental matching of the filenames ‘.’ and ‘..’ in the working directory which have special meaning to the system, as well as other files such as .cshrc which are not normally visible. We will discuss the special role of the file .cshrc later.

Another filename expansion mechanism gives access to the pathname of the home directory of other users. This notation consists of the character ‘~’ (tilde) followed by another user’s login name. For instance the word ‘~bill’ would map to the pathname ‘/usr/bill’ if the home directory for ‘bill’ was ‘/usr/bill’. Since, on large systems, users may have login directories scattered over many different disk volumes with different prefix directory names, this notation provides a reliable way of accessing the files of other users.

A special case of this notation consists of a ‘~’ alone, e.g. ‘~/mbox’. This notation is expanded by the shell into the file ‘mbox’ in your home directory, i.e. into ‘/usr/bill/mbox’ for me on Ernie Co-vax, the UCB Computer Science Department VAX machine, where this document was prepared. This can be very useful if you have used cd to change to another directory and have found a file you wish to copy using cp. If I give the command

    cp thatfile ~

the shell will expand this command to

    cp thatfile /usr/bill

since my home directory is /usr/bill.

There also exists a mechanism using the characters ‘{’ and ‘}’ for abbreviating a set of words which have common parts but cannot be abbreviated by the above mechanisms because they are not files, are the names of files which do not yet exist, or are not thus conveniently described. This mechanism will be described much later, in section 4.2, as it is used less frequently.

1.7. Quotation

We have already seen a number of metacharacters used by the shell. These metacharacters pose a problem in that we cannot use them directly as parts of words. Thus the command

    echo *

will not echo the character ‘*’.
It will either echo a sorted list of filenames in the current working directory, or print the message ‘No match’ if there are no files in the working directory.

The recommended mechanism for placing characters which are neither letters, digits, ‘/’, ‘.’ nor ‘-’ in an argument word to a command is to enclose it with single quotation characters ('), i.e.

    echo '*'

There is one special character ‘!’ which is used by the history mechanism of the shell and which cannot be escaped by placing it within single quotation characters. It and the single quotation character itself can be preceded by a single ‘\’ to prevent their special meaning. Thus

    echo \'\!

prints

    '!

These two mechanisms suffice to place any printing character into a word which is an argument to a shell command. They can be combined, as in

    echo \''*'

which prints

    '*

since the first ‘\’ escaped the first single quotation character and the ‘*’ was enclosed between single quotation characters.

1.8. Terminating commands

When you are executing a command and the shell is waiting for it to complete there are several ways to force it to stop. For instance if you type the command

    cat /etc/passwd

the system will print a copy of a list of all users of the system on your terminal. This is likely to continue for several minutes unless you stop it. You can send an INTERRUPT signal to the cat command by typing the DEL or RUBOUT key on your terminal.* Since cat does not take any precautions to avoid or otherwise handle this signal the INTERRUPT will cause it to terminate. The shell notices that cat has terminated and prompts you again with ‘% ’. If you hit INTERRUPT again, the shell will just repeat its prompt since it handles INTERRUPT signals and chooses to continue to execute commands rather than terminating like cat did, which would have the effect of logging you out.

Another way in which many programs terminate is when they get an end-of-file from their standard input.
Thus the mail program in the first example above was terminated when we typed a ^D which generates an end-of-file from the standard input. The shell also terminates when it gets an end-of-file, printing ‘logout’; UNIX then logs you off the system. Since this means that typing too many ^D’s can accidentally log us off, the shell has a mechanism for preventing this. This ignoreeof option will be discussed in section 2.2.

If a command has its standard input redirected from a file, then it will normally terminate when it reaches the end of this file. Thus if we execute

    mail bill < prepared.text

the mail command will terminate without our typing a ^D. This is because it read to the end-of-file of our file ‘prepared.text’ in which we placed a message for ‘bill’ with an editor program. We could also have done

    cat prepared.text | mail bill

since the cat command would then have written the text through the pipe to the standard input of the mail command. When the cat command completed it would have terminated, closing down the pipeline and the mail command would have received an end-of-file from it and terminated. Using a pipe here is more complicated than redirecting input so we would more likely use the first form. These commands could also have been stopped by sending an INTERRUPT.

Another possibility for stopping a command is to suspend its execution temporarily, with the possibility of continuing execution later. This is done by sending a STOP signal via typing a ^Z. This signal causes all commands running on the terminal (usually one but more if a pipeline is executing) to become suspended. The shell notices that the command(s) have been suspended, types ‘Stopped’ and then prompts for a new command. The previously executing command has been suspended, but is otherwise unaffected by the STOP signal. Any other commands can be executed while the original command remains suspended. The suspended command can be continued using the fg command with no arguments. The shell will then retype the command to remind you which command is being continued, and cause the command to resume execution. Unless any input files in use by the suspended command have been changed in the meantime, the suspension has no effect whatsoever on the execution of the command. This feature can be very useful during editing, when you need to look at another file before continuing.

*Many users use stty(1) to change the interrupt character to ^C.

An example of command suspension follows.

    % mail harold
    Someone just copied a big file into my directory and its name is
    ^Z
    Stopped
    % ls
    funnyfile
    prog.c
    prog.o
    % jobs
    [1]  + Stopped          mail harold
    % fg
    mail harold
    funnyfile. Do you know who did it?
    EOT
    %

In this example someone was sending a message to Harold and forgot the name of the file he wanted to mention. The mail command was suspended by typing ^Z. When the shell noticed that the mail program was suspended, it typed ‘Stopped’ and prompted for a new command. Then the ls command was typed to find out the name of the file. The jobs command was run to find out which command was suspended. At this time the fg command was typed to continue execution of the mail program. Input to the mail program was then continued and ended with a ^D which indicated the end of the message at which time the mail program typed EOT. The jobs command will show which commands are suspended. The ^Z should only be typed at the beginning of a line since everything typed on the current line is discarded when a signal is sent from the keyboard. This also happens on INTERRUPT and QUIT signals. More information on suspending jobs and controlling them is given in section 2.6.

If you write or run programs which are not fully debugged then it may be necessary to stop them somewhat ungracefully.
This can be done by sending them a QUIT signal, sent by typing a ^\. This will usually provoke the shell to produce a message like:

    Quit (Core dumped)

indicating that a file ‘core’ has been created containing information about the program ‘a.out’s state when it terminated due to the QUIT signal. You can examine this file yourself, or forward information to the maintainer of the program telling him/her where the core file is.

If you run background commands (as explained in section 2.6) then these commands will ignore INTERRUPT and QUIT signals at the terminal. To stop them you must use the kill command. See section 2.6 for an example.

If you want to examine the output of a command without having it move off the screen as the output of the

    cat /etc/passwd

command will, you can use the command

    more /etc/passwd

The more program pauses after each complete screenful and types ‘--More--’ at which point you can hit a space to get another screenful, a return to get another line, or a ‘q’ to end the more program. You can also use more as a filter, i.e.

    cat /etc/passwd | more

works just like the simpler more command above.

For stopping output of commands not involving more you can use the ^S key to stop the typeout. The typeout will resume when you hit ^Q or any other key, but ^Q is normally used because it only restarts the output and does not become input to the program which is running. This works well on low-speed terminals, but at 9600 baud it is hard to type ^S and ^Q fast enough to paginate the output nicely, and a program like more is usually used.

An additional possibility is to use the ^O flush output character; when this character is typed, all output from the current command is thrown away (quickly) until the next input read occurs or until the next shell prompt.
This can be used to allow a command to complete without having to suffer through the output on a slow terminal; ^O is a toggle, so flushing can be turned off by typing ^O again while output is being flushed.

1.9. What now?

We have so far seen a number of mechanisms of the shell and learned a lot about the way in which it operates. The remaining sections will go yet further into the internals of the shell, but you will surely want to try using the shell before you go any further. To try it you can log in to UNIX and type the following command to the system:

    chsh myname /bin/csh

Here ‘myname’ should be replaced by the name you typed to the system prompt of ‘login:’ to get onto the system. Thus I would use ‘chsh bill /bin/csh’. You only have to do this once; it takes effect at next login. You are now ready to try using csh.

Before you do the ‘chsh’ command, the shell you are using when you log into the system is ‘/bin/sh’. In fact, much of the above discussion is applicable to ‘/bin/sh’. The next section will introduce many features particular to csh so you should change your shell to csh before you begin reading it.

2. Details on the shell for terminal users

2.1. Shell startup and termination

When you login, the shell is started by the system in your home directory and begins by reading commands from a file .cshrc in this directory. All shells which you may start during your terminal session will read from this file. We will later see what kinds of commands are usefully placed there. For now we need not have this file and the shell does not complain about its absence.

A login shell, executed after you login to the system, will, after it reads commands from .cshrc, read commands from a file .login also in your home directory. This file contains commands which you wish to do each time you login to the UNIX system.
My .login file looks something like:

    set ignoreeof
    set mail=(/usr/spool/mail/bill)
    echo "${prompt}users" ; users
    alias ts \
        'set noglob ; eval `tset -s -m dialup:c100rv4pna -m plugboard:?hp2621nl *`';
    ts; stty intr ^C kill ^U crt
    set time=15 history=10
    msgs -f
    if (-e $mail) then
        echo "${prompt}mail"
        mail
    endif

This file contains several commands to be executed by UNIX each time I login. The first is a set command which is interpreted directly by the shell. It sets the shell variable ignoreeof which causes the shell to not log me off if I hit ^D. Rather, I use the logout command to log off of the system. By setting the mail variable, I ask the shell to watch for incoming mail to me. Every 5 minutes the shell looks for this file and tells me if more mail has arrived there. An alternative to this is to put the command

    biff y

in place of this set; this will cause me to be notified immediately when mail arrives, and to be shown the first few lines of the new message.

Next I set the shell variable ‘time’ to ‘15’ causing the shell to automatically print out statistics lines for commands which execute for at least 15 seconds of CPU time. The variable ‘history’ is set to 10 indicating that I want the shell to remember the last 10 commands I type in its history list (described later).

I create an alias ‘ts’ which executes a tset(1) command setting up the modes of the terminal. The parameters to tset indicate the kinds of terminal which I usually use when not on a hardwired port. I then execute ‘ts’ and also use the stty command to change the interrupt character to ^C and the line kill character to ^U.

I then run the ‘msgs’ program, which provides me with any system messages which I have not seen before; the ‘-f’ option here prevents it from telling me anything if there are no new messages. Finally, if my mailbox file exists, then I run the ‘mail’ program to process my mail.
When the ‘mail’ and ‘msgs’ programs finish, the shell will finish processing my .login file and begin reading commands from the terminal, prompting for each with ‘% ’. When I log off (by giving the logout command) the shell will print ‘logout’ and execute commands from the file ‘.logout’ if it exists in my home directory. After that the shell will terminate and UNIX will log me off the system. If the system is not going down, I will receive a new login message. In any case, after the ‘logout’ message the shell is committed to terminating and will take no further input from my terminal.

2.2. Shell variables

The shell maintains a set of variables. We saw above the variables history and time which had values ‘10’ and ‘15’. In fact, each shell variable has as value an array of zero or more strings. Shell variables may be assigned values by the set command. It has several forms, the most useful of which was given above and is

    set name=value

Shell variables may be used to store values which are to be used in commands later through a substitution mechanism. The shell variables most commonly referenced are, however, those which the shell itself refers to. By changing the values of these variables one can directly affect the behavior of the shell.

One of the most important variables is the variable path. This variable contains a sequence of directory names where the shell searches for commands. The set command with no arguments shows the value of all variables currently defined (we usually say set) in the shell. The default value for path will be shown by set to be

    % set
    argv    ()
    cwd     /usr/bill
    home    /usr/bill
    path    (. /usr/ucb /bin /usr/bin)
    prompt  %
    shell   /bin/csh
    status  0
    term    c100rv4pna
    user    bill
    %

This output indicates that the variable path points to the current directory ‘.’ and then ‘/usr/ucb’, ‘/bin’ and ‘/usr/bin’. Commands which you may write might be in ‘.’ (usually one of your directories).
Commands developed at Berkeley live in ‘/usr/ucb’ while commands developed at Bell Laboratories live in ‘/bin’ and ‘/usr/bin’.

A number of locally developed programs on the system live in the directory ‘/usr/local’. If we wish all shells which we invoke to have access to these new programs we can place the command

    set path=(. /usr/ucb /bin /usr/bin /usr/local)

in our file .cshrc in our home directory. Try doing this and then logging out and back in and do

    set

again to see that the value assigned to path has changed.

One thing you should be aware of is that the shell examines each directory which you insert into your path and determines which commands are contained there. Except for the current directory ‘.’, which the shell treats specially, this means that if commands are added to a directory in your search path after you have started the shell, they will not necessarily be found by the shell. If you wish to use a command which has been added in this way, you should give the command

    rehash

to the shell, which will cause it to recompute its internal table of command locations, so that it will find the newly added command. Since the shell has to look in the current directory ‘.’ on each command, placing it at the end of the path specification usually works equivalently and reduces overhead.

Other useful built-in variables are the variable home which shows your home directory, cwd which contains your current working directory, and the variable ignoreeof which can be set in your .login file to tell the shell not to exit when it receives an end-of-file from a terminal (as described above). The variable ‘ignoreeof’ is one of several variables which the shell does not care about the value of, only whether they are set or unset. Thus to set this variable you simply do

    set ignoreeof

and to unset it do

    unset ignoreeof

These give the variable ‘ignoreeof’ no value, but none is desired or required.
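The table of command locations and the effect of rehash, described earlier in this section, can be pictured with a small model. The sketch below is written in Python rather than in the shell's own language and is only an illustration under stated assumptions: the name CommandTable and its methods are invented here, the shell's real table is organized differently inside, and file permissions are ignored. It shows why a command added to a directory after the table is built is not found until the table is recomputed, while ‘.’ is examined afresh on every lookup.

```python
import os
import tempfile


class CommandTable:
    """Toy model of a shell's table of command locations (hypothetical, not csh code)."""

    def __init__(self, path):
        self.path = list(path)
        self.table = {}
        self.rehash()

    def rehash(self):
        """Rescan every path directory except '.' and rebuild the table."""
        table = {}
        for directory in self.path:
            if directory == ".":
                continue  # '.' is never cached; it is checked on every lookup
            try:
                names = os.listdir(directory)
            except OSError:
                continue  # a missing or unreadable directory contributes nothing
            for name in names:
                # The first directory in path order wins, as in a path search.
                table.setdefault(name, directory)
        self.table = table

    def lookup(self, name, cwd="."):
        """Return the directory that supplies `name`, or None if it is not found."""
        for directory in self.path:
            if directory == ".":
                # The current directory is inspected directly each time.
                if os.path.isfile(os.path.join(cwd, name)):
                    return cwd
            elif self.table.get(name) == directory:
                return directory
        return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as local:
        shell = CommandTable([local])
        open(os.path.join(local, "newcmd"), "w").close()
        print(shell.lookup("newcmd"))           # stale table: not found
        shell.rehash()
        print(shell.lookup("newcmd") == local)  # found after rehash
```

In this model the cost of keeping ‘.’ in the path is visible in lookup: every search that reaches ‘.’ touches the file system, which is the overhead that placing ‘.’ last is meant to reduce.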
Finally, some other built-in shell variables of use are the variables noclobber and mail. The metasyntax

    > filename

which redirects the standard output of a command will overwrite and destroy the previous contents of the named file. In this way you may accidentally overwrite a file which is valuable. If you would prefer that the shell not overwrite files in this way you can

    set noclobber

in your .login file. Then trying to do

    date > now

would cause a diagnostic if ‘now’ existed already. You could type

    date >! now

if you really wanted to overwrite the contents of ‘now’. The ‘>!’ is a special metasyntax indicating that clobbering the file is ok.†

2.3. The shell's history list

The shell can maintain a history list into which it places the words of previous commands. It is possible to use a notation to reuse commands or words from commands in forming new commands. This mechanism can be used to repeat previous commands or to correct minor typing mistakes in commands.

The following figure gives a sample session involving typical usage of the history mechanism of the shell. In this example we have a very simple C program which has a bug (or two) in it in the file ‘bug.c’, which we ‘cat’ out on our terminal. We then try to run the C compiler on it, referring to the file again as ‘!$’, meaning the last argument to the previous command. Here the ‘!’ is the history mechanism invocation metacharacter, and the ‘$’ stands for the last argument, by analogy to ‘$’ in the editor which stands for the end of the line. The shell echoed the command, as it would have been typed without use of the history mechanism, and then executed it. The compilation yielded error diagnostics so we now run the editor on the file we were trying to compile, fix the bug, and run the C compiler again, this time referring to this command simply as ‘!c’, which repeats the last command which started with the letter ‘c’.
If there were other commands starting with ‘c’ done recently we could have said ‘!cc’ or even ‘!cc:p’ which would have printed the last command starting with ‘cc’ without executing it.

†The space between the ‘!’ and the word ‘now’ is critical here, as ‘!now’ would be an invocation of the history mechanism, and have a totally different effect.

    % cat bug.c
    main()

    {
            printf("hello);
    }
    % cc !$
    cc bug.c
    "bug.c", line 4: newline in string or char constant
    "bug.c", line 5: syntax error
    % ed !$
    ed bug.c
    29
    4s/);/"&/p
            printf("hello");
    w
    30
    q
    % !c
    cc bug.c
    % a.out
    hello% !e
    ed bug.c
    30
    4s/lo/lo\\n/p
            printf("hello\n");
    w
    32
    q
    % !c -o bug
    cc bug.c -o bug
    % size a.out bug
    a.out: 2784+364+1028 = 4176b = 0x1050b
    bug: 2784+364+1028 = 4176b = 0x1050b
    % ls -l !*
    ls -l a.out bug
    -rwxr-xr-x 1 bill     3932 Dec 19 09:41 a.out
    -rwxr-xr-x 1 bill     3932 Dec 19 09:42 bug
    % bug
    hello
    % num bug.c | spp
    spp: Command not found.
    % ^spp^ssp
    num bug.c | ssp
        1   main()
        3   {
        4           printf("hello\n");
        5   }
    % !! | lpr
    num bug.c | ssp | lpr
    %

After this recompilation, we ran the resulting ‘a.out’ file, and then noting that there still was a bug, ran the editor again. After fixing the program we ran the C compiler again, but tacked onto the command an extra ‘-o bug’ telling the compiler to place the resultant binary in the file ‘bug’ rather than ‘a.out’. In general, the history mechanisms may be used anywhere in the formation of new commands and other characters may be placed before and after the substituted commands.

We then ran the ‘size’ command to see how large the binary program images we have created were, and then an ‘ls -l’ command with the same argument list, denoting the argument list ‘*’. Finally we ran the program ‘bug’ to see that its output is indeed correct.

To make a numbered listing of the program we ran the ‘num’ command on the file ‘bug.c’.
In order to compress out blank lines in the output of ‘num’ we ran the output through the filter ‘ssp’, but misspelled it as ‘spp’. To correct this we used a shell substitute, placing the old text and new text between ‘^’ characters. This is similar to the substitute command in the editor. Finally, we repeated the same command with ‘!!’, but sent its output to the line printer.

There are other mechanisms available for repeating commands. The history command prints out a number of previous commands with numbers by which they can be referenced. There is a way to refer to a previous command by searching for a string which appeared in it, and there are other, less useful, ways to select arguments to include in a new command. A complete description of all these mechanisms is given in the C shell manual pages in the UNIX Programmer's Manual.

2.4. Aliases

The shell has an alias mechanism which can be used to make transformations on input commands. This mechanism can be used to simplify the commands you type, to supply default arguments to commands, or to perform transformations on commands and their arguments. The alias facility is similar to a macro facility. Some of the features obtained by aliasing can be obtained also using shell command files, but these take place in another instance of the shell and cannot directly affect the current shell's environment or involve commands such as cd which must be done in the current shell.

As an example, suppose that there is a new version of the mail program on the system called ‘newmail’ you wish to use, rather than the standard mail program which is called ‘mail’. If you place the shell command

    alias mail newmail

in your .cshrc file, the shell will transform an input line of the form

    mail bill

into a call on ‘newmail’. More generally, suppose we wish the command ‘ls’ to always show sizes of files, that is to always do ‘-s’. We can do

    alias ls ls -s

or even

    alias dir ls -s

creating a new command syntax ‘dir’ which does an ‘ls -s’.
If we say

    dir ~bill

then the shell will translate this to

    ls -s /mnt/bill

Thus the alias mechanism can be used to provide short names for commands, to provide default arguments, and to define new short commands in terms of other commands. It is also possible to define aliases which contain multiple commands or pipelines, showing where the arguments to the original command are to be substituted using the facilities of the history mechanism. Thus the definition

    alias cd 'cd \!* ; ls '

would do an ls command after each change directory cd command. We enclosed the entire alias definition in ‘'’ characters to prevent most substitutions from occurring and the character ‘;’ from being recognized as a metacharacter. The ‘!’ here is escaped with a ‘\’ to prevent it from being interpreted when the alias command is typed in. The ‘\!*’ here substitutes the entire argument list to the pre-aliasing cd command, without giving an error if there were no arguments. The ‘;’ separating commands is used here to indicate that one command is to be done and then the next. Similarly the definition

    alias whois 'grep \!^ /etc/passwd'

defines a command which looks up its first argument in the password file.

Warning: The shell currently reads the .cshrc file each time it starts up. If you place a large number of commands there, shells will tend to start slowly. A mechanism for saving the shell environment after reading the .cshrc file and quickly restoring it is under development, but for now you should try to limit the number of aliases you have to a reasonable number: 10 or 15 is reasonable, 50 or 60 will cause a noticeable delay in starting up shells, and make the system seem sluggish when you execute commands from within the editor and other programs.

2.5. More redirection; >> and >&

There are a few more notations useful to the terminal user which have not been introduced yet.
In addition to the standard output, commands also have a diagnostic output which is normally directed to the terminal even when the standard output is redirected to a file or a pipe. It is occasionally desirable to direct the diagnostic output along with the standard output. For instance if you want to redirect the output of a long running command into a file and wish to have a record of any error diagnostic it produces you can do

    command >& file

The ‘>&’ here tells the shell to route both the diagnostic output and the standard output into ‘file’. Similarly you can give the command

    command |& lpr

to route both standard and diagnostic output through the pipe to the line printer daemon lpr.‡

Finally, it is possible to use the form

    command >> file

to place output at the end of an existing file.†

‡A command form

    command >&! file

exists, and is used when noclobber is set and file already exists.

†If noclobber is set, then an error will result if file does not exist, otherwise the shell will create file if it doesn't exist. A form

    command >>! file

makes it not be an error for file to not exist when noclobber is set.

2.6. Jobs; Background, Foreground, or Suspended

When one or more commands are typed together as a pipeline or as a sequence of commands separated by semicolons, a single job is created by the shell consisting of these commands together as a unit. Single commands without pipes or semicolons create the simplest jobs. Usually, every line typed to the shell creates a job. Some lines that create jobs (one per line) are

    sort < data
    ls -s | sort -n | head -5
    mail harold

If the metacharacter ‘&’ is typed at the end of the commands, then the job is started as a background job. This means that the shell does not wait for it to complete but immediately prompts and is ready for another command.
The job runs in the background at the same time that normal jobs, called foreground jobs, continue to be read and executed by the shell one at a time. Thus

    du > usage &

would run the du program, which reports on the disk usage of your working directory (as well as any directories below it), put the output into the file ‘usage’ and return immediately with a prompt for the next command without waiting for du to finish. The du program would continue executing in the background until it finished, even though you can type and execute more commands in the meantime. When a background job terminates, a message is typed by the shell just before the next prompt telling you that the job has completed. In the following example the du job finishes sometime during the execution of the mail command and its completion is reported just before the prompt after the mail job is finished.

    % du > usage &
    [1] 503
    % mail bill
    How do you know when a background job is finished?
    EOT
    [1]  - Done                 du > usage
    %

If the job did not terminate normally the ‘Done’ message might say something else like ‘Killed’. If you want the terminations of background jobs to be reported at the time they occur (possibly interrupting the output of other foreground jobs), you can set the notify variable. In the previous example this would mean that the ‘Done’ message might have come right in the middle of the message to Bill. Background jobs are unaffected by any signals from the keyboard like the STOP, INTERRUPT, or QUIT signals mentioned earlier.

Jobs are recorded in a table inside the shell until they terminate. In this table, the shell remembers the command names, arguments and the process numbers of all commands in the job as well as the working directory where the job was started. Each job in the table is either running in the foreground with the shell waiting for it to terminate, running in the background, or suspended.
Only one job can be running in the foreground at one time, but several jobs can be suspended or running in the background at once. As each job is started, it is assigned a small identifying number called the job number which can be used later to refer to the job in the commands described below. Job numbers remain the same until the job terminates and then are re-used.

When a job is started in the background using ‘&’, its number, as well as the process numbers of all its (top level) commands, is typed by the shell before prompting you for another command. For example,

    % ls -s | sort -n > usage &
    [2] 2034 2035
    %

runs the ‘ls’ program with the ‘-s’ option, pipes this output into the ‘sort’ program with the ‘-n’ option which puts its output into the file ‘usage’. Since the ‘&’ was at the end of the line, these two programs were started together as a background job. After starting the job, the shell prints the job number in brackets (2 in this case) followed by the process number of each program started in the job. Then the shell immediately prompts for a new command, leaving the job running simultaneously.

As mentioned in section 1.8, foreground jobs become suspended by typing ^Z which sends a STOP signal to the currently running foreground job. A background job can become suspended by using the stop command described below. When jobs are suspended they merely stop any further progress until started again, either in the foreground or the background. The shell notices when a job becomes stopped and reports this fact, much like it reports the termination of background jobs. For foreground jobs this looks like

    % du > usage
    ^Z
    Stopped
    %

The ‘Stopped’ message is typed by the shell when it notices that the du program stopped.
For background jobs, using the stop command, it is

    % sort usage &
    [1] 2345
    % stop %1
    [1]  + Stopped (signal)     sort usage
    %

Suspending foreground jobs can be very useful when you need to temporarily change what you are doing (execute other commands) and then return to the suspended job. Also, foreground jobs can be suspended and then continued as background jobs using the bg command, allowing you to continue other work and stop waiting for the foreground job to finish. Thus

    % du > usage
    ^Z
    Stopped
    % bg
    [1]    du > usage &
    %

starts ‘du’ in the foreground, stops it before it finishes, then continues it in the background allowing more foreground commands to be executed. This is especially helpful when a foreground job ends up taking longer than you expected and you wish you had started it in the background in the beginning.

All job control commands can take an argument that identifies a particular job. All job name arguments begin with the character ‘%’, since some of the job control commands also accept process numbers (printed by the ps command). The default job (when no argument is given) is called the current job and is identified by a ‘+’ in the output of the jobs command, which shows you which jobs you have. When only one job is stopped or running in the background (the usual case) it is always the current job, thus no argument is needed. If a job is stopped while running in the foreground it becomes the current job and the existing current job becomes the previous job, identified by a ‘-’ in the output of jobs. When the current job terminates, the previous job becomes the current job. When given, the argument is either ‘%-’ (indicating the previous job); ‘%#’, where # is the job number; ‘%pref’ where pref is some unique prefix of the command name and arguments of one of the jobs; or ‘%?’ followed by some string found in only one of the jobs.
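The rules for job name arguments given above can be summarized in a short sketch. It is written in Python purely as an illustration and is not taken from the shell: the names resolve_job and JobError are invented here, the job table is reduced to a mapping from job number to command text, and only the argument forms described in the text are handled (a bare ‘%’ is rejected because the text does not describe it).

```python
class JobError(Exception):
    """Raised when a job argument does not identify exactly one job."""


def resolve_job(spec, jobs, current=None, previous=None):
    """Map a job argument to a job number.

    spec     -- None (no argument), '%-', '%N', '%prefix' or '%?string'
    jobs     -- dict of job number -> command text (a simplified job table)
    current  -- number of the current job, marked '+' by the jobs command
    previous -- number of the previous job, marked '-'
    """
    if spec is None:
        # No argument: the default is the current job.
        if current is None:
            raise JobError("no current job")
        return current
    if not spec.startswith("%") or spec == "%":
        raise JobError("not a job name handled by this sketch: " + spec)
    body = spec[1:]
    if body == "-":
        if previous is None:
            raise JobError("no previous job")
        return previous
    if body.isdigit():
        number = int(body)
        if number not in jobs:
            raise JobError("no such job: " + spec)
        return number
    if body.startswith("?"):
        # '%?string': the string may appear anywhere in the command text.
        matches = [n for n, cmd in jobs.items() if body[1:] in cmd]
    else:
        # '%pref': the command text must begin with the prefix.
        matches = [n for n, cmd in jobs.items() if cmd.startswith(body)]
    if len(matches) != 1:
        raise JobError(("ambiguous: " if matches else "no match: ") + spec)
    return matches[0]


if __name__ == "__main__":
    table = {1: "du > usage", 2: "ls -s | sort -n > myfile", 3: "mail bill"}
    for arg in (None, "%2", "%ls", "%?usage"):
        print(arg, "->", resolve_job(arg, table, current=3))
```

The prefix and ‘%?’ forms must select exactly one job; in this sketch an argument such as ‘%?>’, which occurs in two of the commands above, raises JobError rather than guessing.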
The jobs command types the table of jobs, giving the job number, commands and status (‘Stopped’ or ‘Running’) of each background or suspended job. With the ‘-l’ option the process numbers are also typed.

    % du > usage &
    [1] 3398
    % ls -s | sort -n > myfile &
    [2] 3405
    % mail bill
    ^Z
    Stopped
    % jobs
    [1]    Running              du > usage
    [2]    Running              ls -s | sort -n > myfile
    [3]  + Stopped              mail bill
    % fg %ls
    ls -s | sort -n > myfile
    % more myfile

The fg command runs a suspended or background job in the foreground. It is used to restart a previously suspended job or change a background job to run in the foreground (allowing signals or input from the terminal). In the above example we used fg to change the ‘ls’ job from the background to the foreground since we wanted to wait for it to finish before looking at its output file. The bg command runs a suspended job in the background. It is usually used after stopping the currently running foreground job with the STOP signal. The combination of the STOP signal and the bg command changes a foreground job into a background job. The stop command suspends a background job.

The kill command terminates a background or suspended job immediately. In addition to jobs, it may be given process numbers as arguments, as printed by ps. Thus, in the example above, the running du command could have been terminated by the command

    % kill %1
    [1]    Terminated           du > usage
    %

The notify command (not the variable mentioned earlier) indicates that the termination of a specific job should be reported at the time it finishes instead of waiting for the next prompt.

If a job running in the background tries to read input from the terminal it is automatically stopped. When such a job is then run in the foreground, input can be given to the job. If desired, the job can be run in the background again until it requests input again. This is illustrated in the following sequence where the ‘s’ command in the text editor might take a long time.
    % ed bigfile
    120000
    1,$s/thisword/thatword/
    ^Z
    Stopped
    % bg
    [1]    ed bigfile &
    %
     . . . some foreground commands
    [1]    Stopped (tty input)     ed bigfile
    % fg
    ed bigfile
    w
    120000
    q
    %

So after the ‘s’ command was issued, the ‘ed’ job was stopped with ^Z and then put in the background using bg. Some time later when the ‘s’ command was finished, ed tried to read another command and was stopped because jobs in the background cannot read from the terminal. The fg command returned the ‘ed’ job to the foreground where it could once again accept commands from the terminal.

The command

    stty tostop

causes all background jobs run on your terminal to stop when they are about to write output to the terminal. This prevents messages from background jobs from interrupting foreground job output and allows you to run a job in the background without losing terminal output. It also can be used for interactive programs that sometimes have long periods without interaction. Thus each time it outputs a prompt for more input it will stop before the prompt. It can then be run in the foreground using fg, more input can be given and, if necessary, stopped and returned to the background. This stty command might be a good thing to put in your .login file if you do not like output from background jobs interrupting your work. It also can reduce the need for redirecting the output of background jobs if the output is not very big:

    % stty tostop
    % wc hugefile &
    [1] 10387
    % ed text
     . . . some time later
    q
    [1]    Stopped (tty output)    wc hugefile
    % fg wc
    wc hugefile
      13371   30123  302577
    % stty -tostop

Thus after some time the ‘wc’ command, which counts the lines, words and characters in a file, had one line of output. When it tried to write this to the terminal it stopped. By restarting it in the foreground we allowed it to write on the terminal exactly when we were ready to look at its output.
Programs which attempt to change the mode of the terminal will also block, whether or not tostop is set, when they are not in the foreground, as it would be very unpleasant to have a background job change the state of the terminal.

Since the jobs command only prints jobs started in the currently executing shell, it knows nothing about background jobs started in other login sessions or within shell files. The ps command can be used in this case to find out about background jobs not started in the current shell.

2.7. Working Directories

As mentioned in section 1.6, the shell is always in a particular working directory. The ‘change directory’ command chdir (its short form cd may also be used) changes the working directory of the shell, that is, changes the directory you are located in.

It is useful to make a directory for each project you wish to work on and to place all files related to that project in that directory. The ‘make directory’ command, mkdir, creates a new directory. The pwd (‘print working directory’) command reports the absolute pathname of the working directory of the shell, that is, the directory you are located in. Thus in the example below:

    % pwd
    /usr/bill
    % mkdir newpaper
    % chdir newpaper
    % pwd
    /usr/bill/newpaper
    %

the user has created and moved to the directory newpaper where, for example, he might place a group of related files.

No matter where you have moved to in a directory hierarchy, you can return to your ‘home’ login directory by doing just

    cd

with no arguments. The name ‘..’ always means the directory above the current one in the hierarchy, thus

    cd ..

changes the shell's working directory to the one directly above the current one. The name ‘..’ can be used in any pathname, thus,

    cd ../programs

means change to the directory ‘programs’ contained in the directory above the current one.
If you have several directories for different projects under, say, your home directory, this shorthand notation permits you to switch easily between them.

The shell always remembers the pathname of its current working directory in the variable cwd. The shell can also be requested to remember the previous directory when you change to a new working directory. If the ‘push directory’ command pushd is used in place of the cd command, the shell saves the name of the current working directory on a directory stack before changing to the new one. You can see this list at any time by typing the ‘directories’ command dirs.

    % pushd newpaper/references
    ~/newpaper/references ~
    % pushd /usr/lib/tmac
    /usr/lib/tmac ~/newpaper/references ~
    % dirs
    /usr/lib/tmac ~/newpaper/references ~
    % popd
    ~/newpaper/references ~
    % popd
    %

The list is printed in a horizontal line, reading left to right, with a tilde (~) as shorthand for your home directory, in this case ‘/usr/bill’. The directory stack is printed whenever there is more than one entry on it and it changes. It is also printed by a dirs command. Dirs is usually faster and more informative than pwd since it shows the current working directory as well as any other directories remembered in the stack.

The pushd command with no argument alternates the current directory with the first directory in the list. The ‘pop directory’ popd command without an argument returns you to the directory you were in prior to the current one, discarding the previous current directory from the stack (forgetting it). Typing popd several times in a series takes you backward through the directories you had been in (changed to) with the pushd command. There are other options to pushd and popd to manipulate the contents of the directory stack and to change to directories not at the top of the stack; see the csh manual page for details.
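The behavior of the directory stack can be followed with a small model. The sketch below is in Python and is an illustration only: the class DirStack is invented for this purpose, it does not change any real working directory, and it assumes that directory names are handed to it already in the form in which dirs would print them (so ‘newpaper/references’ typed in the home directory is passed as ‘~/newpaper/references’).

```python
class DirStack:
    """Toy directory stack; entry 0 stands for the current working directory."""

    def __init__(self, start):
        self.stack = [start]

    def pushd(self, directory=None):
        if directory is None:
            # No argument: exchange the current directory with the next entry.
            if len(self.stack) < 2:
                raise IndexError("pushd: no other directory")
            self.stack[0], self.stack[1] = self.stack[1], self.stack[0]
        else:
            # The old current directory stays on the stack beneath the new one.
            self.stack.insert(0, directory)
        return self.dirs()

    def popd(self):
        # Discard the current directory and return to the one beneath it.
        if len(self.stack) < 2:
            raise IndexError("popd: directory stack empty")
        self.stack.pop(0)
        return self.dirs()

    def dirs(self):
        # Printed left to right, current directory first.
        return " ".join(self.stack)


if __name__ == "__main__":
    stack = DirStack("~")
    print(stack.pushd("~/newpaper/references"))
    print(stack.pushd("/usr/lib/tmac"))
    print(stack.pushd())   # swap the two top entries
    print(stack.pushd())   # and swap them back
    print(stack.popd())
    print(stack.popd())
```

Replaying the session shown earlier in this section through the model gives the same lists, except that the model always returns the list, whereas the shell prints it only when more than one entry remains.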
Since the shell remembers the working directory in which each job was started, it warns you when you might be confused by restarting a job in the foreground which has a different working directory than the current working directory of the shell. Thus if you start a background job, then change the shell's working directory and then cause the background job to run in the foreground, the shell warns you that the working directory of the currently running foreground job is different from that of the shell.

    % dirs -l
    /mnt/bill
    % cd myproject
    % dirs
    ~/myproject
    % ed prog.c
    1143
    ^Z
    Stopped
    % cd ..
    % ls
    myproject
    textfile
    % fg
    ed prog.c (wd: ~/myproject)

This way the shell warns you when there is an implied change of working directory, even though no cd command was issued. In the above example the ‘ed’ job was still in ‘/mnt/bill/myproject’ even though the shell had changed to ‘/mnt/bill’. A similar warning is given when such a foreground job terminates or is suspended (using the STOP signal) since the return to the shell again implies a change of working directory.

    % fg
    ed prog.c (wd: ~/myproject)
     . . . after some editing
    q
    (wd now: ~)
    %

These messages are sometimes confusing if you use programs that change their own working directories, since the shell only remembers which directory a job is started in, and assumes it stays there. The ‘-l’ option of jobs will type the working directory of suspended or background jobs when it is different from the current working directory of the shell.

2.8. Useful built-in commands

We now give a few of the useful built-in commands of the shell describing how they are used.

The alias command described above is used to assign new aliases and to show the existing aliases. With no arguments it prints the current aliases. It may also be given only one argument such as

    alias ls

to show the current alias for, e.g., ‘ls’.

The echo command prints its arguments.
It is often used in shell scripts or as an interactive command to see what filename expansions will produce.

The history command will show the contents of the history list. The numbers given with the history events can be used to reference previous events which are difficult to reference using the contextual mechanisms introduced above. There is also a shell variable called prompt. By placing a ‘!’ character in its value the shell will there substitute the number of the current command in the history list. You can use this number to refer to this command in a history substitution. Thus you could

    set prompt='\! % '

Note that the ‘!’ character had to be escaped here even within ‘'’ characters.

The limit command is used to restrict use of resources. With no arguments it prints the current limitations:

    cputime         unlimited
    filesize        unlimited
    datasize        5616 kbytes
    stacksize       512 kbytes
    coredumpsize    unlimited

Limits can be set, e.g.:

    limit coredumpsize 128k

Most reasonable units abbreviations will work; see the csh manual page for more details.

The logout command can be used to terminate a login shell which has ignoreeof set.

The rehash command causes the shell to recompute a table of where commands are located. This is necessary if you add a command to a directory in the current shell's search path and wish the shell to find it, since otherwise the hashing algorithm may tell the shell that the command wasn't in that directory when the hash table was computed.

The repeat command can be used to repeat a command several times. Thus to make 5 copies of the file one in the file five you could do

    repeat 5 cat one >> five

The setenv command can be used to set variables in the environment. Thus

    setenv TERM adm3a

will set the value of the environment variable TERM to ‘adm3a’. A user program printenv exists which will print out the environment. It might then show:

    % printenv
    HOME=/usr/bill
    SHELL=/bin/csh
    PATH=:/usr/ucb:/bin:/usr/bin:/usr/local
    TERM=adm3a
    USER=bill
    %

The source command can be used to force the current shell to read commands from a file. Thus

    source .cshrc

can be used after editing in a change to the .cshrc file which you wish to take effect before the next time you login.

The time command can be used to cause a command to be timed no matter how much CPU time it takes. Thus

    % time cp /etc/rc /usr/bill/rc
    0.0u 0.1s 0:01 8% 2+1k 3+2io 1pf+0w
    % time wc /etc/rc /usr/bill/rc
         52    178   1347 /etc/rc
         52    178   1347 /usr/bill/rc
        104    356   2694 total
    0.1u 0.1s 0:00 13% 3+3k 5+3io 7pf+0w
    %

indicates that the cp command used a negligible amount of user time (u) and about 1/10th of a second of system time (s); the elapsed time was 1 second (0:01), there was an average memory usage of 2k bytes of program space and 1k bytes of data space over the cpu time involved (2+1k); the program did three disk reads and two disk writes (3+2io), and took one page fault and was not swapped (1pf+0w). The word count command wc on the other hand used 0.1 seconds of user time and 0.1 seconds of system time in less than a second of elapsed time. The percentage ‘13%’ indicates that over the period when it was active the command ‘wc’ used an average of 13 percent of the available CPU cycles of the machine.

The unalias and unset commands can be used to remove aliases and variable definitions from the shell, and unsetenv removes variables from the environment.

2.9. What else?

This concludes the basic discussion of the shell for terminal users. There are more features of the shell to be discussed here, and all features of the shell are discussed in its manual pages. One useful feature which is discussed later is the foreach built-in command which can be used to run the same command sequence with a number of different arguments.
If you intend to use UNIX a lot you should look through the rest of this document and the shell manual pages to become familiar with the other facilities which are available to you.

3. Shell control structures and command scripts

3.1. Introduction

It is possible to place commands in files and to cause shells to be invoked to read and execute commands from these files, which are called shell scripts. We here detail those features of the shell useful to the writers of such scripts.

3.2. Make

It is important to first note what shell scripts are not useful for. There is a program called make which is very useful for maintaining a group of related files or performing sets of operations on related files. For instance a large program consisting of one or more files can have its dependencies described in a makefile which contains definitions of the commands used to create these different files when changes occur. Definitions of the means for printing listings, cleaning up the directory in which the files reside, and installing the resultant programs are easily, and most appropriately, placed in this makefile. This format is superior and preferable to maintaining a group of shell procedures to maintain these files.

Similarly when working on a document a makefile may be created which defines how different versions of the document are to be created and which options of nroff or troff are appropriate.

3.3. Invocation and the argv variable

A csh command script may be interpreted by saying

    % csh script ...

where script is the name of the file containing a group of csh commands and ‘...’ is replaced by a sequence of arguments. The shell places these arguments in the variable argv and then begins to read commands from the script. These parameters are then available through the same mechanisms which are used to reference any other shell variables.
If you make the file ‘script’ executable by doing

    chmod 755 script

and place a shell comment at the beginning of the shell script (i.e. begin the file with a ‘#’ character) then a ‘/bin/csh’ will automatically be invoked to execute ‘script’ when you type

    script

If the file does not begin with a ‘#’ then the standard shell ‘/bin/sh’ will be used to execute it. This allows you to convert your older shell scripts to use csh at your convenience.

3.4. Variable substitution

After each input line is broken into words and history substitutions are done on it, the input line is parsed into distinct commands. Before each command is executed a mechanism known as variable substitution is done on these words. Keyed by the character ‘$’ this substitution replaces the names of variables by their values. Thus

    echo $argv

when placed in a command script would cause the current value of the variable argv to be echoed to the output of the shell script. It is an error for argv to be unset at this point.

A number of notations are provided for accessing components and attributes of variables. The notation

    $?name

expands to ‘1’ if name is set or to ‘0’ if name is not set. It is the fundamental mechanism used for checking whether particular variables have been assigned values. All other forms of reference to undefined variables cause errors.

The notation

    $#name

expands to the number of elements in the variable name. Thus

    % set argv=(a b c)
    % echo $?argv
    1
    % echo $#argv
    3
    % unset argv
    % echo $?argv
    0
    % echo $argv
    Undefined variable: argv.
    %

It is also possible to access the components of a variable which has several values. Thus

    $argv[1]

gives the first component of argv or in the example above ‘a’. Similarly

    $argv[$#argv]

would give ‘c’, and

    $argv[1-2]

would give ‘a b’.
Other notations useful in shell scripts are

    $n

where n is an integer as a shorthand for

    $argv[n]

the nth parameter and

    $*

which is a shorthand for

    $argv

The form

    $$

expands to the process number of the current shell. Since this process number is unique in the system it can be used in generation of unique temporary file names. The form

    $<

is quite special and is replaced by the next line of input read from the shell’s standard input (not the script it is reading). This is useful for writing shell scripts that are interactive, reading commands from the terminal, or even writing a shell script that acts as a filter, reading lines from its input file. Thus the sequence

    echo 'yes or no?\c'
    set a=($<)

would write out the prompt ‘yes or no?’ without a newline and then read the answer into the variable ‘a’. In this case ‘$#a’ would be ‘0’ if either a blank line or end-of-file (^D) was typed.

One minor difference between ‘$n’ and ‘$argv[n]’ should be noted here. The form ‘$argv[n]’ will yield an error if n is not in the range ‘1-$#argv’ while ‘$n’ will never yield an out of range subscript error. This is for compatibility with the way older shells handled parameters.

Another important point is that it is never an error to give a subrange of the form ‘n-’; if there are fewer than n components of the given variable then no words are substituted. A range of the form ‘m-n’ likewise returns an empty vector without giving an error when m exceeds the number of elements of the given variable, provided the subscript n is in range.

3.5. Expressions

In order for interesting shell scripts to be constructed it must be possible to evaluate expressions in the shell based on the values of variables. In fact, all the arithmetic operations of the language C are available in the shell with the same precedence that they have in C.
In particular, the operations ‘==’ and ‘!=’ compare strings and the operators ‘&&’ and ‘||’ implement the boolean and/or operations. The special operators ‘=~’ and ‘!~’ are similar to ‘==’ and ‘!=’ except that the string on the right side can have pattern matching characters (like *, ? or []) and the test is whether the string on the left matches the pattern on the right.

The shell also allows file enquiries of the form

    -? filename

where ‘?’ is replaced by one of a number of single characters. For instance the expression primitive

    -e filename

tells whether the file ‘filename’ exists. Other primitives test for read, write and execute access to the file, whether it is a directory, or has non-zero length.

It is possible to test whether a command terminates normally, by a primitive of the form ‘{ command }’ which returns true, i.e. ‘1’ if the command succeeds exiting normally with exit status 0, or ‘0’ if the command terminates abnormally or with exit status non-zero. If more detailed information about the execution status of a command is required, it can be executed and the variable ‘$status’ examined in the next command. Since ‘$status’ is set by every command, it is very transient. It can be saved if it is inconvenient to use it only in the single immediately following command.

For a full list of expression components available see the manual section for the shell.

3.6. Sample shell script

A sample shell script which makes use of the expression mechanism of the shell and some of its control structure follows:

    % cat copyc
    #
    # Copyc copies those C programs in the specified list
    # to the directory ~/backup if they differ from the files
    # already in ~/backup
    #
    set noglob
    foreach i ($argv)
            if ($i !~ *.c) continue  # not a .c file so do nothing
            if (! -r ~/backup/$i:t) then
                    echo $i:t not in backup... not cp\'ed
                    continue
            endif
            cmp -s $i ~/backup/$i:t  # to set $status
            if ($status != 0) then
                    echo new backup of $i
                    cp $i ~/backup/$i:t
            endif
    end

This script makes use of the foreach command, which causes the shell to execute the commands between the foreach and the matching end for each of the values given between ‘(’ and ‘)’ with the named variable, in this case ‘i’, set to successive values in the list. Within this loop we may use the command break to stop executing the loop and continue to prematurely terminate one iteration and begin the next. After the foreach loop the iteration variable (i in this case) has the value at the last iteration.

We set the variable noglob here to prevent filename expansion of the members of argv. This is a good idea, in general, if the arguments to a shell script are filenames which have already been expanded or if the arguments may contain filename expansion metacharacters. It is also possible to quote each use of a ‘$’ variable expansion, but this is harder and less reliable.

The other control construct used here is a statement of the form

    if ( expression ) then
            command
    endif

The placement of the keywords here is not flexible due to the current implementation of the shell. The following two formats are not currently acceptable to the shell:

    if ( expression )       # Won't work!
    then
            command
    endif

and

    if ( expression ) then command endif    # Won't work

The shell does have another form of the if statement of the form

    if ( expression ) command

which can be written

    if ( expression ) \
            command

Here we have escaped the newline for the sake of appearance. The command must not involve ‘|’, ‘&’ or ‘;’ and must not be another control command. The second form requires the final ‘\’ to immediately precede the end-of-line.
The more general if statements above also admit a sequence of else-if pairs followed by a single else and an endif, e.g.:

    if ( expression ) then
            commands
    else if ( expression ) then
            commands
    else
            commands
    endif

Another important mechanism used in shell scripts is the ‘:’ modifier. We can use the modifier ‘:r’ here to extract a root of a filename or ‘:e’ to extract the extension. Thus if the variable i has the value ‘/mnt/foo.bar’ then

    % echo $i $i:r $i:e
    /mnt/foo.bar /mnt/foo bar
    %

shows how the ‘:r’ modifier strips off the trailing ‘.bar’ and the ‘:e’ modifier leaves only the ‘bar’. Other modifiers will take off the last component of a pathname leaving the head ‘:h’ or all but the last component of a pathname leaving the tail ‘:t’. These modifiers are fully described in the csh manual pages in the programmers manual. It is also possible to use the command substitution mechanism described in the next major section to perform modifications on strings to then reenter the shell’s environment. Since each usage of this mechanism involves the creation of a new process, it is much more expensive to use than the ‘:’ modification mechanism. It is also important to note that the current implementation of the shell limits the number of ‘:’ modifiers on a ‘$’ substitution to 1. Thus

    % echo $i $i:h:t
    /a/b/c /a/b:t
    %

does not do what one would expect.

Finally, we note that the character ‘#’ lexically introduces a shell comment in shell scripts (but not from the terminal). All subsequent characters on the input line after a ‘#’ are discarded by the shell. This character can be quoted using ‘'’ or ‘\’ to place it in an argument word.

3.7. Other control structures

The shell also has control structures while and switch similar to those of C. These take the forms

    while ( expression )
            commands
    end

and

    switch ( word )

    case str1:
            commands
            breaksw

     ...

    case strn:
            commands
            breaksw

    default:
            commands
            breaksw

    endsw

For details see the manual section for csh. C programmers should note that we use breaksw to exit from a switch while break exits a while or foreach loop. A common mistake to make in csh scripts is to use break rather than breaksw in switches.

Finally, csh allows a goto statement, with labels looking like they do in C, i.e.:

    loop:
            commands
            goto loop

3.8. Supplying input to commands

Commands run from shell scripts receive by default the standard input of the shell which is running the script. This is different from previous shells running under UNIX. It allows shell scripts to fully participate in pipelines, but mandates extra notation for commands which are to take inline data.

Thus we need a metanotation for supplying inline data to commands in shell scripts. As an example, consider this script which runs the editor to delete leading blanks from the lines in each argument file

    % cat deblank
    # deblank -- remove leading blanks
    foreach i ($argv)
    ed - $i << 'EOF'
    1,$s/^[ ]*//
    w
    q
    'EOF'
    end
    %

The notation ‘<< 'EOF'’ means that the standard input for the ed command is to come from the text in the shell script file up to the next line consisting of exactly ‘'EOF'’. The fact that the ‘EOF’ is enclosed in ‘'’ characters, i.e. quoted, causes the shell to not perform variable substitution on the intervening lines. In general, if any part of the word following the ‘<<’ which the shell uses to terminate the text to be given to the command is quoted then these substitutions will not be performed. In this case since we used the form ‘1,$’ in our editor script we needed to ensure that this ‘$’ was not variable substituted.
We could also have ensured this by preceding the ‘$’ here with a ‘\’, i.e.:

    1,\$s/^[ ]*//

but quoting the ‘EOF’ terminator is a more reliable way of achieving the same thing.

3.9. Catching interrupts

If our shell script creates temporary files, we may wish to catch interruptions of the shell script so that we can clean up these files. We can then do

    onintr label

where label is a label in our program. If an interrupt is received the shell will do a ‘goto label’ and we can remove the temporary files and then do an exit command (which is built in to the shell) to exit from the shell script. If we wish to exit with a non-zero status we can do

    exit(1)

e.g. to exit with status ‘1’.

3.10. What else?

There are other features of the shell useful to writers of shell procedures. The verbose and echo options and the related -v and -x command line options can be used to help trace the actions of the shell. The -n option causes the shell only to read commands and not to execute them and may sometimes be of use.

One other thing to note is that csh will not execute shell scripts which do not begin with the character ‘#’, that is shell scripts that do not begin with a comment. Similarly, the ‘/bin/sh’ on your system may well defer to ‘csh’ to interpret shell scripts which begin with ‘#’. This allows shell scripts for both shells to live in harmony.

There is also another quotation mechanism using ‘"’ which allows only some of the expansion mechanisms we have so far discussed to occur on the quoted string and serves to make this string into a single word as ‘'’ does.

4. Other, less commonly used, shell features

4.1. Loops at the terminal; variables as vectors

It is occasionally useful to use the foreach control structure at the terminal to aid in performing a number of similar commands. For instance, there were at one point three shells in use on the Cory UNIX system at Cory Hall, ‘/bin/sh’, ‘/bin/nsh’, and ‘/bin/csh’.
To count the number of persons using each shell one could have issued the commands

    % grep -c csh$ /etc/passwd
    27
    % grep -c nsh$ /etc/passwd
    128
    % grep -c -v sh$ /etc/passwd
    430
    %

Since these commands are very similar we can use foreach to do this more easily.

    % foreach i ('csh$' 'nsh$' '-v sh$')
    ? grep -c $i /etc/passwd
    ? end
    27
    128
    430
    %

Note here that the shell prompts for input with ‘? ’ when reading the body of the loop.

Very useful with loops are variables which contain lists of filenames or other words. You can, for example, do

    % set a=(`ls`)
    % echo $a
    csh.n csh.rm
    % ls
    csh.n
    csh.rm
    % echo $#a
    2
    %

The set command here gave the variable a a list of all the filenames in the current directory as value. We can then iterate over these names to perform any chosen function.

The output of a command within ‘`’ characters is converted by the shell to a list of words. You can also place the ‘`’ quoted string within ‘"’ characters to take each (non-empty) line as a component of the variable, preventing the lines from being split into words at blanks and tabs. A modifier ‘:x’ exists which can be used later to expand each component of the variable into another variable, splitting it into separate words at embedded blanks and tabs.

4.2. Braces { ... } in argument expansion

Another form of filename expansion, alluded to before, involves the characters ‘{’ and ‘}’. These characters specify that the contained strings, separated by ‘,’, are to be consecutively substituted into the containing characters and the results expanded left to right. Thus

    A{str1,str2,...strn}B

expands to

    Astr1B Astr2B ... AstrnB

This expansion occurs before the other filename expansions, and may be applied recursively (i.e. nested). The results of each expanded string are sorted separately, left to right order being preserved. The resulting filenames are not required to exist if no other expansion mechanisms are used.
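Since the braces are expanded before any command runs, echo offers a harmless way to preview an expansion. A brief sketch, in which every name is invented for the illustration:

```csh
# Preview brace expansions with echo before using them in a real command.
echo A{str1,str2,str3}B
# Two groups in one word: every combination is generated, left to right.
echo src/{main,util}.{c,h}
# A nested group is expanded inside the alternative that contains it.
echo /tmp/{a/{x,y},b}
```

The first command prints ‘Astr1B Astr2B Astr3B’ and the last prints ‘/tmp/a/x /tmp/a/y /tmp/b’. None of the names printed has to exist, since no other expansion characters are involved.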
This means that this mechanism can be used to generate arguments which are not filenames, but which have common parts.

A typical use of this would be

    mkdir ~/{hdrs,retrofit,csh}

to make subdirectories ‘hdrs’, ‘retrofit’ and ‘csh’ in your home directory. This mechanism is most useful when the common prefix is longer than in this example, i.e.

    chown root /usr/{ucb/{ex,edit},lib/{ex?.?*,how_ex}}

4.3. Command substitution

A command enclosed in ‘`’ characters is replaced, just before filenames are expanded, by the output from that command. Thus it is possible to do

    set pwd=`pwd`

to save the current directory in the variable pwd or to do

    ex `grep -l TRACE *.c`

to run the editor ex supplying as arguments those files whose names end in ‘.c’ which have the string ‘TRACE’ in them. Command expansion also occurs in input redirected with ‘<<’ and within ‘"’ quotations. Refer to the shell manual section for full details.

4.4. Other details not covered here

In particular circumstances it may be necessary to know the exact nature and order of different substitutions performed by the shell. The exact meaning of certain combinations of quotations is also occasionally important. These are detailed fully in its manual section.

The shell has a number of command line option flags mostly of use in writing UNIX programs, and debugging shell scripts. See the shell’s manual section for a list of these options.

Appendix - Special characters

The following table lists the special characters of csh and the UNIX system, giving for each the section(s) in which it is discussed. A number of these characters also have special meaning in expressions. See the csh manual section for a complete list.

Syntactic metacharacters

    ;      2.4       separates commands to be executed sequentially
    |      1.5       separates commands in a pipeline
    ( )    2.2,3.6   brackets expressions and variable values
    &      2.5       follows commands to be executed without waiting for completion

Filename metacharacters

    /      1.6       separates components of a file’s pathname
    ?      1.6       expansion character matching any single character
    *      1.6       expansion character matching any sequence of characters
    [ ]    1.6       expansion sequence matching any single character from a set
    ~      1.6       used at the beginning of a filename to indicate home directories
    { }    4.2       used to specify groups of arguments with common parts

Quotation metacharacters

    \      1.7       prevents meta-meaning of following single character
    '      1.7       prevents meta-meaning of a group of characters
    "      4.3       like ', but allows variable and command expansion

Input/output metacharacters

    <      1.5       indicates redirected input
    >      1.3       indicates redirected output

Expansion/substitution metacharacters

    $      3.4       indicates variable substitution
    !      2.3       indicates history substitution
    :      3.6       precedes substitution modifiers
    ^      2.3       used in special forms of history substitution
    `      4.3       indicates command substitution

Other metacharacters

    #      1.3,3.6   begins scratch file names; indicates shell comments
    -      1.2       prefixes option (flag) arguments to commands
    %      2.6       prefixes job name specifications

Glossary

This glossary lists the most important terms introduced in the introduction to the shell and gives references to sections of the shell document for further information about them. References of the form ‘pr (1)’ indicate that the command pr is in the UNIX programmer’s manual in section 1. You can get an online copy of its manual page by doing

    man 1 pr

References of the form (2.5) indicate that more information can be found in section 2.5 of this manual.

.
    Your current directory has the name ‘.’ as well as the name printed by the command pwd; see also dirs.
    The current directory ‘.’ is usually the first component of the search path contained in the variable path, thus commands which are in ‘.’ are found first (2.2). The character ‘.’ is also used in separating components of filenames (1.6). The character ‘.’ at the beginning of a component of a pathname is treated specially and not matched by the filename expansion metacharacters ‘?’, ‘*’, and ‘[’ ‘]’ pairs (1.6).

..
    Each directory has a file ‘..’ in it which is a reference to its parent directory. After changing into the directory with chdir, i.e.

        chdir paper

    you can return to the parent directory by doing

        chdir ..

    The current directory is printed by pwd (2.7).

a.out
    Compilers which create executable images create them, by default, in the file a.out for historical reasons (2.3).

absolute pathname
    A pathname which begins with a ‘/’ is absolute since it specifies the path of directories from the beginning of the entire directory system, called the root directory. Pathnames which are not absolute are called relative (see definition of relative pathname) (1.6).

alias
    An alias specifies a shorter or different name for a UNIX command, or a transformation on a command to be performed in the shell. The shell has a command alias which establishes aliases and can print their current values. The command unalias is used to remove aliases (2.4).

argument
    Commands in UNIX receive a list of argument words. Thus the command

        echo a b c

    consists of the command name ‘echo’ and three argument words ‘a’, ‘b’ and ‘c’. The set of arguments after the command name is said to be the argument list of the command (1.1).

argv
    The list of arguments to a command written in the shell language (a shell script or shell procedure) is stored in a variable called argv within the shell. This name is taken from the conventional name in the C programming language (3.4).

background
    Commands started without waiting for them to complete are called background commands (2.6).

base
    A filename is sometimes thought of as consisting of a base part, before any ‘.’ character, and an extension, the part after the ‘.’. See filename and extension (1.6).

bg
    The bg command causes a suspended job to continue execution in the background (2.6).

bin
    A directory containing binaries of programs and shell scripts to be executed is typically called a bin directory. The standard system bin directories are ‘/bin’ containing the most heavily used commands and ‘/usr/bin’ which contains most other user programs. Programs developed at UC Berkeley live in ‘/usr/ucb’, while locally written programs live in ‘/usr/local’. Games are kept in the directory ‘/usr/games’. You can place binaries in any directory. If you wish to execute them often, the name of the directories should be a component of the variable path.

break
    Break is a builtin command used to exit from loops within the control structure of the shell (3.7).

breaksw
    The breaksw builtin command is used to exit from a switch control structure, like a break exits from loops (3.7).

builtin
    A command executed directly by the shell is called a builtin command. Most commands in UNIX are not built into the shell, but rather exist as files in bin directories. These commands are accessible because the directories in which they reside are named in the path variable.

case
    A case command is used as a label in a switch statement in the shell’s control structure, similar to that of the language C. Details are given in the shell documentation ‘csh(1)’ (3.7).

cat
    The cat program catenates a list of specified files on the standard output. It is usually used to look at the contents of a single file on the terminal, to ‘cat a file’ (1.8, 2.3).

cd
    The cd command is used to change the working directory. With no arguments, cd changes your working directory to be your home directory (2.4, 2.7).

chdir
    The chdir command is a synonym for cd. Cd is usually used because it is easier to type.

chsh
    The chsh command is used to change the shell which you use on UNIX. By default, you use a different version of the shell which resides in ‘/bin/sh’. You can change your shell to ‘/bin/csh’ by doing

        chsh your-login-name /bin/csh

    Thus I would do

        chsh bill /bin/csh

    It is only necessary to do this once. The next time you log in to UNIX after doing this command, you will be using csh rather than the shell in ‘/bin/sh’ (1.9).

cmp
    Cmp is a program which compares files. It is usually used on binary files, or to see if two files are identical (3.6). For comparing text files the program diff, described in ‘diff (1)’, is used.

command
    A function performed by the system, either by the shell (a builtin command) or by a program residing in a file in a directory within the UNIX system, is called a command (1.1).

command name
    When a command is issued, it consists of a command name, which is the first word of the command, followed by arguments. The convention on UNIX is that the first word of a command names the function to be performed (1.1).

command substitution
    The replacement of a command enclosed in ‘`’ characters by the text output by that command is called command substitution (4.3).

component
    A part of a pathname between ‘/’ characters is called a component of that pathname. A variable which has multiple strings as value is said to have several components; each string is a component of the variable.

continue
    A builtin command which causes execution of the enclosing foreach or while loop to cycle prematurely. Similar to the continue command in the programming language C (3.6).

control-
    Certain special characters, called control characters, are produced by holding down the CONTROL key on your terminal and simultaneously pressing another character, much like the SHIFT key is used to produce upper case characters. Thus control-c is produced by holding down the CONTROL key while pressing the ‘c’ key.
    Usually UNIX prints an up-arrow (^) followed by the corresponding letter when you type a control character (e.g. ‘^C’ for control-c) (1.8).

core dump
    When a program terminates abnormally, the system places an image of its current state in a file named ‘core’. This core dump can be examined with the system debugger ‘adb(1)’ or ‘sdb(1)’ in order to determine what went wrong with the program (1.8). If the shell produces a message of the form

        Illegal instruction (core dumped)

    (where ‘Illegal instruction’ is only one of several possible messages), you should report this to the author of the program or a system administrator, saving the ‘core’ file.

cp
    The cp (copy) program is used to copy the contents of one file into another file. It is one of the most commonly used UNIX commands (1.6).

csh
    The name of the shell program that this document describes.

.cshrc
    The file .cshrc in your home directory is read by each shell as it begins execution. It is usually used to change the setting of the variable path and to set alias parameters which are to take effect globally (2.1).

cwd
    The cwd variable in the shell holds the absolute pathname of the current working directory. It is changed by the shell whenever your current working directory changes and should not be changed otherwise (2.2).

date
    The date command prints the current date and time (1.3).

debugging
    Debugging is the process of correcting mistakes in programs and shell scripts. The shell has several options and variables which may be used to aid in shell debugging (4.4).

default:
    The label default: is used within shell switch statements, as it is in the C language, to label the code to be executed if none of the case labels matches the value switched on (3.7).

DELETE
    The DELETE or RUBOUT key on the terminal normally causes an interrupt to be sent to the current job. Many users change the interrupt character to be ^C.

detached
    A command that continues running in the background after you logout is said to be detached.
diagnostic
An error message produced by a program is often referred to as a diagnostic. Most error messages are not written to the standard output, since that is often directed away from the terminal (1.3, 1.5). Error messages are instead written to the diagnostic output which may be directed away from the terminal, but usually is not. Thus diagnostics will usually appear on the terminal (2.5).

directory
A structure which contains files. At any time you are in one particular directory whose name can be printed by the command pwd. The chdir command will change you to another directory, and make the files in that directory visible. The directory in which you are when you first log in is your home directory (1.1, 2.7).

directory stack
The shell saves the names of previous working directories in the directory stack when you change your current working directory via the pushd command. The directory stack can be printed by using the dirs command, which includes your current working directory as the first directory name on the left (2.7).

dirs
The dirs command prints the shell’s directory stack (2.7).

du
The du command is a program (described in ‘du(1)’) which prints the number of disk blocks in all directories below and including your current working directory (2.6).

echo
The echo command prints its arguments (1.6, 3.6).

else
The else command is part of the ‘if-then-else-endif’ control command construct (3.6).

endif
If an if statement is ended with the word then, all lines following the if up to a line starting with the word endif or else are executed if the condition between parentheses after the if is true (3.6).

EOF
An end-of-file is generated by the terminal by a control-d, and whenever a command reads to the end of a file which it has been given as input. Commands receiving input from a pipe receive an end-of-file when the command sending them input completes. Most commands terminate when they receive an end-of-file. The shell has an option to ignore end-of-file from a terminal input, which may help you keep from logging out accidentally by typing too many control-d’s (1.1, 1.8, 3.8).

escape
A character ‘\’ used to prevent the special meaning of a metacharacter is said to escape the character from its special meaning. Thus

    echo \*

will echo the character ‘*’ while just

    echo *

will echo the names of the files in the current directory. In this example, \ escapes ‘*’ (1.7). There is also a non-printing character called escape, usually labelled ESC or ALTMODE on terminal keyboards. Some older UNIX systems use this character to indicate that output is to be suspended. Most systems use control-s to stop the output and control-q to start it.

/etc/passwd
This file contains information about the accounts currently on the system. It consists of a line for each account with fields separated by ‘:’ characters (1.8). You can look at this file by saying

    cat /etc/passwd

The commands finger and grep are often used to search for information in this file. See ‘finger(1)’, ‘passwd(5)’, and ‘grep(1)’ for more details.

exit
The exit command is used to force termination of a shell script, and is built into the shell (3.9).

exit status
A command which discovers a problem may reflect this back to the command (such as a shell) which invoked (executed) it. It does this by returning a non-zero number as its exit status, a status of zero being considered ‘normal termination’. The exit command can be used to force a shell command script to give a non-zero exit status (3.6).

expansion
The replacement of strings in the shell input which contain metacharacters by other strings is referred to as the process of expansion. Thus the replacement of the word ‘*’ by a sorted list of files in the current directory is a ‘filename expansion’. Similarly the replacement of the characters ‘!!’ by the text of the last command is a ‘history expansion’.
Expansions are also referred to as substitutions (1.6, 3.4, 4.2).

expressions
Expressions are used in the shell to control the conditional structures used in the writing of shell scripts and in calculating values for these scripts. The operators available in shell expressions are those of the language C (3.5).

extension
Filenames often consist of a base name and an extension separated by the character ‘.’. By convention, groups of related files often share the same root name. Thus if ‘prog.c’ were a C program, then the object file for this program would be stored in ‘prog.o’. Similarly a paper written with the ‘-me’ nroff macro package might be stored in ‘paper.me’ while a formatted version of this paper might be kept in ‘paper.out’ and a list of spelling errors in ‘paper.errs’ (1.6).

fg
The job control command fg is used to run a background or suspended job in the foreground (1.8, 2.6).

filename
Each file in UNIX has a name consisting of up to 14 characters and not including the character ‘/’ which is used in pathname building. Most filenames do not begin with the character ‘.’, and contain only letters and digits with perhaps a ‘.’ separating the base portion of the filename from an extension (1.6).

filename expansion
Filename expansion uses the metacharacters ‘*’, ‘?’ and ‘[’ and ‘]’ to provide a convenient mechanism for naming files. Using filename expansion it is easy to name all the files in the current directory, or all files which have a common root name. Other filename expansion mechanisms use the metacharacter ‘~’ and allow files in other users’ directories to be named easily (1.6, 4.2).

flag
Many UNIX commands accept arguments which are not the names of files or other users but are used to modify the action of the commands. These are referred to as flag options, and by convention consist of one or more letters preceded by the character ‘-’ (1.2). Thus the ls (list files) command has an option ‘-s’ to list the sizes of files.
This is specified

    ls -s

foreach
The foreach command is used in shell scripts and at the terminal to specify repetition of a sequence of commands while the value of a certain shell variable ranges through a specified list (3.6, 4.1).

foreground
When commands are executing in the normal way such that the shell is waiting for them to finish before prompting for another command they are said to be foreground jobs or running in the foreground. This is as opposed to background. Foreground jobs can be stopped by signals from the terminal caused by typing different control characters at the keyboard (1.8, 2.6).

goto
The shell has a command goto used in shell scripts to transfer control to a given label (3.7).

grep
The grep command searches through a list of argument files for a specified string. Thus

    grep bill /etc/passwd

will print each line in the file /etc/passwd which contains the string ‘bill’. Actually, grep scans for regular expressions in the sense of the editors ‘ed(1)’ and ‘ex(1)’. Grep stands for ‘globally find regular expression and print’ (2.4).

head
The head command prints the first few lines of one or more files. If you have a bunch of files containing text which you are wondering about it is sometimes useful to run head with these files as arguments. This will usually show enough of what is in these files to let you decide which you are interested in (1.5). Head is also used to describe the part of a pathname before and including the last ‘/’ character. The tail of a pathname is the part after the last ‘/’. The ‘:h’ and ‘:t’ modifiers allow the head or tail of a pathname stored in a shell variable to be used (3.6).

history
The history mechanism of the shell allows previous commands to be repeated, possibly after modification to correct typing mistakes or to change the meaning of the command. The shell has a history list where these commands are kept, and a history variable which controls how large this list is (2.3).
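The head and tail of a pathname, as defined under head above, can be illustrated with a short Python sketch. This is not how the shell implements its modifiers; the function names and the sample pathname are assumptions made for illustration. The sketch follows the glossary's wording and keeps the final ‘/’ in the head, which the shell's own modifier may not do.

```python
def head(pathname: str) -> str:
    """Return the part of the pathname before and including the last '/'.

    A pathname with no '/' has an empty head in this sketch.
    """
    prefix, slash, _ = pathname.rpartition("/")
    return prefix + slash


def tail(pathname: str) -> str:
    """Return the part of the pathname after the last '/'."""
    return pathname.rpartition("/")[2]


if __name__ == "__main__":
    path = "/usr/bill/prog.c"   # hypothetical pathname
    print(head(path))
    print(tail(path))
```

Splitting on the last ‘/’ only is the design point: every earlier ‘/’ stays in the head, so the tail is always a single filename component.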
home directory
Each user has a home directory, which is given in your entry in the password file, /etc/passwd. This is the directory which you are placed in when you first log in. The cd or chdir command with no arguments takes you back to this directory, whose name is recorded in the shell variable home. You can also access the home directories of other users in forming filenames using a filename expansion notation and the character ‘~’ (1.6).

if
A conditional command within the shell, the if command is used in shell command scripts to make decisions about what course of action to take next (3.6).

ignoreeof
Normally, your shell will exit, printing ‘logout’ if you type a control-d at a prompt of ‘% ’. This is the way you usually log off the system. You can set the ignoreeof variable if you wish in your .login file and then use the command logout to log out. This is useful if you sometimes accidentally type too many control-d characters, logging yourself off (2.2).

input
Many commands on UNIX take information from the terminal or from files which they then act on. This information is called input. Commands normally read for input from their standard input which is, by default, the terminal. This standard input can be redirected from a file using a shell metanotation with the character ‘<’. Many commands will also read from a file specified as an argument. Commands placed in pipelines will read from the output of the previous command in the pipeline. The leftmost command in a pipeline reads from the terminal if you neither redirect its input nor give it a filename to use as standard input. Special mechanisms exist for supplying input to commands in shell scripts (1.5, 3.8).

interrupt
An interrupt is a signal to a program that is generated by hitting the RUBOUT or DELETE key (although users can and often do change the interrupt character, usually to ^C). It causes most programs to stop execution.
Certain programs, such as the shell and the editors, handle an interrupt in special ways, usually by stopping what they are doing and prompting for another command. While the shell is executing another command and waiting for it to finish, the shell does not listen to interrupts. The shell often wakes up when you hit interrupt because many commands die when they receive an interrupt (1.8, 3.9).

job
One or more commands typed on the same input line separated by ‘|’ or ‘;’ characters are run together and are called a job. Simple commands run by themselves without any ‘|’ or ‘;’ characters are the simplest jobs. Jobs are classified as foreground, background, or suspended (2.6).

job control
The builtin functions that control the execution of jobs are called job control commands. These are bg, fg, stop, and kill (2.6).

job number
When each job is started it is assigned a small number called a job number which is printed next to the job in the output of the jobs command. This number, preceded by a ‘%’ character, can be used as an argument to job control commands to indicate a specific job (2.6).

jobs
The jobs command prints a table showing jobs that are either running in the background or are suspended (2.6).

kill
A command which sends a signal to a job causing it to terminate (2.6).

.login
The file .login in your home directory is read by the shell each time you log in to UNIX and the commands there are executed. There are a number of commands which are usefully placed here, especially set commands to the shell itself (2.1).

login shell
The shell that is started on your terminal when you log in is called your login shell. It is different from other shells which you may run (e.g. on shell scripts) in that it reads the .login file before reading commands from the terminal and it reads the .logout file after you log out (2.1).

logout
The logout command causes a login shell to exit.
Normally, a login shell will exit when you hit control-d generating an end-of-file, but if you have set ignoreeof in your .login file then this will not work and you must use logout to log off the UNIX system (2.8).

.logout
When you log off UNIX the shell will execute commands from the file .logout in your home directory after it prints ‘logout’.

lpr
The command lpr is the line printer daemon. The standard input of lpr is spooled and printed on the UNIX line printer. You can also give lpr a list of filenames as arguments to be printed. It is most common to use lpr as the last component of a pipeline (2.3).

ls
The ls (list files) command is one of the most commonly used UNIX commands. With no argument filenames it prints the names of the files in the current directory. It has a number of useful flag arguments, and can also be given the names of directories as arguments, in which case it lists the names of the files in these directories (1.2).

mail
The mail program is used to send and receive messages from other UNIX users (1.1, 2.1).

make
The make command is used to maintain one or more related files and to organize functions to be performed on these files. In many ways make is easier to use, and more helpful than shell command scripts (3.2).

makefile
The file containing commands for make is called makefile (3.2).

manual
The manual often referred to is the ‘UNIX programmer’s manual’. It contains a number of sections and a description of each UNIX program. An online version of the manual is accessible through the man command. Its documentation can be obtained online via

    man man

metacharacter
Many characters which are neither letters nor digits have special meaning either to the shell or to UNIX. These characters are called metacharacters.
If it is necessary to place these characters in arguments to commands without them having their special meaning then they must be quoted. An example of a metacharacter is the character ‘>’ which is used to indicate placement of output into a file. For the purposes of the history mechanism, most unquoted metacharacters form separate words (1.4). The appendix to this user’s manual lists the metacharacters in groups by their function.

mkdir
The mkdir command is used to create a new directory.

modifier
Substitutions with the history mechanism, keyed by the character ‘!’, or of variables using the metacharacter ‘$’, are often subjected to modifications, indicated by placing the character ‘:’ after the substitution and following this with the modifier itself. The command substitution mechanism can also be used to perform modification in a similar way, but this notation is less clear (3.6).

more
The program more writes a file on your terminal allowing you to control how much text is displayed at a time. More can move through the file screenful by screenful, line by line, search forward for a string, or start again at the beginning of the file. It is generally the easiest way of viewing a file (1.8).

noclobber
The shell has a variable noclobber which may be set in the file .login to prevent accidental destruction of files by the ‘>’ output redirection metasyntax of the shell (2.2, 2.5).

noglob
The shell variable noglob is set to suppress the filename expansion of arguments containing the metacharacters ‘~’, ‘*’, ‘?’, ‘[’ and ‘]’ (3.6).

notify
The notify command tells the shell to report on the termination of a specific background job at the exact time it occurs as opposed to waiting until just before the next prompt to report the termination. The notify variable, if set, causes the shell to always report the termination of background jobs exactly when they occur (2.6).
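The effect of noglob on filename expansion can be sketched in Python. This is only an approximation of the shell's behavior, built on the standard fnmatch module; the function name expand, the noglob parameter, and the sample filenames are assumptions for illustration, and the ‘~’ notation is not modeled.

```python
import fnmatch


def expand(word, filenames, noglob=False):
    """Replace a word containing '*', '?' or '[...]' by the sorted
    list of matching filenames, unless noglob is set."""
    if noglob:
        # With noglob set, the word is passed through untouched.
        return [word]
    return sorted(name for name in filenames
                  if fnmatch.fnmatchcase(name, word))


if __name__ == "__main__":
    # Hypothetical directory contents, taken from the extension entry.
    names = ["prog.c", "prog.o", "paper.me", "paper.out"]
    print(expand("prog.*", names))
    print(expand("prog.*", names, noglob=True))
```

The sketch returns an empty list when nothing matches; what the shell itself does in that case is outside the scope of this example.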
onintr
The onintr command is built into the shell and is used to control the action of a shell command script when an interrupt signal is received (3.9).

output
Many commands in UNIX result in some lines of text which are called their output. This output is usually placed on what is known as the standard output which is normally connected to the user’s terminal. The shell has a syntax using the metacharacter ‘>’ for redirecting the standard output of a command to a file (1.3). Using the pipe mechanism and the metacharacter ‘|’ it is also possible for the standard output of one command to become the standard input of another command (1.5). Certain commands such as the line printer daemon lpr do not place their results on the standard output but rather in more useful places such as on the line printer (2.3). Similarly the write command places its output on another user’s terminal rather than its standard output (2.3). Commands also have a diagnostic output where they write their error messages. Normally these go to the terminal even if the standard output has been sent to a file or another command, but it is possible to direct error diagnostics along with standard output using a special metanotation (2.5).

pushd
The pushd command, which means ‘push directory’, changes the shell’s working directory and also remembers the current working directory before the change is made, allowing you to return to the same directory via the popd command later without retyping its name (2.7).

path
The shell has a variable path which gives the names of the directories in which it searches for the commands which it is given. It always checks first to see if the command it is given is built into the shell. If it is, then it need not search for the command as it can do it internally. If the command is not builtin, then the shell searches for a file with the name given in each of the directories in the path variable, left to right.
Since the normal definition of the path variable is

    path (. /usr/ucb /bin /usr/bin)

the shell normally looks in the current directory, and then in the standard system directories ‘/usr/ucb’, ‘/bin’ and ‘/usr/bin’ for the named command (2.2). If the command cannot be found the shell will print an error diagnostic. Scripts of shell commands will be executed using another shell to interpret them if they have ‘execute’ permission set. This is normally true because a command of the form

    chmod 755 script

was executed to turn this execute permission on (3.3). If you add new commands to a directory in the path, you should issue the command rehash (2.2).

pathname
A list of names, separated by ‘/’ characters, forms a pathname. Each component, between successive ‘/’ characters, names a directory in which the next component file resides. Pathnames which begin with the character ‘/’ are interpreted relative to the root directory in the filesystem. Other pathnames are interpreted relative to the current directory as reported by pwd. The last component of a pathname may name a directory, but usually names a file.

pipeline
A group of commands which are connected together, the standard output of each connected to the standard input of the next, is called a pipeline. The pipe mechanism used to connect these commands is indicated by the shell metacharacter ‘|’ (1.5, 2.3).

popd
The popd command changes the shell’s working directory to the directory you most recently left using the pushd command. It returns to the directory without having to type its name, forgetting the name of the current working directory before doing so (2.7).

port
The part of a computer system to which each terminal is connected is called a port. Usually the system has a fixed number of ports, some of which are connected to telephone lines for dial-up access, and some of which are permanently wired directly to specific terminals.
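The search order described under path above (builtins first, then each directory in the path from left to right) can be sketched in Python. This is an illustration only, not the shell's implementation, which also keeps an internal table rebuilt by rehash; the function name find_command and the set of builtin names passed to it are assumptions.

```python
import os
import tempfile


def find_command(name, path, builtins=()):
    """Return 'builtin' for a builtin command, otherwise the first
    executable file called `name` found in the path directories,
    searched left to right. Return None if nothing is found."""
    if name in builtins:
        return "builtin"
    for directory in path:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # Build a throwaway directory holding one executable script.
    with tempfile.TemporaryDirectory() as bindir:
        script = os.path.join(bindir, "script")
        open(script, "w").close()
        os.chmod(script, 0o755)   # same effect as: chmod 755 script
        print(find_command("script", [".", bindir]))
        print(find_command("cd", [".", bindir], builtins={"cd"}))
```

Because the loop stops at the first match, a command in an earlier directory hides one of the same name in a later directory, which is why the order of the path matters.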
pr
The pr command is used to prepare listings of the contents of files with headers giving the name of the file and the date and time at which the file was last modified (2.3).

printenv
The printenv command is used to print the current setting of variables in the environment (2.8).

process
An instance of a running program is called a process (2.6). UNIX assigns each process a unique number when it is started, called the process number. Process numbers can be used to stop individual processes using the kill or stop commands when the processes are part of a detached background job.

program
Usually synonymous with command; a binary file or shell command script which performs a useful function is often called a program.

programmer’s manual
Also referred to as the manual. See the glossary entry for ‘manual’.

prompt
Many programs will print a prompt on the terminal when they expect input. Thus the editor ‘ex(1)’ will print a ‘:’ when it expects input. The shell prompts for input with ‘% ’ and occasionally with ‘? ’ when reading commands from the terminal (1.1). The shell has a variable prompt which may be set to a different value to change the shell’s main prompt. This is mostly used when debugging the shell (2.8).

ps
The ps command is used to show the processes you are currently running. Each process is shown with its unique process number, an indication of the terminal name it is attached to, an indication of the state of the process (whether it is running, stopped, awaiting some event (sleeping), and whether it is swapped out), and the amount of CPU time it has used so far. The command is identified by printing some of the words used when it was invoked (2.6). Shells, such as the csh you use to run the ps command, are not normally shown in the output.

pwd
The pwd command prints the full pathname of the current working directory. The dirs builtin command is usually a better and faster choice.
quit
The quit signal, generated by a control-\, is used to terminate programs which are behaving unreasonably. It normally produces a core image file (1.8).

quotation
The process by which metacharacters are prevented their special meaning, usually by using the character ‘'’ in pairs, or by using the character ‘\’, is referred to as quotation (1.7).

redirection
The routing of input or output from or to a file is known as redirection of input or output (1.3).

rehash
The rehash command tells the shell to rebuild its internal table of which commands are found in which directories in your path. This is necessary when a new program is installed in one of these directories (2.8).

relative pathname
A pathname which does not begin with a ‘/’ is called a relative pathname since it is interpreted relative to the current working directory. The first component of such a pathname refers to some file or directory in the working directory, and subsequent components between ‘/’ characters refer to directories below the working directory. Pathnames that are not relative are called absolute pathnames (1.6).

repeat
The repeat command iterates another command a specified number of times.

root
The directory that is at the top of the entire directory structure is called the root directory since it is the ‘root’ of the entire tree structure of directories. The name used in pathnames to indicate the root is ‘/’. Pathnames starting with ‘/’ are said to be absolute since they start at the root directory. Root is also used as the part of a pathname that is left after removing the extension. See filename for a further explanation (1.6).

RUBOUT
The RUBOUT or DELETE key sends an interrupt to the current job. Most interactive commands return to their command level upon receipt of an interrupt, while non-interactive commands usually terminate, returning control to the shell. Users often change interrupt to be generated by ^C rather than DELETE by using the stty command.
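The distinction between relative and absolute pathnames, and the second meaning of root given above, can be made concrete with a short Python sketch using the standard posixpath module. The function names and the sample directories are assumptions chosen for illustration.

```python
import posixpath


def is_absolute(pathname):
    """A pathname is absolute when it begins with '/'."""
    return pathname.startswith("/")


def interpret(pathname, working_directory):
    """Return the absolute pathname that `pathname` names when the
    current working directory is `working_directory`."""
    if is_absolute(pathname):
        return pathname                      # starts at the root directory
    return posixpath.join(working_directory, pathname)


def root_and_extension(filename):
    """Split a filename into its root and its extension.
    The extension returned by splitext keeps its leading '.'."""
    return posixpath.splitext(filename)


if __name__ == "__main__":
    cwd = "/usr/bill"                        # hypothetical working directory
    print(interpret("src/prog.c", cwd))
    print(interpret("/etc/passwd", cwd))
    print(root_and_extension("prog.c"))
```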
scratch file
Files whose names begin with a ‘#’ are referred to as scratch files, since they are automatically removed by the system after a couple of days of non-use, or more frequently if disk space becomes tight (1.3).

script
Sequences of shell commands placed in a file are called shell command scripts. It is often possible to perform simple tasks using these scripts without writing a program in a language such as C, by using the shell to selectively run other programs (3.3, 3.10).

set
The builtin set command is used to assign new values to shell variables and to show the values of the current variables. Many shell variables have special meaning to the shell itself. Thus by using the set command the behavior of the shell can be affected (2.1).

setenv
Variables in the environment ‘environ(5)’ can be changed by using the setenv builtin command (2.8). The printenv command can be used to print the value of the variables in the environment.

shell
A shell is a command language interpreter. It is possible to write and run your own shell, as shells are no different than any other programs as far as the system is concerned. This manual deals with the details of one particular shell, called csh.

shell script
See script (3.3, 3.10).

signal
A signal in UNIX is a short message that is sent to a running program which causes something to happen to that process. Signals are sent either by typing special control characters on the keyboard or by using the kill or stop commands (1.8, 2.6).

sort
The sort program sorts a sequence of lines in ways that can be controlled by argument flags (1.5).

source
The source command causes the shell to read commands from a specified file. It is most useful for reading files such as .cshrc after changing them (2.8).

special character
See metacharacters and the appendix to this manual.

standard
We refer often to the standard input and standard output of commands. See input and output (1.3, 3.8).
status
A command normally returns a status when it finishes. By convention a status of zero indicates that the command succeeded. Commands may return non-zero status to indicate that some abnormal event has occurred. The shell variable status is set to the status returned by the last command. It is most useful in shell command scripts (3.6).

stop
The stop command causes a background job to become suspended (2.6).

string
A sequential group of characters taken together is called a string. Strings can contain any printable characters (2.2).

stty
The stty program changes certain parameters inside UNIX which determine how your terminal is handled. See ‘stty(1)’ for a complete description (2.6).

substitution
The shell implements a number of substitutions where sequences indicated by metacharacters are replaced by other sequences. Notable examples of this are history substitution keyed by the metacharacter ‘!’ and variable substitution indicated by ‘$’. We also refer to substitutions as expansions (3.4).

suspended
A job becomes suspended after a STOP signal is sent to it, either by typing a control-z at the terminal (for foreground jobs) or by using the stop command (for background jobs). When suspended, a job temporarily stops running until it is restarted by either the fg or bg command (2.6).

switch
The switch command of the shell allows the shell to select one of a number of sequences of commands based on an argument string. It is similar to the switch statement in the language C (3.7).

termination
When a command which is being executed finishes we say it undergoes termination or terminates. Commands normally terminate when they read an end-of-file from their standard input. It is also possible to terminate commands by sending them an interrupt or quit signal (1.8). The kill program terminates specified jobs (2.6).

then
The then command is part of the shell’s ‘if-then-else-endif’ control construct used in command scripts (3.6).
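The status convention described above (zero for success, non-zero for an abnormal event) can be observed from any program that starts another. The following Python sketch uses the standard subprocess module; the helper names are assumptions, and the child command is simply the Python interpreter asked to exit with a chosen status.

```python
import subprocess
import sys


def run_and_get_status(argv):
    """Run a command, wait for it to terminate, and return its status."""
    return subprocess.run(argv).returncode


def succeeded(status):
    """By convention, a status of zero indicates success."""
    return status == 0


if __name__ == "__main__":
    # A child process that deliberately returns a non-zero status.
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    status = run_and_get_status(failing)
    print(status, succeeded(status))
```

This mirrors what the shell does when it sets its status variable after each command: it records the number the command returned and leaves the interpretation to the script.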
time
The time command can be used to measure the amount of CPU and real time consumed by a specified command as well as the amount of disk I/O, memory utilized, and number of page faults and swaps taken by the command (2.1, 2.8).

tset
The tset program is used to set standard erase and kill characters and to tell the system what kind of terminal you are using. It is often invoked in a .login file (2.1).

tty
The word tty is a historical abbreviation for ‘teletype’ which is frequently used in UNIX to indicate the port to which a given terminal is connected. The tty command will print the name of the tty or port to which your terminal is presently connected.

unalias
The unalias command removes aliases (2.8).

UNIX
UNIX is an operating system on which csh runs. UNIX provides facilities which allow csh to invoke other programs such as editors and text formatters which you may wish to use.

unset
The unset command removes the definitions of shell variables (2.2, 2.8).

variable expansion
See variables and expansion (2.2, 3.4).

variables
Variables in csh hold one or more strings as value. The most common use of variables is in controlling the behavior of the shell. See path, noclobber, and ignoreeof for examples. Variables such as argv are also used in writing shell programs (shell command scripts) (2.2).

verbose
The verbose shell variable can be set to cause commands to be echoed after they are history expanded. This is often useful in debugging shell scripts. The verbose variable is set by the shell’s -v command line option (3.10).

wc
The wc program calculates the number of characters, words, and lines in the files whose names are given as arguments (2.6).

while
The while builtin control construct is used in shell command scripts (3.7).

word
A sequence of characters which forms an argument to a command is called a word.
Many characters which are neither letters, digits, ‘-’, ‘.’ nor ‘/’ form words all by themselves even if they are not surrounded by blanks. Any sequence of characters may be made into a word by surrounding it with ‘'’ characters except for the characters ‘'’ and ‘!’ which require special treatment (1.1). This process of placing special characters in words without their special meaning is called quoting.

working directory
At any given time you are in one particular directory, called your working directory. This directory’s name is printed by the pwd command and the files listed by ls are the ones in this directory. You can change working directories using chdir.

write
The write command is used to communicate with other users who are logged in to UNIX.

Introduction

PART 5: DOCUMENT PREPARATION

This part includes articles on the features and utilities of the ULTRIX-32 system that will help you to prepare written information for publication. Seven of the articles deal with nroff and troff, the text formatters that convert unformatted text into a formal document ready for output on a printer or typesetter. Nroff produces output printable on a typewriter-like terminal, line printer, or terminal screen. Troff prepares output for a phototypesetter. Five other articles explain the uses of eqn, tbl, and refer. These are utilities that cooperate with the text processors to produce mathematical equations, tables, and bibliographical references in the text formatted by nroff or troff. An additional article describes the style and diction programs, tools that provide criteria for evaluating written material.

Nroff and Troff

Formatting a document on the ULTRIX-32 system is a two-stage process. In stage one, you create or change a file using vi or one of the other editors. This file should contain the text to be processed and commands to the text formatter.
The commands tell the formatter how to treat the text, for example how wide to make the margins, when to start a new paragraph, and when to leave the right margin unjustified. In stage two, you give a command to the shell telling nroff or troff to process the text in the file you created. Nroff and troff are compatible, so that one text file can generally serve as a source for both line printer output and typesetter output.

The text processors allow you to define exactly how you want your text to look. However, developing a format that is consistent throughout a document involves repeating many details (consider page headers and multicolumn formats, for example). ULTRIX-32 includes two macro packages (-ms and -me) that specify many details and simplify the specification of other details for you. These macro packages serve to reduce your direct contact with nroff and troff, making the text formatting process easier.

The articles by Lesk, “Typing Documents on the UNIX System: Using the -ms Macros with TROFF and NROFF,” and Tuthill, “A Revised Version of -ms,” tell what there is to know about using -ms. “A Guide to Preparing Documents with -ms,” also by Lesk, gives comprehensive examples. The topics include:

• Cover sheet format such as author, title, abstract
• Page headings
• Multicolumn format
• Section headings
• Paragraph control
• Italics
• Footnotes
• Specifying dates
• Changing default values
• Using accent marks
• Automatic footnote numbering

The Lesk article is readable and arranged in a tutorial format. The Tuthill article is a brief supplement.

“Writing Papers with NROFF Using -ME,” by Allman, covers many of the same topics. This article is also tutorial. It provides good explanations and examples. The “ME Reference Manual,” also by Allman, lists all features of the -me macro package. Read it if you want greater flexibility than is allowed by the procedures shown in the first Allman article.
The “NROFF/TROFF User’s Manual,” by Ossanna, is appropriate for users already familiar with the macro packages who want to develop their own nroff or troff macros or macro packages. The first part of this article lists the command line options for the text formatters, all nroff and troff commands, escape sequences, and predefined registers. The second part defines in detail the rules that govern use of the text formatters. A set of examples completes the article.

“A TROFF Tutorial,” by Kernighan, concentrates on features of troff that are specific to typesetting such as:

• Point sizes
• Font changes
• Special characters
• Horizontal and vertical motions

The information in this article is appropriate for users who want more flexibility in typesetter control than they can get with the -ms and -me macro packages.

Preprocessors

Three preprocessor utilities expand the text formatting capabilities of the ULTRIX-32 system:

eqn      lets you typeset mathematical expressions.
tbl      helps you to format tables easily.
refer    helps you to create bibliographical references.

These utilities process notation for mathematical expressions, tables, and bibliographical descriptions, turning them into sequences of commands for nroff or troff.

This part includes two articles on eqn by Kernighan and Cherry. “A System for Typesetting Mathematics” outlines the design goals and capabilities of eqn. The second article, “Typesetting Mathematics - User’s Guide,” shows how to make eqn produce:

• Equations
• Special symbols
• Greek letters
• Subscripts and superscripts
• Braces
• Piles
• Matrices
• Local motions

Read this second article for practical information on using eqn. Read the first eqn article, “A System for Typesetting Mathematics,” if you want to know more about the background of eqn.

“TBL - A Program to Format Tables,” by Lesk, serves as a reference and a tutorial. The first part of the article lists rules for using tbl to create tables. The remainder of the article consists of examples showing sequences of commands supplied to tbl and the resulting tables.

Three of the articles in this part deal with utilities related to bibliographies and indexing. Using refer to make bibliographical references in a text requires three steps:

1  You must build a data base that describes the items that can be referenced. Each entry in the data base identifies the publication by several categories such as:

   • Author
   • Title
   • Issuer (publisher)
   • City where published
   • Date of publication

   Enter this information by running the addbib utility. Note that you can list the entire data base, sorted by author and date, by running the sortbib and roffbib utilities.

2  As you write a new text to be processed by nroff or troff, you can create a standard bibliographical reference to an item contained in the data base by specifying one or two key fields of the data base item.

3  Run the refer and nroff or troff utilities to process the text.

Tuthill’s article, “Refer - A Bibliography System,” is the most readable and useful of the three articles on refer. The Lesk articles, “Some Applications of Inverted Indexes on the UNIX System” and “Updating Publication Lists,” deal with indexing and bibliographical referencing. The examples that relate to refer may be useful, if you read the Tuthill article first. The explanations of indexing are hard to follow. If you must use the searching and indexing utilities, you may want help from someone who uses this software to supplement the Lesk articles.

Style and Diction

The style and diction programs can help you evaluate and refine writing skills. The texts to be evaluated can be in a file on the system. The article entitled “Writing Tools - The Style and Diction Programs,” by Cherry and Vesterman, explains the yardsticks that style uses to measure:

• Readability levels
• Sentence structure
• Word usage (by parts of speech)
• Sentence openers

The article also shows how to use the diction program to identify phrases that are frequently misused or wordy. You can use a companion program together with diction to find substitutes for the objectionable phrases.

Summary

The articles on -ms and -me (choose one) will help you to get started using nroff and troff. Eqn, tbl, and refer work with nroff and troff to simplify typesetting mathematical expressions, formatting tables, and making bibliographical references. The style and diction programs will help you to evaluate what you write.

Typing Documents on the UNIX System:
Using the -ms Macros with Troff and Nroff

M. E. Lesk
Bell Laboratories
Murray Hill, New Jersey 07974

Introduction. This memorandum describes a package of commands to produce papers using the troff and nroff formatting programs on the UNIX system. As with other roff-derived programs, text is prepared interspersed with formatting commands. However, this package, which itself is written in troff commands, provides higher-level commands than those provided with the basic troff program. The commands available in this package are listed in Appendix A.

Text. Type normally, except that instead of indenting for paragraphs, place a line reading “.PP” before each paragraph. This will produce indenting and extra space.

Alternatively, the command .LP that was used here will produce a left-aligned (block) paragraph. The paragraph spacing can be changed: see below under “Registers.”

Beginning.
For a document with a paper-type cover sheet, the input should start as follows:

    [optional overall format .RP - see below]
    .TL
    Title of document (one or more lines)
    .AU
    Author(s) (may also be several lines)
    .AI
    Author’s institution(s)
    .AB
    Abstract; to be placed on the cover sheet of a paper.
    Line length is 5/6 of normal; use .ll here to change.
    .AE (abstract end)
    text ... (begins with .PP, which see)

To omit some of the standard headings (e.g. no abstract, or no author’s institution) just omit the corresponding fields and command lines. The word ABSTRACT can be suppressed by writing “.AB no” for “.AB”. Several interspersed .AU and .AI lines can be used for multiple authors. The headings are not compulsory: beginning with a .PP command is perfectly OK and will just start printing an ordinary paragraph. Warning: You can’t just begin a document with a line of text. Some -ms command must precede any text input. When in doubt, use .LP to get proper initialization, although any of the commands .PP, .LP, .TL, .SH, .NH is good enough. Figure 1 shows the legal arrangement of commands at the start of a document.

Cover Sheets and First Pages. The first line of a document signals the general format of the first page. In particular, if it is “.RP” a cover sheet with title and abstract is prepared. The default format is useful for scanning drafts.

UNIX is a Trademark of Bell Laboratories

In general -ms is arranged so that only one form of a document need be stored, containing all information; the first command gives the format, and unnecessary items for that format are ignored.

Warning: don’t put extraneous material between the .TL and .AE commands. Processing of the titling items is special, and other data placed in them may not behave as you expect. Don’t forget that some -ms command must precede any input text.

Page headings. The -ms macros, by default, will print a page heading containing a page number (if greater than 1). A default page footer is provided only in nroff, where the date is used. The user can make minor adjustments to the page headings/footings by redefining the strings LH, CH, and RH which are the left, center and right portions of the page headings, respectively; and the strings LF, CF, and RF, which are the left, center and right portions of the page footer. For more complex formats, the user can redefine the macros PT and BT, which are invoked respectively at the top and bottom of each page. The margins (taken from registers HM and FM for the top and bottom margin respectively) are normally 1 inch; the page header/footer are in the middle of that space. The user who redefines these macros should be careful not to change parameters such as point size or font without resetting them to default values.

Multi-column formats. If you place the command “.2C” in your document, the document will be printed in double column format beginning at that point. This feature is not too useful in computer terminal output, but is often desirable on the typesetter. The command “.1C” will go back to one-column format and also skip to a new page. The “.2C” command is actually a special case of the command

    .MC [column width [gutter width]]

which makes multiple columns with the specified column and gutter width; as many columns as will fit across the page are used. Thus triple, quadruple, ... column pages can be printed. Whenever the number of columns is changed (except going from full width to some larger number of columns) a new page is started.

Headings. To produce a special heading, there are two commands. If you type

    .NH
    type section heading here
    may be several lines

you will get automatically numbered section headings (1, 2, 3, ...), in boldface. For example,

    .NH
    Care and Feeding of Department Heads

produces

    1. Care and Feeding of Department Heads

Alternatively,

    .SH
    Care and Feeding of Directors

will print the heading with no number added:

    Care and Feeding of Directors

Every section heading, of either type, should be followed by a paragraph beginning with .PP or .LP, indicating the end of the heading. Headings may contain more than one line of text.

The .NH command also supports more complex numbering schemes. If a numerical argument is given, it is taken to be a “level” number and an appropriate sub-section number is generated. Larger level numbers indicate deeper sub-sections, as in this example:

    .NH
    Erie-Lackawanna
    .NH 2
    Morris and Essex Division
    .NH 3
    Gladstone Branch
    .NH 3
    Montclair Branch
    .NH 2
    Boonton Line

generates:

    2. Erie-Lackawanna
    2.1. Morris and Essex Division
    2.1.1. Gladstone Branch
    2.1.2. Montclair Branch
    2.2. Boonton Line

An explicit “.NH 0” will reset the numbering of level 1 to one, as here:

    .NH 0
    Penn Central

    1. Penn Central

Indented paragraphs. (Paragraphs with hanging numbers, e.g. references.) The sequence

    .IP [1]
    Text for first paragraph, typed normally for as long as you would like on as many lines as needed.
    .IP [2]
    Text for second paragraph, ...

produces

    [1]  Text for first paragraph, typed normally for as long as you
         would like on as many lines as needed.
    [2]  Text for second paragraph, ...

A series of indented paragraphs may be followed by an ordinary paragraph beginning with .PP or .LP, depending on whether you wish indenting or not. The command .LP was used here.

More sophisticated uses of .IP are also possible. If the label is omitted, for example, a plain block indent is produced.

    .IP
    This material will just be turned into a block indent suitable for quotations or such matter.
    .LP

will produce

         This material will just be turned into a block indent
         suitable for quotations or such matter.

If a non-standard amount of indenting is required, it may be specified after the label (in character positions) and will remain in effect until the next .PP or .LP. Thus, the general form of the .IP command contains two additional fields: the label and the indenting length. For example,

    .IP first: 9
    Notice the longer label, requiring larger indenting for these paragraphs.
    .IP second:
    And so forth.
    .LP

produces this:

    first:   Notice the longer label, requiring larger indenting for
             these paragraphs.
    second:  And so forth.

It is also possible to produce multiple nested indents; the command .RS indicates that the next .IP starts from the current indentation level. Each .RE will eat up one level of indenting so you should balance .RS and .RE commands. The .RS command should be thought of as “move right” and the .RE command as “move left”. As an example

    .IP 1.
    Bell Laboratories
    .RS
    .IP 1.1
    Murray Hill
    .IP 1.2
    Holmdel
    .IP 1.3
    Whippany
    .RS
    .IP 1.3.1
    Madison
    .RE
    .IP 1.4
    Chester
    .RE
    .LP

will result in

    1.   Bell Laboratories
         1.1  Murray Hill
         1.2  Holmdel
         1.3  Whippany
              1.3.1  Madison
         1.4  Chester

All of these variations on .LP leave the right margin untouched. Sometimes, for purposes such as setting off a quotation, a paragraph indented on both right and left is required.

         A single paragraph like this is obtained by preceding it
         with .QP. More complicated material (several paragraphs)
         should be bracketed with .QS and .QE.

Emphasis. To get italics (on the typesetter) or underlining (on the terminal) say

    .I
    as much text as you want
    can be typed here
    .R

as was done for these three words. The .R command restores the normal (usually Roman) font. If only one word is to be italicized, it may be just given on the line with the .I command,

    .I word

and in this case no .R is needed to restore the previous font. Boldface can be produced by

    .B
    Text to be set in boldface
    goes here
    .R

and also will be underlined on the terminal or line printer. As with .I, a single word can be placed in boldface by placing it on the same line as the .B command.

A few size changes can be specified similarly with the commands .LG (make larger), .SM (make smaller), and .NL (return to normal size). The size change is two points; the commands may be repeated for increased effect (here one .NL canceled two .SM commands).

If actual underlining as opposed to italicizing is required on the typesetter, the command

    .UL word

will underline a word. There is no way to underline multiple words on the typesetter.

Footnotes. Material placed between lines with the commands .FS (footnote) and .FE (footnote end) will be collected, remembered, and finally placed at the bottom of the current page*. By default, footnotes are 11/12th the length of normal text, but this can be changed using the FL register (see below).

Displays and Tables. To prepare displays of lines, such as tables, in which the lines should not be re-arranged, enclose them in the commands .DS and .DE

    .DS
    table lines, like the
    examples here, are placed
    between .DS and .DE
    .DE

By default, lines between .DS and .DE are indented and left-adjusted. You can also center lines, or retain the left margin. Lines bracketed by .DS C and .DE commands are centered (and not re-arranged); lines bracketed by .DS L and .DE are left-adjusted, not indented, and not re-arranged. A plain .DS is equivalent to .DS I, which indents and left-adjusts. Thus,

                    these lines were preceded
                    by .DS C and followed by
                       a .DE command;

whereas

these lines were preceded
by .DS L and followed by
a .DE command.

Note that .DS C centers each line; there is a variant .DS B that makes the display into a left-adjusted block of text, and then centers that entire block. Normally a display is kept together, on one page. If you wish to have a long display which may be split across page boundaries, use .CD, .LD, or .ID in place of the commands .DS C, .DS L, or .DS I respectively. An extra argument to the .DS I or .DS command is taken as an amount to indent. Note: it is tempting to assume that .DS R will right adjust lines, but it doesn’t work.

Boxing words or lines. To draw rectangular boxes around words the command

    .BX word

will print the word enclosed in a box. The boxes will not be neat on a terminal, and this should not be used as a substitute for italics. Longer pieces of text may be boxed by enclosing them with .B1 and .B2:

    .B1
    text...
    .B2

as has been done here.

Keeping blocks together. If you wish to keep a table or other block of lines together on a page, there are “keep - release” commands. If a block of lines preceded by .KS and followed by .KE does not fit on the remainder of the current page, it will begin on a new page. Lines bracketed by .DS and .DE commands are automatically kept together this way. There is also a “keep floating” command: if the block to be kept together is preceded by .KF instead of .KS and does not fit on the current page, it will be moved down through the text until the top of the next page. Thus, no large blank space will be introduced in the document.

Nroff/Troff commands. Among the useful commands from the basic formatting programs are the following. They all work with both typesetter and computer terminal output:

    .bp    - begin new page.
    .br    - “break”, stop running text from line to line.
    .sp n  - insert n blank lines.
    .na    - don’t adjust right margins.

Date. By default, documents produced on computer terminals have the date at the bottom of each page; documents produced on the typesetter don’t. To force the date, say “.DA”. To force no date, say “.ND”. To lie about the date, say “.DA July 4, 1776” which puts the specified date at the bottom of each page. The command

    .ND May 8, 1945

in “.RP” format places the specified date on the cover sheet and nowhere else. Place this line before the title.

Signature line. You can obtain a signature line by placing the command .SG in the document. The authors’ names will be output in place of the .SG line. An argument to .SG is used as a typing identification line, and placed after the signatures. The .SG command is ignored in released paper format.

Registers. Certain of the registers used by -ms can be altered to change default settings. They should be changed with .nr commands, as with

    .nr PS 9

to make the default point size 9 point. If the effect is needed immediately, the normal troff command should be used in addition to changing the number register.

    Register  Defines           Takes effect  Default
    PS        point size        next para.    10
    VS        line spacing      next para.    12 pts
    LL        line length       next para.    6"
    LT        title length      next para.    6"
    PD        para. spacing     next para.    0.3 VS
    PI        para. indent      next para.    5 ens
    FL        footnote length   next FS       11/12 LL
    CW        column width      next 2C       7/15 LL
    GW        intercolumn gap   next 2C       1/15 LL
    PO        page offset       next page     26/27"
    HM        top margin        next page     1"
    FM        bottom margin     next page     1"

You may also alter the strings LH, CH, and RH which are the left, center, and right headings respectively; and similarly LF, CF, and RF which are strings in the page footer. The page number on output is taken from register PN, to permit changing its output style. For more complicated headers and footers the macros PT and BT can be redefined, as explained earlier.

Accents. To simplify typing certain foreign words, strings representing common accent marks are defined. They precede the letter over which the mark is to appear. Here are the strings:

    Input   Output      Input   Output
    \*'e    é           \*~a    ã
    \*`e    è           \*Ce    ě
    \*:u    ü           \*,c    ç
    \*^e    ê

Use.
After your document is prepared and stored on a file, you can print it on a terminal with the command*

    nroff -ms file

and you can print it on the typesetter with the command

    troff -ms file

(many options are possible). In each case, if your document is stored in several files, just list all the filenames where we have used “file”. If equations or tables are used, eqn and/or tbl must be invoked as preprocessors.

* If .2C was used, pipe the nroff output through col; make the first line of the input “.pi /usr/bin/col.”

References and further study. If you have to do Greek or mathematics, see eqn [1] for equation setting. To aid eqn users, -ms provides definitions of .EQ and .EN which normally center the equation and set it off slightly. An argument on .EQ is taken to be an equation number and placed in the right margin near the equation. In addition, there are three special arguments to EQ: the letters C, I, and L indicate centered (default), indented, and left adjusted equations, respectively. If there is both a format argument and an equation number, give the format argument first, as in

    .EQ L (1.3a)

for a left-adjusted equation numbered (1.3a).

Similarly, the macros .TS and .TE are defined to separate tables (see [2]) from text with a little space. A very long table with a heading may be broken across pages by beginning it with .TS H instead of .TS, and placing the line .TH in the table data after the heading. If the table has no heading repeated from page to page, just use the ordinary .TS and .TE macros.

To learn more about troff see [3] for a general introduction, and [4] for the full details (experts only). Information on related UNIX commands is in [5]. For jobs that do not seem well-adapted to -ms, consider other macro packages. It is often far easier to write a specific macro package for such tasks as imitating particular journals than to try to adapt -ms.

Acknowledgment. Many thanks are due to Brian Kernighan for his help in the design and implementation of this package, and for his assistance in preparing this manual.

References

[1] B. W. Kernighan and L. L. Cherry, Typesetting Mathematics - Users Guide (2nd edition), Bell Laboratories Computing Science Report no. 17.
[2] M. E. Lesk, Tbl - A Program to Format Tables, Bell Laboratories Computing Science Report no. 45.
[3] B. W. Kernighan, A Troff Tutorial, Bell Laboratories, 1976.
[4] J. F. Ossanna, Nroff/Troff Reference Manual, Bell Laboratories Computing Science Report no. 51.
[5] K. Thompson and D. M. Ritchie, UNIX Programmer’s Manual, Bell Laboratories, 1978.

Appendix A
List of Commands

    1C   Return to single column format.
    2C   Start double column format.
    AB   Begin abstract.
    AE   End abstract.
    AI   Specify author’s institution.
    AU   Specify author.
    B    Begin boldface.
    DA   Provide the date on each page.
    DE   End display.
    DS   Start display (also CD, LD, ID).
    EN   End equation.
    EQ   Begin equation.
    FE   End footnote.
    FS   Begin footnote.
    I    Begin italics.
    IP   Begin indented paragraph.
    KE   Release keep.
    KF   Begin floating keep.
    KS   Start keep.
    LG   Increase type size.
    LP   Left aligned block paragraph.
    ND   Change or cancel date.
    NH   Specify numbered heading.
    NL   Return to normal type size.
    PP   Begin paragraph.
    R    Return to regular font (usually Roman).
    RE   End one level of relative indenting.
    RP   Use released paper format.
    RS   Relative indent increased one level.
    SG   Insert signature line.
    SH   Specify section heading.
    SM   Change to smaller type size.
    TL   Specify title.
    UL   Underline one word.

Register Names

The following register names are used by -ms internally. Independent use of these names in one’s own macros may produce incorrect output. Note that no lower case letters are used in any -ms internal name.

Number registers used in -ms

    :   DW  GW  HM  IQ  LL  NA  OJ  PO  T.  TV
    #T  EF  H1  HT  IR  LT  NC  PD  PQ  TB  VS
    1T  FL  H3  IK  KI  MM  NF  PF  PX  TD  YE
    AV  FM  H4  IM  L1  MN  NS  PI  RO  TN  YY
    CW  FP  H5  IP  LE  MO  OI  PN  ST  TQ  ZN

String registers used in -ms

    '   A5  CB  DW  EZ  I   KF  MR  R1  RT  TL
    `   AB  CC  DY  FA  I1  KQ  ND  R2  S0  TM
    ^   AE  CD  E1  FE  I2  KS  NH  R3  S1  TQ
    ~   AI  CF  E2  FJ  I3  LB  NL  R4  S2  TS
    :   AU  CH  E3  FK  I4  LD  NP  R5  SG  TT
    ,   B   CM  E4  FN  I5  LG  OD  RC  SH  UL
    1C  BG  CS  E5  FO  ID  LP  OK  RE  SM  WB
    2C  BT  CT  EE  FQ  IE  ME  PP  RF  SN  WH
    A1  C   D   EL  FS  IM  MF  PT  RH  SY  WT
    A2  C1  DA  EM  FV  IP  MH  PY  RP  TA  XD
    A3  C2  DE  EN  FY  IZ  MN  QF  RQ  TE  XF
    A4  CA  DS  EQ  HO  KE  MO  R   RS  TH  XK

Figure 1. Order of Commands in Input. (Diagram not reproduced; the surviving labels are NH, SH; PP, LP; and text.)

A Guide to Preparing Documents with -ms

M. E. Lesk
Bell Laboratories
August 1978

This guide gives some simple examples of document preparation on Bell Labs computers, emphasizing the use of the -ms macro package. It enormously abbreviates information in

1. Typing Documents on UNIX and GCOS, by M. E. Lesk;
2. Typesetting Mathematics - User's Guide, by B. W. Kernighan and L. L. Cherry; and
3. Tbl - A Program to Format Tables, by M. E. Lesk.

These memos are all included in the UNIX Programmer’s Manual, Volume 2. The new user should also have A Tutorial Introduction to the UNIX Text Editor, by B. W. Kernighan.

For more detailed information, read Advanced Editing on UNIX and A Troff Tutorial, by B. W. Kernighan, and (for experts) Nroff/Troff Reference Manual by J. F. Ossanna. Information on related commands is found (for UNIX users) in UNIX for Beginners by B. W. Kernighan and the UNIX Programmer’s Manual by K. Thompson and D. M. Ritchie.

Contents

    A TM                                   2
    A released paper                       3
    An internal memo, and headings         4
    Lists, displays, and footnotes         5
    Indents, keeps, and double column      6
    Equations and registers                7
    Tables and usage                       8

Throughout the examples, input is shown in this Helvetica sans serif font while the resulting output is shown in this Times Roman font.

UNIX Document no. 1111

Commands for a TM

Input:

    .TM 1978-5b3 99999 99999-11
    .ND April 1, 1976
    .TL
    The Role of the Allen Wrench in Modern Electronics
    .AU "MH 2G-111" 2345
    J. Q. Pencilpusher
    .AU "MH 1K-222" 5432
    X. Y. Hardwired
    .AI
    .MH
    .OK
    Tools Design
    .AB
    This abstract should be short enough to fit on a single page cover sheet.
    It must attract the reader into sending for the complete memorandum.
    .AE
    .CS 10 2 12 5 6 7
    .NH
    Introduction.
    .PP
    Now the first paragraph of actual text ...
    ...
    Last line of text.
    .SG MH-1234-JQP/XYH-unix
    .NH
    References ...

Commands not needed in a particular format are ignored.

Output:

    Bell Laboratories                              Cover Sheet for TM
    This information is for employees of Bell Laboratories. (GEI 13.9-3)

    Title- The Role of the Allen Wrench        Date- April 1, 1976
           in Modern Electronics               TM- 1978-5b3
    Other Keywords- Tools Design

    Author              Location     Ext.     Charging Case- 99999
    J. Q. Pencilpusher  MH 2G-111    2345     Filing Case- 999992
    X. Y. Hardwired     MH 1K-222    5432

                              ABSTRACT
    This abstract should be short enough to fit on a single page cover
    sheet. It must attract the reader into sending for the complete
    memorandum.

    Pages Text 10   Other 2   Total 12
    No. Figures 5   No. Tables 6   No. Refs. 7
    E-1932-U (6-73)        SEE REVERSE SIDE FOR DISTRIBUTION LIST

A Released Paper with Mathematics

Input:

    .EQ
    delim $$
    .EN
    .RP
    ... (as for a TM)
    .CS 10 2 12 5 6 7
    .NH
    Introduction
    .PP
    The solution to the torque handle equation
    .EQ (1)
    sum from 0 to inf F ( x sub i ) = G ( x )
    .EN
    is found with the transformation $ x = rho over theta $
    where $ rho = G prime (x) $ and $theta$ is derived ...

Output (cover sheet):

             The Role of the Allen Wrench in Modern Electronics
                           J. Q. Pencilpusher
                            X. Y. Hardwired
                           Bell Laboratories
                     Murray Hill, New Jersey 07974
                               ABSTRACT
    This abstract should be short enough to fit on a single page cover
    sheet. It must attract the reader into sending for the complete
    memorandum.
    April 1, 1976

Output (first page):

             The Role of the Allen Wrench in Modern Electronics
                           J. Q. Pencilpusher
                            X. Y. Hardwired
                           Bell Laboratories
                     Murray Hill, New Jersey 07974

    1. Introduction
    The solution to the torque handle equation
    \[ \sum_{0}^{\infty} F(x_i) = G(x) \qquad (1) \]
    is found with the transformation \( x = \rho / \theta \) where \( \rho = G'(x) \) and \( \theta \) is derived from well-known principles.

An Internal Memorandum

Input:

    .IM
    .ND January 24, 1956
    .TL
    The 1956 Consent Decree
    .AU
    Able, Baker &
    Charley, Attys.
    .PP
    Plaintiff, United States of America, having filed its complaint herein on January 14, 1949; the defendants having appeared and filed their answer to such complaint denying the substantive allegations thereof; and the parties, by their attorneys, ...

Output:

    Bell Laboratories

    Subject: The 1956 Consent Decree        date: January 24, 1956
                                            from: Able, Baker &
                                                  Charley, Attys.

    Plaintiff, United States of America, having filed its complaint herein on January 14, 1949; the defendants having appeared and filed their answer to such complaint denying the substantive allegations thereof; and the parties, by their attorneys, having severally consented to the entry of this Final Judgment without trial or adjudication of any issues of fact or law herein and without this Final Judgment constituting any evidence or admission by any party in respect of any such issues;

    Now, therefore before any testimony has been taken herein, and without trial or adjudication of any issue of fact or law herein, and upon the consent of all parties hereto, it is hereby

    Ordered, adjudged and decreed as follows:

    I. [Sherman Act]
    This Court has jurisdiction of the subject matter herein and of all the parties hereto. The complaint states a claim upon which relief may be granted against each of the defendants under Sections 1, 2 and 3 of the Act of Congress of July 2, 1890, entitled “An act to protect trade and commerce against unlawful restraints and monopolies,” commonly known as the Sherman Act, as amended.

    II. [Definitions]
    For the purposes of this Final Judgment:
    (a) “Western” shall mean the defendant Western Electric Company, Incorporated.

Other formats possible (specify before .TL) are: .MR (“memo for record”), .MF (“memo for file”), .EG (“engineer’s notes”) and .TR (Computing Science Tech. Report).

Headings

Input:

    .NH                      .SH
    Introduction.            Appendix I
    .PP                      .PP
    text text text           text text text

Output:

    1. Introduction          Appendix I
    text text text           text text text

A Simple List

Input:

    .IP 1.
    J. Pencilpusher and X. Hardwired,
    .I
    A New Kind of Set Screw,
    .R
    Proc. IEEE
    .B 75
    (1976), 23-255.
    .IP 2.
    H. Nails and R. Irons,
    .I
    Fasteners for Printed Circuit Boards,
    .R
    Proc. ASME
    .B 23
    (1974), 23-24.
    .LP (terminates list)

Output:

    1.  J. Pencilpusher and X. Hardwired, A New Kind of Set Screw, Proc. IEEE 75 (1976), 23-255.
    2.  H. Nails and R. Irons, Fasteners for Printed Circuit Boards, Proc. ASME 23 (1974), 23-24.

Displays

Input:

    text text text text text text
    .DS
    and now
    for something
    completely different
    .DE
    text text text text text text

Output:

    hoboken harrison newark roseville avenue grove street east orange brick church orange highland avenue mountain station south orange maplewood millburn short hills summit new providence
         and now
         for something
         completely different
    murray hill berkeley heights gillette stirling millington lyons basking ridge bernardsville far hills peapack gladstone

Options: .DS L: left-adjust; .DS C: line-by-line center; .DS B: make block, then center.

Footnotes

Input:

    Among the most important occupants of the workbench are the long-nosed pliers. Without these basic tools*
    .FS
    * As first shown by Tiger & Leopard (1975).
    .FE
    few assemblies could be completed. They may lack the popular appeal of the sledgehammer

Output:

    Among the most important occupants of the workbench are the long-nosed pliers. Without these basic tools* few assemblies could be completed. They may lack the popular appeal of the sledgehammer

    * As first shown by Tiger & Leopard (1975).

Multiple Indents

Input:

    This is ordinary text to point out the margins of the page.
    .IP 1.
    First level item
    .RS
    .IP a)
    Second level.
    .IP b)
    Continued here with another second level item, but somewhat longer.
    .RE
    .IP 2.
    Return to previous value of the indenting at this point.
    .IP 3.
    Another line.

Output:

    This is ordinary text to point out the margins of the page.
    1.   First level item
         a)   Second level.
         b)   Continued here with another second level item, but somewhat longer.
    2.   Return to previous value of the indenting at this point.
    3.   Another line.

Keeps

Lines bracketed by the following commands are kept together, and will appear entirely on one page:

    .KS   not moved          .KF   may float
    .KE   through text       .KE   in text

Double Column

Input:

    .TL
    The Declaration of Independence
    .2C
    .PP
    When in the course of human events, it becomes necessary for one people to dissolve the political bonds which have connected them with another, and to assume among the powers of the earth the separate and equal station to which the laws of Nature and of Nature's God entitle them, a decent respect to the opinions of

Output (set in two columns under the centered title):

    The Declaration of Independence

    When in the course of human events, it becomes necessary for one people to dissolve the political bonds which have connected them with another, and to assume among the powers of the earth the separate and equal station to which the laws of Nature and of Nature's God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation. We hold these truths to be self-evident, that all men are created equal, that they are endowed by their creator with certain unalienable rights, that among these are life, liberty, and the pursuit of happiness. That to secure these rights, governments are instituted among men,

Equations

A displayed equation is marked with an equation number at the right margin by adding an argument to the EQ line:

Input:

    .EQ (1.3)
    x sup 2 over a sup 2 ~=~ sqrt {p z sup 2 +qz+r}
    .EN

Output:

    \[ \frac{x^2}{a^2} = \sqrt{p z^2 + q z + r} \qquad (1.3) \]

Input:

    .EQ I (2.2a)
    bold V bar sub nu~=~left [ pile {a above b above c} right ] + left [ matrix { col { A(11) above . above . } col { . above . above . } col { . above . above A(33) } } right ] cdot left [ pile { alpha above beta above gamma } right ]
    .EN

Output:

    \[ \bar{\mathbf{V}}_\nu = \begin{bmatrix} a \\ b \\ c \end{bmatrix} + \begin{bmatrix} A(11) & . & . \\ . & . & . \\ . & . & A(33) \end{bmatrix} \cdot \begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix} \qquad (2.2a) \]

Input:

    .EQ L
    F hat ( chi ) ~ mark = ~ | del V | sup 2
    .EN
    .EQ L
    lineup =~ {left ( {partial V} over {partial x} right ) } sup 2 + { left ( {partial V} over {partial y} right ) } sup 2 ~~~~~~ lambda -> inf
    .EN

Output:

    \[ \hat{F}(\chi) = |\nabla V|^2 \]
    \[ \phantom{\hat{F}(\chi)} = \left( \frac{\partial V}{\partial x} \right)^2 + \left( \frac{\partial V}{\partial y} \right)^2 \qquad \lambda \to \infty \]

Input:

    $ a dot $, $ b dotdot $, $ x tilde times y vec $:

Output:

    \( \dot{a} \), \( \ddot{b} \), \( \tilde{x} \times \vec{y} \):

(with delim $$ on, see panel 3). See also the equations in the second table, panel 8.

Some Registers You Can Change

    Line length               .nr LL 7i
    Title length              .nr LT 7i
    Point size                .nr PS 9
    Vertical spacing          .nr VS 11
    Column width              .nr CW 3i
    Intercolumn spacing       .nr GW .5i
    Margins - head and foot   .nr HM .75i
                              .nr FM .75i
    Paragraph indent          .nr PI 2n
    Paragraph spacing         .nr PD 0
    Page offset               .nr PO 0.5i
    Page heading              .ds CH Appendix   (center)
                              .ds RH 7-25-76    (right)
                              .ds LH Private    (left)
    Page footer               .ds CF Draft
                              .ds LF
                              .ds RF            (similar)
    Page numbers              .nr % 3

Tables

Input (⊕ indicates a tab):

    .TS
    allbox;
    c s s
    c c c
    n n n.
    AT&T Common Stock
    Year⊕Price⊕Dividend
    1971⊕41-54⊕$2.60
    2⊕41-54⊕2.70
    3⊕46-55⊕2.87
    4⊕40-53⊕3.24
    5⊕45-52⊕3.40
    6⊕51-59⊕.95*
    .TE
    * (first quarter only)

Output:

    AT&T Common Stock
    Year    Price    Dividend
    1971    41-54    $2.60
       2    41-54     2.70
       3    46-55     2.87
       4    40-53     3.24
       5    45-52     3.40
       6    51-59      .95*
    * (first quarter only)

The meanings of the key-letters describing the alignment of each entry are:

    c   center          n   numerical
    r   right-adjust    a   subcolumn
    l   left-adjust     s   spanned

The global table options are center, expand, box, doublebox, allbox, tab (x) and linesize (n).

Input (with delim $$ on, see panel 3):

    .TS
    doublebox, center;
    c c
    l l.
    Name⊕Definition
    .sp
    Gamma⊕$GAMMA (z) = int sub 0 sup inf t sup {z-1} e sup -t dt$
    Sine⊕$sin (x) = 1 over 2i ( e sup ix - e sup -ix )$
    Error⊕$ roman erf (z) = 2 over sqrt pi int sub 0 sup z e sup {-t sup 2} dt$
    Bessel⊕$ J sub 0 (z) = 1 over pi int sub 0 sup pi cos ( z sin theta ) d theta $
    Zeta⊕$ zeta (s) = sum from k=1 to inf k sup -s ~~( Re~s > 1)$
    .TE

Output:

    Name     Definition
    Gamma    \( \Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt \)
    Sine     \( \sin(x) = \frac{1}{2i} (e^{ix} - e^{-ix}) \)
    Error    \( \mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\,dt \)
    Bessel   \( J_0(z) = \frac{1}{\pi} \int_0^\pi \cos(z \sin\theta)\,d\theta \)
    Zeta     \( \zeta(s) = \sum_{k=1}^{\infty} k^{-s} \quad (\mathrm{Re}\ s > 1) \)

Usage

    Documents with just text:          troff -ms files
    With equations only:               eqn files | troff -ms
    With tables only:                  tbl files | troff -ms
    With both tables and equations:    tbl files | eqn | troff -ms

The above generates STARE output on GCOS: replace -st with -ph for typesetter output.

A Revised Version of -ms

Bill Tuthill
Computing Services
University of California
Berkeley, CA 94720

The -ms macros have been slightly revised and rearranged. Because of the rearrangement, the new macros can be read by the computer in about half the time required by the previous version of -ms.
This means that output will begin to appear between ten seconds and several minutes more quickly, depending on the system load. On long files, however, the savings in total time are not substantial. The old version of -ms is still available as -mos.

Several bugs in -ms have been fixed, including a bad problem with the .1C macro, minor difficulties with boxed text, a break induced by .EQ before initialization, the failure to set tab stops in displays, and several bothersome errors in the refer macros. Macros used only at Bell Laboratories have been removed. There are a few extensions to previous -ms macros, and a number of new macros, but all the documented -ms macros still work exactly as they did before, and have the same names as before. Output produced with -ms should look like output produced with -mos.

One important new feature is automatically numbered footnotes. Footnote numbers are printed by means of a pre-defined string (\**), which you invoke separately from .FS and .FE. Each time it is used, this string increases the footnote number by one, whether or not you use .FS and .FE in your text. Footnote numbers will be superscripted on the phototypesetter and on daisy-wheel terminals, but on low-resolution devices (such as the lpr and a crt), they will be bracketed. If you use \** to indicate numbered footnotes, then the .FS macro will automatically include the footnote number at the bottom of the page. This footnote, for example, was produced as follows:¹

    This footnote, for example, was produced as follows:\**
    .FS
    ...
    .FE

If you are using \** to number footnotes, but want a particular footnote to be marked with an asterisk or a dagger, then give that mark as the first argument to .FS:†

    then give that mark as the first argument to .FS:\(dg
    .FS \(dg
    ...
    .FE

Footnote numbering will be temporarily suspended, because the \** string is not used. Instead of a dagger, you could use an asterisk * or double dagger ‡, represented as \(dd.
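As a minimal sketch of mixing the two styles, the fragment below uses one automatically numbered footnote and one marked with a double dagger. The sentences and footnote wording are placeholders invented for illustration; only \**, \(dd, .PP, .FS, and .FE are taken from -ms as described above.

```roff
.PP
.\" Numbered footnote: \** prints the current number and advances it.
The first claim needs a source.\**
.FS
Source for the first claim (placeholder text).
.FE
.\" Marked footnote: the mark is passed to .FS and \** is not used,
.\" so the automatic numbering is left untouched.
The second claim needs a caveat.\(dd
.FS \(dd
Caveat for the second claim (placeholder text).
.FE
```

The next footnote that uses \** should pick up the numbering where the first one left off.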
¹ If you never use the "\**" string, no footnote numbers will appear anywhere in the text, including down here. The output footnotes will look exactly like footnotes produced with -mos.

† In the footnote, the dagger will appear where the footnote number would otherwise appear, as on the left.

Another new feature is a macro for printing theses according to Berkeley standards. This macro is called .TM, which stands for thesis mode. (It is much like the .th macro in -me.) It will put page numbers in the upper right-hand corner; number the first page; suppress the date; and doublespace everything except quotes, displays, and keeps. Use it at the top of each file making up your thesis. Calling .TM defines the .CT macro for chapter titles, which skips to a new page and moves the page number to the center footer. The .P1 (P one) macro can be used even without thesis mode to print the header on page 1, which is suppressed except in thesis mode. If you want roman numeral page numbering, use an ".af PN i" request.

There is a new macro especially for bibliography entries, called .XP, which stands for exdented paragraph. It will exdent the first line of the paragraph by \n(PI units, usually 5n (the same as the indent for the first line of a .PP). Most bibliographies are printed this way. Here are some examples of exdented paragraphs:

    Lumley, Lyle S., Sex in Crustaceans: Shell Fish Habits, Harbinger Press,
         Tampa Bay and San Diego, October 1979. 243 pages. The pioneering
         work in this field.

    Leffadinger, Harry A., "Mollusk Mating Season: 52 Weeks, or All Year?" in
         Acta Biologica, vol. 42, no. 11, November 1980. A provocative thesis,
         but the conclusions are wrong.

Of course, you will have to take care of italicizing the book title and journal, and quoting the title of the journal article. Indentation or exdentation can be changed by setting the value of number register PI.
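The input for entries like these might look roughly as follows. This is a sketch, not the actual source of the examples above: the use of the -ms .I macro for italics and the 8n value for PI are illustrative choices, and the quotation marks are typed by hand.

```roff
.\" Widen the exdent from the usual 5n (illustrative value only).
.nr PI 8n
.XP
Lumley, Lyle S.,
.I "Sex in Crustaceans: Shell Fish Habits,"
Harbinger Press, Tampa Bay and San Diego, October 1979.
243 pages.
The pioneering work in this field.
.XP
Leffadinger, Harry A.,
"Mollusk Mating Season: 52 Weeks, or All Year?" in
.I "Acta Biologica,"
vol. 42, no. 11, November 1980.
A provocative thesis, but the conclusions are wrong.
```

Setting PI once before the first .XP keeps every entry in the bibliography exdented by the same amount.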
If you need to produce endnotes rather than footnotes, put the references in a file of their own. This is similar to what you would do if you were typing the paper on a conventional typewriter. Note that you can use automatic footnote numbering without actually having .FS and .FE pairs in your text. If you place footnotes in a separate file, you can use .IP macros with \** as a hanging tag; this will give you numbers at the left-hand margin. With some styles of endnotes, you would want to use .PP rather than .IP macros, and specify \** before the reference begins.

There are four new macros to help produce a table of contents. Table of contents entries must be enclosed in .XS and .XE pairs, with optional .XA macros for additional entries; arguments to .XS and .XA specify the page number, to be printed at the right. A final .PX macro prints out the table of contents. Here is a sample of typical input and output text:

    .XS ii
    Introduction
    .XA 1
    Chapter 1: Review of the Literature
    .XA 23
    Chapter 2: Experimental Evidence
    .XE
    .PX

                          Table of Contents

    Introduction ................................................ ii
    Chapter 1: Review of the Literature .......................... 1
    Chapter 2: Experimental Evidence ............................ 23

The .XS and .XE pairs may also be used in the text, after a section header for instance, in which case page numbers are supplied automatically. However, most documents that require a table of contents are too long to produce in one run, which is necessary if this method is to work. It is recommended that you do a table of contents after finishing your document. To print out the table of contents, use the .PX macro; if you forget it, nothing will happen.

As an aid in producing text that will format correctly with both nroff and troff, there are some new string definitions that define quotation marks and dashes for each of these two formatting programs. The \*- string will yield two hyphens in nroff, but in troff it will produce an em dash. The \*Q and \*U strings will produce “ and ” in troff, but " in nroff. (In typesetting, the double quote is traditionally considered bad form.)

There are now a large number of optional foreign accent marks defined by the -ms macros. All the accent marks available in -mos are present, and they all work just as they always did. However, there are better definitions available by placing .AM at the beginning of your document. Unlike the -mos accent marks, the accent strings should come after the letter being accented. Here is a list of the diacritical marks, with examples of what they look like.

    name of accent    input      output
    acute accent      e\*'       é
    grave accent      e\*`       è
    circumflex        o\*^       ô
    cedilla           c\*,       ç
    tilde             n\*~       ñ
    question          \*?        ¿
    exclamation       \*!        ¡
    umlaut            u\*:       ü
    digraph s         \*8        ß
    hacek             c\*v       č
    macron            a\*_       ā
    underdot          s\*.       ṣ
    o-slash           o\*/       ø
    angstrom          a\*o       å
    yogh              kni\*3t    kniȝt
    Thorn             \*(Th      Þ
    thorn             \*(th      þ
    Eth               \*(D-      Ð
    eth               \*(d-      ð
    hooked o          \*q        ǫ
    ae ligature       \*(ae      æ
    AE ligature       \*(Ae      Æ
    oe ligature       \*(oe      œ
    OE ligature       \*(Oe      Œ

If you want to use these new diacritical marks, don't forget the .AM at the top of your file. Without it, some will not print at all, and others will be placed on the wrong letter.

It is also possible to produce custom headers and footers that are different on even and odd pages. The .OH and .EH macros define odd and even headers, while .OF and .EF define odd and even footers. Arguments to these four macros are specified as with .tl.
This document was produced with:

    .OH '\fIThe -mx Macros''Page %\fP'
    .EH '\fIPage %''The -mx Macros\fP'

Note that it would be an error to have an apostrophe in the header text; if you need one, you will have to use a different delimiter around the left, center, and right portions of the title. You can use any character as a delimiter, provided it doesn't appear elsewhere in the argument to .OH, .EH, .OF, or .EF.

The -ms macros work in conjunction with the tbl, eqn, and refer preprocessors. Macros to deal with these items are read in only as needed, as are the thesis macros (.TM), the special accent mark definitions (.AM), table of contents macros (.XS and .XE), and macros to format the optional cover page. The code for the -ms package lives in /usr/lib/tmac/tmac.s, and sourced files reside in the directory /usr/ucb/lib/ms.


WRITING PAPERS WITH NROFF USING -ME

Eric P. Allman
Electronics Research Laboratory
University of California, Berkeley
Berkeley, California 94720

This document describes the text processing facilities available on the UNIX† operating system via NROFF† and the -me macro package. It is assumed that the reader already is generally familiar with the UNIX operating system and a text editor such as ex. This is intended to be a casual introduction, and as such not all material is covered. In particular, many variations and additional features of the -me macro package are not explained. For a complete discussion of this and other issues, see The -me Reference Manual and The NROFF/TROFF Reference Manual.

NROFF, a computer program that runs on the UNIX operating system, reads an input file prepared by the user and outputs a formatted paper suitable for publication or framing. The input consists of text, or words to be printed, and requests, which give instructions to the NROFF program telling how to format the printed copy.

Section 1 describes the basics of text processing.
Section 2 describes the basic requests. Section 3 introduces displays. Annotations, such as footnotes, are handled in section 4. The more complex requests which are not discussed in section 2 are covered in section 5. Finally, section 6 discusses things you will need to know if you want to typeset documents. If you are a novice, you probably won't want to read beyond section 4 until you have tried some of the basic features out.

When you have your raw text ready, call the NROFF formatter by typing as a request to the UNIX shell:

    nroff -me -Ttype files

where type describes the type of terminal you are outputting to. Common values are dtc for a DTC 300s (daisy-wheel type) printer and lpr for the line printer. If the -T flag is omitted, a "lowest common denominator" terminal is assumed; this is good for previewing output on most terminals. A complete description of options to the NROFF command can be found in The NROFF/TROFF Reference Manual.

The word argument is used in this manual to mean a word or number which appears on the same line as a request which modifies the meaning of that request. For example, the request

    .sp

spaces one line, but

    .sp 4

spaces four lines. The number 4 is an argument to the .sp request which says to space four lines instead of one. Arguments are separated from the request and from each other by spaces.

1. Basics of Text Processing

The primary function of NROFF is to collect words from input lines, fill output lines with those words, justify the right hand margin by inserting extra spaces in the line, and output the result. For example, the input:

    Now is the time for all good men
    to come to the aid of their party.
    Four score and seven years ago,...

will be read, packed onto output lines, and justified to produce:

    Now is the time for all good men to come to the aid of their party. Four score and seven years ago,...
Sometimes you may want to start a new output line even though the line you are on is not yet full; for example, at the end of a paragraph. To do this you can cause a break, which starts a new output line. Some requests cause a break automatically, as do blank input lines and input lines beginning with a space.

Not all input lines are text to be formatted. Some of the input lines are requests which describe how to format the text. Requests always have a period or an apostrophe ("'") as the first character of the input line.

The text formatter also does more complex things, such as automatically numbering pages, skipping over page folds, putting footnotes in the correct place, and so forth.

I can offer you a few hints for preparing text for input to NROFF. First, keep the input lines short. Short input lines are easier to edit, and NROFF will pack words onto longer lines for you anyhow. In keeping with this, it is helpful to begin a new line after every period, comma, or phrase, since common corrections are to add or delete sentences or phrases. Second, do not put spaces at the end of lines, since this can sometimes confuse the NROFF processor. Third, do not hyphenate words at the end of lines (except words that should have hyphens in them, such as "mother-in-law"); NROFF is smart enough to hyphenate words for you as needed, but is not smart enough to take hyphens out and join a word back together. Also, words such as "mother-in-law" should not be broken over a line, since then you will get a space where not wanted, such as "mother- in-law".

2. Basic Requests

2.1. Paragraphs

Paragraphs are begun by using the .pp request. For example, the input:

    .pp
    Now is the time for all good men
    to come to the aid of their party.
    Four score and seven years ago,...

produces a blank line followed by an indented first line.
The result is:

         Now is the time for all good men to come to the aid of their party. Four score and seven years ago,...

Notice that the sentences of the paragraphs must not begin with a space, since blank lines and lines beginning with spaces cause a break. For example, if I had typed:

    .pp
    Now is the time for all good men
     to come to the aid of their party.
    Four score and seven years ago,...

The output would be:

         Now is the time for all good men
     to come to the aid of their party. Four score and seven years ago,...

A new line begins after the word "men" because the second line began with a space character.

There are many fancier types of paragraphs, which will be described later.

†UNIX, NROFF, and TROFF are Trademarks of Bell Laboratories.

2.2. Headers and Footers

Arbitrary headers and footers can be put at the top and bottom of every page. Two requests of the form .he title and .fo title define the titles to put at the head and the foot of every page, respectively. The titles are called three-part titles, that is, there is a left-justified part, a centered part, and a right-justified part. To separate these three parts the first character of title (whatever it may be) is used as a delimiter. Any character may be used, but backslash and double quote marks should be avoided. The percent sign is replaced by the current page number whenever found in the title. For example, the input:

    .he ''%''
    .fo 'Jane Jones''My Book'

results in the page number centered at the top of each page, "Jane Jones" in the lower left corner, and "My Book" in the lower right corner.

2.3. Double Spacing

NROFF will double space output text automatically if you use the request .ls 2, as is done in this section. You can revert to single spaced mode by typing .ls 1.

2.4. Page Layout

A number of requests allow you to change the way the printed copy looks, sometimes called the layout of the output page.
Most of these requests adjust the placing of "white space" (blank lines or spaces). In these explanations, characters in italics should be replaced with values you wish to use; bold characters represent characters which should actually be typed.

The .bp request starts a new page.

The request .sp N leaves N lines of blank space. N can be omitted (meaning skip a single line) or can be of the form Ni (for N inches) or Nc (for N centimeters). For example, the input:

    .sp 1.5i
    My thoughts on the subject
    .sp

leaves one and a half inches of space, followed by the line "My thoughts on the subject", followed by a single blank line.

The .in +N request changes the amount of white space on the left of the page (the indent). The argument N can be of the form +N (meaning leave N spaces more than you are already leaving), -N (meaning leave less than you do now), or just N (meaning leave exactly N spaces). N can be of the form Ni or Nc also. For example, the input:

    initial text
    .in 5
    some text
    .in +1i
    more text
    .in -2c
    final text

produces "some text" indented exactly five spaces from the left margin, "more text" indented five spaces plus one inch from the left margin (fifteen spaces on a pica typewriter), and "final text" indented five spaces plus one inch minus two centimeters from the margin. That is, the output is:

    initial text
         some text
                   more text
           final text

The .ti +N (temporary indent) request is used like .in +N when the indent should apply to one line only, after which it should revert to the previous indent. For example, the input:

    .in 1i
    .ti 0
    Ware, James R. The Best of Confucius,
    Halcyon House, 1950.
    An excellent book containing translations of
    most of Confucius' most delightful sayings.
    A definite must for anyone interested in the early foundations
    of Chinese philosophy.

produces:

    Ware, James R. The Best of Confucius, Halcyon House, 1950.
              An excellent book containing translations of most of
              Confucius' most delightful sayings.
              A definite must for anyone interested in the early
              foundations of Chinese philosophy.

Text lines can be centered by using the .ce request. The line after the .ce is centered (horizontally) on the page. To center more than one line, use .ce N (where N is the number of lines to center), followed by the N lines. If you want to center many lines but don't want to count them, type:

    .ce 1000
    lines to center
    .ce 0

The .ce 0 request tells NROFF to center zero more lines, in other words, stop centering.

All of these requests cause a break; that is, they always start a new line. If you want to start a new line without performing any other action, use .br.

2.5. Underlining

Text can be underlined using the .ul request. The .ul request causes the next input line to be underlined when output. You can underline multiple lines by stating a count of input lines to underline, followed by those lines (as with the .ce request). For example, the input:

    .ul 2
    Notice that these two input lines
    are underlined.

will underline those eight words in NROFF. (In TROFF they will be set in italics.)

3. Displays

Displays are sections of text to be set off from the body of the paper. Major quotes, tables, and figures are types of displays, as are all the examples used in this document. All displays except centered blocks are output single spaced.

3.1. Major Quotes

Major quotes are quotes which are several lines long, and hence are set in from the rest of the text without quote marks around them. These can be generated using the commands .(q and .)q to surround the quote. For example, the input:

    As Weizenbaum points out:
    .(q
    It is said that to explain is to explain away.
    This maxim is nowhere so well fulfilled
    as in the areas of computer programming,...
    .)q

generates as output:

    As Weizenbaum points out:

         It is said that to explain is to explain away. This maxim is
         nowhere so well fulfilled as in the areas of computer
         programming,...

3.2. Lists

A list is an indented, single spaced, unfilled display. Lists should be used when the material to be printed should not be filled and justified like normal text, such as columns of figures or the examples used in this paper. Lists are surrounded by the requests .(l and .)l. For example, typing:

    Alternatives to avoid deadlock are:
    .(l
    Lock in a specified order
    Detect deadlock and back out one process
    Lock all resources needed before proceeding
    .)l

will produce:

    Alternatives to avoid deadlock are:
         Lock in a specified order
         Detect deadlock and back out one process
         Lock all resources needed before proceeding

3.3. Keeps

A keep is a display of lines which are kept on a single page if possible. An example of where you would use a keep might be a diagram. Keeps differ from lists in that lists may be broken over a page boundary whereas keeps will not.

Blocks are the basic kind of keep. They begin with the request .(b and end with the request .)b. If there is not room on the current page for everything in the block, a new page is begun. This has the unpleasant effect of leaving blank space at the bottom of the page. When this is not appropriate, you can use the alternative, called floating keeps.

Floating keeps move relative to the text. Hence, they are good for things which will be referred to by name, such as "See figure 3". A floating keep will appear at the bottom of the current page if it will fit; otherwise, it will appear at the top of the next page. Floating keeps begin with the line .(z and end with the line .)z. For an example of a floating keep, see figure 1. The .hl request is used to draw a horizontal line so that the figure stands out from the text.

3.4. Fancier Displays

Keeps and lists are normally collected in nofill mode, so that they are good for tables and such. If you want a display in fill mode (for text), type .(l F. (Throughout this section, comments applied to .(l also apply to .(b and .(z.)
This kind of display will be indented from both margins. For example, the input:

    .(l F
    And now boys and girls,
    a newer, bigger, better toy than ever before!
    Be the first on your block to have your own computer!
    Yes kids, you too can have one of these modern
    data processing devices.
    You too can produce beautifully formatted papers
    without even batting an eye!
    .)l

will be output as:

         And now boys and girls, a newer, bigger, better toy than
         ever before! Be the first on your block to have your own
         computer! Yes kids, you too can have one of these modern
         data processing devices. You too can produce beautifully
         formatted papers without even batting an eye!

    ------------------------------------------------------------
    .(z
    .hl
    Text of keep to be floated.
    .sp
    .ce
    Figure 1. Example of a Floating Keep.
    .hl
    .)z

               Figure 1. Example of a Floating Keep.
    ------------------------------------------------------------

Lists and blocks are also normally indented (floating keeps are normally left justified). To get a left-justified list, type .(l L. To get a list centered line-for-line, type .(l C. For example, to get a filled, left justified list, enter:

    .(l L F
    text of block
    .)l

The input:

    .(l
    first line of unfilled display
    more lines
    .)l

produces the indented text:

         first line of unfilled display
         more lines

Typing the character L after the .(l request produces the left justified result:

first line of unfilled display
more lines

Using C instead of L produces the line-at-a-time centered output:

                 first line of unfilled display
                           more lines

Sometimes it may be that you want to center several lines as a group, rather than centering them one line at a time. To do this use centered blocks, which are surrounded by the requests .(c and .)c. All the lines are centered as a unit, such that the longest line is centered and the rest are lined up around that line. Notice that lines do not move relative to each other using centered blocks, whereas they do using the C argument to keeps.

Centered blocks are not keeps, and may be used in conjunction with keeps.
For example, to center a group of lines as a unit and keep them on one page, use:

    .(b L
    .(c
    first line of unfilled display
    more lines
    .)c
    .)b

to produce:

                 first line of unfilled display
                 more lines

If the block requests (.(b and .)b) had been omitted the result would have been the same, but with no guarantee that the lines of the centered block would have all been on one page. Note the use of the L argument to .(b; this causes the centered block to center within the entire line rather than within the line minus the indent. Also, the center requests must be nested inside the keep requests.

4. Annotations

There are a number of requests to save text for later printing. Footnotes are printed at the bottom of the current page. Delayed text is intended to be a variant form of footnote; the text is printed only when explicitly called for, such as at the end of each chapter. Indexes are a type of delayed text having a tag (usually the page number) attached to each entry after a row of dots. Indexes are also saved until called for explicitly.

4.1. Footnotes

Footnotes begin with the request .(f and end with the request .)f. The current footnote number is maintained automatically, and can be used by typing \**, to produce a footnote number¹. The number is automatically incremented after every footnote. For example, the input:

    .(q
    A man who is not upright
    and at the same time is presumptuous;
    one who is not diligent and at the same time is ignorant;
    one who is untruthful and at the same time is incompetent;
    such men I do not count among acquaintances.\**
    .(f
    \**James R. Ware,
    .ul
    The Best of Confucius,
    Halcyon House, 1950.
    Page 77.
    .)f
    .)q

generates the result:

         A man who is not upright and at the same time is presumptuous;
         one who is not diligent and at the same time is ignorant; one
         who is untruthful and at the same time is incompetent; such
         men I do not count among acquaintances.²

It is important that the footnote appears inside the quote, so that you can be sure that the footnote will appear on the same page as the quote.

4.2. Delayed Text

Delayed text is very similar to a footnote except that it is printed when called for explicitly. This allows a list of references to appear (for example) at the end of each chapter, as is the convention in some disciplines. Use \*# on delayed text instead of \** as on footnotes.

If you are using delayed text as your standard reference mechanism, you can still use footnotes, except that you may want to reference them with special characters* rather than numbers.

¹Like this.
²James R. Ware, The Best of Confucius, Halcyon House, 1950. Page 77.
*Such as an asterisk.

4.3. Indexes

An "index" (actually more like a table of contents, since the entries are not sorted alphabetically) resembles delayed text, in that it is saved until called for. However, each entry has the page number (or some other tag) appended to the last line of the index entry after a row of dots.

Index entries begin with the request .(x and end with .)x. The .)x request may have an argument, which is the value to print as the "page number". It defaults to the current page number. If the page number given is an underscore ("_") no page number or line of dots is printed at all. To get the line of dots without a page number, type .)x "", which specifies an explicitly null page number.

The .xp request prints the index.
For example, the input:

    .(x
    Sealing wax
    .)x
    .(x
    Cabbages and kings
    .)x _
    .(x
    Why the sea is boiling hot
    .)x 2.5a
    .(x
    Whether pigs have wings
    .)x ""
    .(x
    This is a terribly long index entry, such as might be used
    for a list of illustrations, tables, or figures; I expect it to
    take at least two lines.
    .)x
    .xp

generates:

    Sealing wax .................................................. 29
    Cabbages and kings
    Why the sea is boiling hot ................................. 2.5a
    Whether pigs have wings ......................................
    This is a terribly long index entry, such as might be used
         for a list of illustrations, tables, or figures; I expect
         it to take at least two lines. .......................... 29

The .(x request may have a single character argument, specifying the "name" of the index; the normal index is x. Thus, several "indices" may be maintained simultaneously (such as a list of tables, table of contents, etc.).

Notice that the index must be printed at the end of the paper, rather than at the beginning where it will probably appear (as a table of contents); the pages may have to be physically rearranged after printing.

5. Fancier Features

A large number of fancier requests exist, notably requests to provide other sorts of paragraphs, numbered sections of the form 1.2.3 (such as used in this document), and multicolumn output.

5.1. More Paragraphs

Paragraphs generally start with a blank line and with the first line indented. It is possible to get left-justified block-style paragraphs by using .lp instead of .pp, as demonstrated by the next paragraph.

Sometimes you want to use paragraphs that have the body indented, and the first line exdented (opposite of indented) with a label. This can be done with the .ip request.
A word specified on the same line as .ip is printed in the margin, and the body is lined up at a prespecified position (normally five spaces). For example, the input:

    .ip one
    This is the first paragraph.
    Notice how the first line of the resulting paragraph lines up
    with the other lines in the paragraph.
    .ip two
    And here we are at the second paragraph already.
    You may notice that the argument to .ip
    appears in the margin.
    .lp
    We can continue text...

produces as output:

    one  This is the first paragraph. Notice how the first line of
         the resulting paragraph lines up with the other lines in
         the paragraph.

    two  And here we are at the second paragraph already. You may
         notice that the argument to .ip appears in the margin.

    We can continue text without starting a new indented paragraph by
    using the .lp request.

If you have spaces in the label of a .ip request, you must use an "unpaddable space" instead of a regular space. This is typed as a backslash character ("\") followed by a space. For example, to print the label "Part 1", enter:

    .ip "Part\ 1"

If a label of an indented paragraph (that is, the argument to .ip) is longer than the space allocated for the label, .ip will begin a new line after the label. For example, the input:

    .ip longlabel
    This paragraph had a long label.
    The first character of text on the first line
    will not line up with the text on second and subsequent lines,
    although they will line up with each other.

will produce:

    longlabel
         This paragraph had a long label. The first character of text
         on the first line will not line up with the text on second
         and subsequent lines, although they will line up with each
         other.

It is possible to change the size of the label by using a second argument which is the size of the label. For example, the above example could be done correctly by saying:

    .ip longlabel 10

which will make the paragraph indent 10 spaces for this paragraph only. If you have many paragraphs to indent all the same amount, use the number register ii. For example, to leave one inch of space for the label, type:

    .nr ii 1i

somewhere before the first call to .ip. Refer to the reference manual for more information.

If .ip is used with no argument at all no hanging tag will be printed. For example, the input:

    .ip [a]
    This is the first paragraph of the example.
    We have seen this sort of example before.
    .ip
    This paragraph is lined up with the previous paragraph,
    but it has no tag in the margin.

produces as output:

    [a]  This is the first paragraph of the example. We have seen
         this sort of example before.

         This paragraph is lined up with the previous paragraph, but
         it has no tag in the margin.

A special case of .ip is .np, which automatically numbers paragraphs sequentially from 1. The numbering is reset at the next .pp, .lp, or .sh (to be described in the next section) request. For example, the input:

    .np
    This is the first point.
    .np
    This is the second point.
    Points are just regular paragraphs
    which are given sequence numbers automatically
    by the .np request.
    .pp
    This paragraph will reset numbering by .np.
    .np
    For example,
    we have reverted to numbering from one now.

generates:

    (1)  This is the first point.

    (2)  This is the second point. Points are just regular paragraphs
         which are given sequence numbers automatically by the .np
         request.

         This paragraph will reset numbering by .np.

    (1)  For example, we have reverted to numbering from one now.

5.2. Section Headings

Section numbers (such as the ones used in this document) can be automatically generated using the .sh request. You must tell .sh the depth of the section number and a section title. The depth specifies how many numbers are to appear (separated by decimal points) in the section number. For example, the section number 4.2.5 has a depth of three.

Section numbers are incremented in a fairly intuitive fashion. If you add a number (increase the depth), the new number starts out at one.
If you decrease the depth (or keep the same depth), the final number is incremented. For example, the input:

    .sh 1 "The Preprocessor"
    .sh 2 "Basic Concepts"
    .sh 2 "Control Inputs"
    .sh 3
    .sh 3
    .sh 1 "Code Generation"
    .sh 3

produces as output the result:

    1. The Preprocessor
    1.1. Basic Concepts
    1.2. Control Inputs
    1.2.1.
    1.2.2.
    2. Code Generation
    2.1.1.

You can specify the section number to begin by placing the section number after the section title, using spaces instead of dots. For example, the request:

    .sh 3 "Another section" 7 3 4

will begin the section numbered 7.3.4; all subsequent .sh requests will number relative to this number.

There are more complex features which will cause each section to be indented proportionally to the depth of the section. For example, if you enter:

    .nr si N

each section will be indented by an amount N. N must have a scaling factor attached, that is, it must be of the form Nx, where x is a character telling what units N is in. Common values for x are i for inches, c for centimeters, and n for ens (the width of a single character). For example, to indent each section one-half inch, type:

    .nr si 0.5i

After this, sections will be indented by one-half inch per level of depth in the section number. For example, this document was produced using the request

    .nr si 3n

at the beginning of the input file, giving three spaces of indent per section depth.

Section headers without automatically generated numbers can be done using:

    .uh "Title"

which will do a section heading, but will put no number on the section.

5.3. Parts of the Basic Paper

There are some requests which assist in setting up papers. The .tp request initializes for a title page. There are no headers or footers on a title page, and unlike other pages you can space down and leave blank space at the top.
For example, a typical title page might appear as:

    .tp
    .sp 2i
    .(l C
    THE GROWTH OF TOENAILS
    IN UPPER PRIMATES
    .sp
    by
    .sp
    Frank N. Furter
    .)l
    .bp

The request .th sets up the environment of the NROFF processor to do a thesis, using the rules established at Berkeley. It defines the correct headers and footers (a page number in the upper right hand corner only), sets the margins correctly, and double spaces.

The .+c T request can be used to start chapters. Each chapter is automatically numbered from one, and a heading is printed at the top of each chapter with the chapter number and the chapter name T. For example, to begin a chapter called “Conclusions”, use the request:

    .+c "CONCLUSIONS"

which will produce, on a new page, the lines

    CHAPTER 5
    CONCLUSIONS

with appropriate spacing for a thesis. Also, the header is moved to the foot of the page on the first page of a chapter. Although the .+c request was not designed to work only with the .th request, it is tuned for the format acceptable for a PhD thesis at Berkeley.

If the title parameter T is omitted from the .+c request, the result is a chapter with no heading. This can also be used at the beginning of a paper; for example, .+c was used to generate page one of this document.

Although papers traditionally have the abstract, table of contents, and so forth at the front of the paper, it is more convenient to format and print them last when using NROFF. This is so that index entries can be collected and then printed for the table of contents (or whatever). At the end of the paper, issue the .++ P request, which begins the preliminary part of the paper. After issuing this request, the .+c request will begin a preliminary section of the paper. Most notably, this prints the page number restarted from one in lower case Roman numbers.
.+c may be used repeatedly to begin different parts of the front material, for example, the abstract, the table of contents, acknowledgments, list of illustrations, etc. The request .++ B may also be used to begin the bibliographic section at the end of the paper. For example, the paper might appear as outlined in figure 2. (In this figure, comments begin with the sequence \".)

    .th                     \" set for thesis mode
    .fo ''DRAFT''           \" define footer for each page
    .tp                     \" begin title page
    .(l C                   \" center a large block
    THE GROWTH OF TOENAILS
    IN UPPER PRIMATES
    .sp
    by
    .sp
    Frank Furter
    .)l                     \" end centered part
    .+c INTRODUCTION        \" begin chapter named "INTRODUCTION"
    .(x t                   \" make an entry into index `t'
    Introduction
    .)x                     \" end of index entry
    text of chapter one
    .+c "NEXT CHAPTER"      \" begin another chapter
    .(x t                   \" enter into index `t' again
    Next Chapter
    .)x
    text of chapter two
    .+c CONCLUSIONS
    .(x t
    Conclusions
    .)x
    text of chapter three
    .++ B                   \" begin bibliographic information
    .+c BIBLIOGRAPHY        \" begin another `chapter'
    .(x t
    Bibliography
    .)x
    text of bibliography
    .++ P                   \" begin preliminary material
    .+c "TABLE OF CONTENTS"
    .xp t                   \" print index `t' collected above
    .+c PREFACE             \" begin another preliminary section
    text of preface

    Figure 2. Outline of a Sample Paper

5.4. Equations and Tables

Two special UNIX programs exist to format special types of material. Eqn and neqn set equations for the phototypesetter and NROFF respectively. Tbl arranges to print extremely pretty tables in a variety of formats. This document will only describe the embellishments to the standard features; consult the reference manuals for those processors for a description of their use.

The eqn and neqn programs are described fully in the document Typesetting Mathematics - User's Guide by Brian W. Kernighan and Lorinda L. Cherry. Equations are centered, and are kept on one page. They are introduced by the .EQ request and terminated by the .EN request.
The .EQ request may take an equation number as an optional argument, which is printed vertically centered on the right hand side of the equation. If the equation becomes too long it should be split between two lines. To do this, type:

    .EQ (eq 34)
    text of equation 34
    .EN C
    .EQ
    continuation of equation 34
    .EN

The C on the .EN request specifies that the equation will be continued.

The tbl program produces tables. It is fully described (including numerous examples) in the document Tbl - A Program to Format Tables by M. E. Lesk. Tables begin with the .TS request and end with the .TE request. Tables are normally kept on a single page. If you have a table which is too big to fit on a single page, so that you know it will extend to several pages, begin the table with the request .TS H and put the request .TH after the part of the table which you want duplicated at the top of every page that the table is printed on. For example, a table definition for a long table might look like:

    .TS H
    c s s
    n n n.
    THE TABLE TITLE
    .TH
    text of the table
    .TE

5.5. Two Column Output

You can get two column output automatically by using the request .2c. This causes everything after it to be output in two-column form. The request .bc will start a new column; it differs from .bp in that .bp may leave a totally blank column when it starts a new page. To revert to single column output, use .1c.

5.6. Defining Macros

A macro is a collection of requests and text which may be used by stating a simple request. Macros begin with the line .de xx (where xx is the name of the macro to be defined) and end with the line consisting of two dots. After defining the macro, stating the line .xx is the same as stating all the other lines. For example, to define a macro that spaces 3 lines and then centers the next input line, enter:

    .de SS
    .sp 3
    .ce
    ..

and use it by typing:

    .SS
    Title Line
    (beginning of text)

Macro names may be one or two characters.
In order to avoid conflicts with names in -me, always use upper case letters as names. The only names to avoid are TS, TH, TE, EQ, and EN.

5.7. Annotations Inside Keeps

Sometimes you may want to put a footnote or index entry inside a keep. For example, if you want to maintain a “list of figures” you will want to do something like:

    .(z
    .(c
    text of figure
    .)c
    .ce
    Figure 5.
    .(x f
    Figure 5
    .)x
    .)z

which you may hope will give you a figure with a label and an entry in the index f (presumably a list of figures index). Unfortunately, the index entry is read and interpreted when the keep is read, not when it is printed, so the page number in the index is likely to be wrong. The solution is to use the magic string \! at the beginning of all the lines dealing with the index. In other words, you should use:

    .(z
    .(c
    Text of figure
    .)c
    .ce
    Figure 5.
    \!.(x f
    \!Figure 5
    \!.)x
    .)z

which will defer the processing of the index until the figure is output. This will guarantee that the page number in the index is correct. The same comments apply to blocks (with .(b and .)b) as well.

6. TROFF and the Photosetter

With a little care, you can prepare documents that will print nicely on either a regular terminal or when phototypeset using the TROFF formatting program.

6.1. Fonts

A font is a style of type. There are three fonts that are available simultaneously, Times Roman, Times Italic, and Times Bold, plus the special math font. The normal font is Roman. Text which would be underlined in NROFF with the .ul request is set in italics in TROFF.

There are ways of switching between fonts. The requests .r, .i, and .b switch to Roman, italic, and bold fonts respectively. You can set a single word in some font by typing (for example):

    .i word

which will set word in italics but does not affect the surrounding text. In NROFF, italic and bold text is underlined.
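As a small illustration of the two ways of using the font requests, the following input is a sketch only: the sentence text is invented for the example, and it assumes nothing beyond the .pp, .i, .b, and .r requests described in this document. A font request with an argument sets just that word; the same request with no argument switches fonts until another font request is given.

```troff
.pp
The word
.i emphasis
is set in italics, while
.b warning
is set in bold.
.\" with no argument, .i switches all following text to italic
.i
From here on the text is italic
.\" with no argument, .r switches back to roman
.r
until the roman font is requested again.
```

Run through NROFF, the italic and bold words would be underlined rather than set in a different typeface.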
Notice that if you are setting more than one word in whatever font, you must surround those words with double quote marks (‘"’) so that they will appear to the NROFF processor as a single word. The quote marks will not appear in the formatted text. If you do want a quote mark to appear, you should quote the entire string (even if a single word), and use two quote marks where you want one to appear. For example, if you want to produce the text:

    "Master Control"

in italics, you must type:

    .i """Master Control\|"""

The \| produces a very narrow space so that the “l” does not overlap the quote sign in TROFF, like this:

    "Master Control"

There are also several “pseudo-fonts” available. The input:

    .(b
    .u underlined
    .bi "bold italics"
    .bx "words in a box"
    .)b

generates

    underlined
    bold italics
    words in a box

In NROFF these all just underline the text. Notice that pseudo font requests set only the single parameter in the pseudo font; ordinary font requests will begin setting all text in the special font if you do not provide a parameter. No more than one word should appear with these three font requests in the middle of lines. This is because of the way TROFF justifies text. For example, if you were to issue the requests:

    .bi "some bold italics"

and

    .bx "words in a box"

in the middle of a line TROFF would produce [typeset sample: the bold italic words unevenly spaced and overstruck, and the box misaligned with its words], which I think you will agree does not look good.

The second parameter of all font requests is set in the original font. For example, the font request:

    .b bold face

generates “bold” in bold font, but sets “face” in the font of the surrounding text, resulting in:

    boldface.

To set the two words bold and face both in bold face, type:

    .b "bold face"

You can mix fonts in a word by using the special sequence \c at the end of a line to indicate “continue text processing”; this allows input lines to be joined together without a space in between them.
For example, the input:

    .u under \c
    .i italics

generates underitalics, but if we had typed:

    .u under
    .i italics

the result would have been under italics as two words.

6.2. Point Sizes

The phototypesetter supports different sizes of type, measured in points. The default point size is 10 points for most text, 8 points for footnotes. To change the pointsize, type:

    .sz +N

where N is the size wanted in points. The vertical spacing (the distance between the baselines, that is, the bottoms of most letters, of adjacent lines) is set to be proportional to the type size.

Warning: changing point sizes on the phototypesetter is a slow mechanical operation. Size changes should be considered carefully.

6.3. Quotes

It is conventional when using the typesetter to use pairs of grave and acute accents to generate double quotes, rather than the double quote character (‘"’). This is because it looks better to use grave and acute accents; for example, compare "quote" to “quote”.

In order to make quotes compatible between the typesetter and terminals, you may use the sequences \*(lq and \*(rq to stand for the left and right quote respectively. These both appear as " on most terminals, but are typeset as “ and ” respectively. For example, use:

    \*(lqSome things aren't true even if they did happen.\*(rq

to generate the result:

    “Some things aren't true even if they did happen.”

As a shorthand, the special font request:

    .q "quoted text"

will generate “quoted text”. Notice that you must surround the material to be quoted with double quote marks if it is more than one word.

Acknowledgments

I would like to thank Bob Epstein, Bill Joy, and Larry Rowe for having the courage to use the -me macros to produce non-trivial papers during the development stages; Ricki Blau, Pamela Humphrey, and Jim Joyce for their help with the documentation phase; and the plethora of people who have contributed ideas and have given support for the project.
-ME REFERENCE MANUAL

Release 1.1/25

Eric P. Allman
Electronics Research Laboratory
University of California, Berkeley
Berkeley, California 94720

This document describes in extremely terse form the features of the -me macro package for version seven NROFF/TROFF. Some familiarity is assumed with those programs; specifically, the reader should understand breaks, fonts, pointsizes, the use and definition of number registers and strings, how to define macros, and scaling factors for ens, points, v's (vertical line spaces), etc.

For a more casual introduction to text processing using NROFF, refer to the document Writing Papers with NROFF using -me.

There are a number of macro parameters that may be adjusted. Fonts may be set to a font number only. In NROFF font 8 is underlined, and is set in bold font in TROFF (although font 3, bold in TROFF, is not underlined in NROFF). Font 0 is no font change; the font of the surrounding text is used instead. Notice that fonts 0 and 8 are “pseudo-fonts”; that is, they are simulated by the macros. This means that although it is legal to set a font register to zero or eight, it is not legal to use the escape character form, such as:

    \f8

All distances are in basic units, so it is nearly always necessary to use a scaling factor. For example, the request to set the paragraph indent to eight one-en spaces is:

    .nr pi 8n

and not

    .nr pi 8

which would set the paragraph indent to eight basic units, or about 0.02 inch. Default parameter values are given in brackets in the remainder of this document.

Registers and strings of the form $x may be used in expressions but should not be changed. Macros of the form $x perform some function (as described) and may be redefined to change this function. This may be a sensitive operation; look at the body of the original macro before changing it.

All names in -me follow a rigid naming convention.
The user may define number registers, strings, and macros, provided that s/he uses single character upper case names or double character names consisting of letters and digits, with at least one upper case letter. In no case should special characters be used in user-defined names.

On daisy wheel type printers in twelve pitch, the -rx1 flag can be stated to make lines default to one eighth inch (the normal spacing for a newline in twelve-pitch). This is normally too small for easy readability, so the default is to space one sixth inch.

NROFF and TROFF are Trademarks of Bell Laboratories.

1. Paragraphing

These macros are used to begin paragraphs. The standard paragraph macro is .pp; the others are all variants to be used for special purposes.

The first call to one of the paragraphing macros defined in this section or the .sh macro (defined in the next section) initializes the macro processor. After initialization it is not possible to use any of the following requests: .sc, .lo, .th, or .ac. Also, the effects of changing parameters which will have a global effect on the format of the page (notably page length and header and footer margins) are not well defined and should be avoided.

.lp
    Begin left-justified paragraph. Centering and underlining are turned off if they were on, the font is set to \n(pf [1], the type size is set to \n(pp [10p], and a \n(ps space is inserted before the paragraph [0.35v in TROFF, 1v or 0.5v in NROFF depending on device resolution]. The indent is reset to \n($i [0] plus \n(po [0] unless the paragraph is inside a display (see .ba). At least the first two lines of the paragraph are kept together on a page.

.pp
    Like .lp, except that it puts \n(pi [6n] units of indent. This is the standard paragraph macro.

.ip T I
    Indented paragraph with hanging tag. The body of the following paragraph is indented I spaces (or \n(ii [5n] spaces if I is not specified) more than a non-indented paragraph (such as with .pp) is.
    The title T is exdented (opposite of indented). The result is a paragraph with an even left edge and T printed in the margin. Any spaces in T must be unpaddable. If T will not fit in the space provided, .ip will start a new line.

.np
    A variant of .ip which numbers paragraphs. Numbering is reset after a .lp, .pp, or .sh. The current paragraph number is in \n($p.

2. Section Headings

Numbered sections are similar to paragraphs except that a section number is automatically generated for each one. The section numbers are of the form 1.2.3. The depth of the section is the count of numbers (separated by decimal points) in the section number.

Unnumbered section headings are similar, except that no number is attached to the heading.

.sh +N T a b c d e f
    Begin numbered section of depth N. If N is missing the current depth (maintained in the number register \n($0) is used. The values of the individual parts of the section number are maintained in \n($1 through \n($6. There is a \n(ss [1v] space before the section. T is printed as a section title in font \n(sf [8] and size \n(sp [10p]. The “name” of the section may be accessed via \*($n. If \n(si is non-zero, the base indent is set to \n(si times the section depth, and the section title is exdented. (See .ba.) Also, an additional indent of \n(so [0] is added to the section title (but not to the body of the section). The font is then set to the paragraph font, so that more information may occur on the line with the section number and title. .sh ensures that there is enough room to print the section head plus the beginning of a paragraph (about 3 lines total). If a through f are specified, the section number is set to that number rather than incremented automatically. If any of a through f are a hyphen that number is not reset. If T is a single underscore (“_”) then the section depth and numbering is reset, but the base indent is not reset and nothing is printed out.
    This is useful to automatically coordinate section numbers with chapter numbers.

.sx +N
    Go to section depth N [-1], but do not print the number and title, and do not increment the section number at level N. This has the effect of starting a new paragraph at level N.

.uh T
    Unnumbered section heading. The title T is printed with the same rules for spacing, font, etc., as for .sh.

.$p T B N
    Print section heading. May be redefined to get fancier headings. T is the title passed on the .sh or .uh line; B is the section number for this section, and N is the depth of this section. These parameters are not always present; in particular, .sh passes all three, .uh passes only the first, and .sx passes three, but the first two are null strings. Care should be taken if this macro is redefined; it is quite complex and subtle.

.$0 T B N
    This macro is called automatically after every call to .$p. It is normally undefined, but may be used to automatically put every section title into the table of contents or for some similar function. T is the section title for the section title which was just printed, B is the section number, and N is the section depth.

.$1 - .$6
    Traps called just before printing that depth section. May be defined to (for example) give variable spacing before sections. These macros are called from .$p, so if you redefine that macro you may lose this feature.

3. Headers and Footers

Headers and footers are put at the top and bottom of every page automatically. They are set in font \n(tf [3] and size \n(tp [10p]. Each of the definitions applies as of the next page. Three-part titles must be quoted if there are two blanks adjacent anywhere in the title or more than eight blanks total.

The spacing of headers and footers is controlled by four number registers.
\n(hm [4v] is the distance from the top of the page to the top of the header, \n(fm [3v] is the distance from the bottom of the page to the bottom of the footer, \n(tm [7v] is the distance from the top of the page to the top of the text, and \n(bm [6v] is the distance from the bottom of the page to the bottom of the text (nominal). The macros .m1, .m2, .m3, and .m4 are also supplied for compatibility with ROFF documents.

.he 'l'm'r'
    Define three-part header, to be printed on the top of every page.

.fo 'l'm'r'
    Define footer, to be printed at the bottom of every page.

.eh 'l'm'r'
    Define header, to be printed at the top of every even-numbered page.

.oh 'l'm'r'
    Define header, to be printed at the top of every odd-numbered page.

.ef 'l'm'r'
    Define footer, to be printed at the bottom of every even-numbered page.

.of 'l'm'r'
    Define footer, to be printed at the bottom of every odd-numbered page.

.hx
    Suppress headers and footers on the next page.

.m1 +N
    Set the space between the top of the page and the header [4v].

.m2 +N
    Set the space between the header and the first line of text [2v].

.m3 +N
    Set the space between the bottom of the text and the footer [2v].

.m4 +N
    Set the space between the footer and the bottom of the page [4v].

.ep
    End this page, but do not begin the next page. Useful for forcing out footnotes, but other than that hardly ever used. Must be followed by a .bp or the end of input.

.$h
    Called at every page to print the header. May be redefined to provide fancy (e.g., multi-line) headers, but doing so loses the function of the .he, .fo, .eh, .oh, .ef, and .of requests, as well as the chapter-style title feature of .+c.

.$f
    Print footer; same comments apply as in .$h.

.$H
    A normally undefined macro which is called at the top of each page (after outputting the header, initial saved floating keeps, etc.); in other words, this macro is called immediately before printing text on a page. It can be used for column headings and the like.

4. Displays

All displays except centered blocks and block quotes are preceded and followed by an extra \n(bs [same as \n(ps] space. Quote spacing is stored in a separate register; centered blocks have no default initial or trailing space. The vertical spacing of all displays except quotes and centered blocks is stored in register \n($R instead of \n($r.

.(l m f
    Begin list. Lists are single spaced, unfilled text. If f is F, the list will be filled. If m [I] is I the list is indented by \n(bi [4n]; if M the list is indented to the left margin; if L the list is left justified with respect to the text (different from M only if the base indent (stored in \n($i and set with .ba) is not zero); and if C the list is centered on a line-by-line basis. The list is set in font \n(df [0]. Must be matched by a .)l. This macro is almost like .(b except that no attempt is made to keep the display on one page.

.)l
    End list.

.(q
    Begin major quote. These are single spaced, filled, moved in from the text on both sides by \n(qi [4n], preceded and followed by \n(qs [same as \n(bs] space, and are set in point size \n(qp [one point smaller than surrounding text].

.)q
    End major quote.

.(b m f
    Begin block. Blocks are a form of keep, where the text of a keep is kept together on one page if possible (keeps are useful for tables and figures which should not be broken over a page). If the block will not fit on the current page a new page is begun, unless that would leave more than \n(bt [0] white space at the bottom of the text. If \n(bt is zero, the threshold feature is turned off. Blocks are not filled unless f is F, when they are filled. The block will be left-justified if m is L, indented by \n(bi [4n] if m is I or absent, centered (line-for-line) if m is C, and left justified to the margin (not to the base indent) if m is M. The block is set in font \n(df [0].

.)b
    End block.

.(z m f
    Begin floating keep. Like .(b except that the keep is floated to the bottom of the page or the top of the next page.
    Therefore, its position relative to the text changes. The floating keep is preceded and followed by \n(zs [1v] space. Also, it defaults to mode M.

.)z
    End floating keep.

.(c
    Begin centered block. The next keep is centered as a block, rather than on a line-by-line basis as with .(b C. This call may be nested inside keeps.

.)c
    End centered block.

5. Annotations

.(d
    Begin delayed text. Everything in the next keep is saved for output later with .pd, in a manner similar to footnotes.

.)d n
    End delayed text. The delayed text number register \n($d and the associated string \*# are incremented if \*# has been referenced.

.pd
    Print delayed text. Everything diverted via .(d is printed and truncated. This might be used at the end of each chapter.

.(f
    Begin footnote. The text of the footnote is floated to the bottom of the page and set in font \n(ff [1] and size \n(fp [8p]. Each entry is preceded by \n(fs [0.2v] space, is indented \n(fi [3n] on the first line, and is indented \n(fu [0] from the right margin. Footnotes line up underneath two columned output. If the text of the footnote will not all fit on one page it will be carried over to the next page.

.)f n
    End footnote. The number register \n($f and the associated string \** are incremented if they have been referenced.

.$s
    The macro to output the footnote separator. This macro may be redefined to give other size lines or other types of separators. Currently it draws a 1.5i line.

.(x x
    Begin index entry. Index entries are saved in the index x [x] until called up with .xp. Each entry is preceded by a \n(xs [0.2v] space. Each entry is “undented” by \n(xu [0.5i]; this register tells how far the page number extends into the right margin.

.)x P A
    End index entry. The index entry is finished with a row of dots with A [null] right justified on the last line (such as for an author's name), followed by P [\n%]. If A is specified, P must be specified; \n% can be used to print the current page number.
    If P is an underscore, no page number and no row of dots are printed.

.xp x
    Print index x [x]. The index is formatted in the font, size, and so forth in effect at the time it is printed, rather than at the time it is collected.

6. Columned Output

.2c +S N
    Enter two-column mode. The column separation is set to +S [4n, 0.5i in ACM mode] (saved in \n($s). The column width, calculated to fill the single column line length with both columns, is stored in \n($l. The current column is in \n($c. You can test register \n($m [1] to see if you are in single column or double column mode. Actually, the request enters N [2] columned output.

.1c
    Revert to single-column mode.

.bc
    Begin column. This is like .bp except that it begins a new column on a new page only if necessary, rather than forcing a whole new page if there is another column left on the current page.

7. Fonts and Sizes

.sz +P
    The pointsize is set to P [10p], and the line spacing is set proportionally. The ratio of line spacing to pointsize is stored in \n($r. The ratio used internally by displays and annotations is stored in \n($R (although this is not used by .sz).

.r W X
    Set W in roman font, appending X in the previous font. To append different font requests, use X = \c. If no parameters, change to roman font.

.i W X
    Set W in italics, appending X in the previous font. If no parameters, change to italic font. Underlines in NROFF.

.b W X
    Set W in bold font and append X in the previous font. If no parameters, switch to bold font. In NROFF, underlines.

.rb W X
    Set W in bold font and append X in the previous font. If no parameters, switch to bold font. .rb differs from .b in that .rb does not underline in NROFF.

.u W X
    Underline W and append X. This is a true underlining, as opposed to the .ul request, which changes to “underline font” (usually italics in TROFF). It won't work right if W is spread or broken (including hyphenated). In other words, it is safe in nofill mode only.

.q W X
    Quote W and append X. In NROFF this just surrounds W with double quote marks (‘"’), but in TROFF uses directed quotes.

.bi W X
    Set W in bold italics and append X. Actually, sets W in italic and overstrikes once. Underlines in NROFF. It won't work right if W is spread or broken (including hyphenated). In other words, it is safe in nofill mode only.

.bx W X
    Sets W in a box, with X appended. Underlines in NROFF. It won't work right if W is spread or broken (including hyphenated). In other words, it is safe in nofill mode only.

8. Roff Support

.ix +N
    Indent, no break. Equivalent to 'in N.

.bl N
    Leave N contiguous white space, on the next page if not enough room on this page. Equivalent to a .sp N inside a block.

.pa +N
    Equivalent to .bp.

.ro
    Set page number in roman numerals. Equivalent to .af % i.

.ar
    Set page number in arabic. Equivalent to .af % 1.

.n1
    Number lines in margin from one on each page.

.n2 N
    Number lines from N, stop if N = 0.

.sk
    Leave the next output page blank, except for headers and footers. This is used to leave space for a full-page diagram which is produced externally and pasted in later. To get a partial-page paste-in display, say .sv N, where N is the amount of space to leave; this space will be output immediately if there is room, and will otherwise be output at the top of the next page. However, be warned: if N is greater than the amount of available space on an empty page, no space will ever be output.

9. Preprocessor Support

.EQ m T
    Begin equation. The equation is centered if m is C or omitted, indented \n(bi [4n] if m is I, and left justified if m is L. T is a title printed on the right margin next to the equation. See Typesetting Mathematics - User's Guide by Brian W. Kernighan and Lorinda L. Cherry.

.EN c
    End equation. If c is C the equation must be continued by immediately following with another .EQ, the text of which can be centered along with this one.
    Otherwise, the equation is printed, always on one page, with \n(es [0.5v in TROFF, 1v in NROFF] space above and below it.

.TS h
    Table start. Tables are single spaced and kept on one page if possible. If you have a large table which will not fit on one page, use h = H and follow the header part (to be printed on every page of the table) with a .TH. See Tbl - A Program to Format Tables by M. E. Lesk.

.TH
    With .TS H, ends the header portion of the table.

.TE
    Table end. Note that this table does not float; in fact, it is not even guaranteed to stay on one page if you use requests such as .sp intermixed with the text of the table. If you want it to float (or if you use requests inside the table), surround the entire table (including the .TS and .TE requests) with the requests .(z and .)z.

10. Miscellaneous

.re
    Reset tabs. Set to every 0.5i in TROFF and every 0.8i in NROFF.

.ba +N
    Set the base indent to +N [0] (saved in \n($i). All paragraphs, sections, and displays come out indented by this amount. Titles and footnotes are unaffected. The .sh request performs a .ba request if \n(si [0] is not zero, and sets the base indent to \n(si*\n($0.

.xl +N
    Set the line length to N [6.0i]. This differs from .ll because it only affects the current environment.

.ll +N
    Set line length in all environments to N [6.0i]. This should not be used after output has begun, and particularly not in two-columned output. The current line length is stored in \n($l.

.hl
    Draws a horizontal line the length of the page. This is useful inside floating keeps to differentiate between the text and the figure.

.lo
    This macro loads another set of macros (in /usr/lib/me/local.me) which is intended to be a set of locally defined macros. These macros should all be of the form .*X, where X is any letter (upper or lower case) or digit.

11. Standard Papers

.tp
    Begin title page. Spacing at the top of the page can occur, and headers and footers are suppressed. Also, the page number is not incremented for this page.
.th Set thesis mode. This defines the modes acceptable for a doctoral dissertation at Berkeley. It double spaces, defines the header to be a single page number, and changes the margins to be 1.5 inch on the left and one inch on the top. .++ and .+c¢ should be used with it. This macro must be stated before initialization, that is, before the first call 5-46 -me Reference Manual of a paragraphing macro or .sh. A+ mH This request defines the section of the paper which we are entering. The section type is defined by m. C means that we are entering the chapter portion of the paper, A means that we are entering the appendix portion of the paper, P means that the material following should be the preliminary portion (abstract, table of contents, etc.) portion of the paper, AB means that we are entering the abstract (numbered independently from 1 in Arabic numerals), and B means that we are entering the bibliographic portion at the end of the paper. Also, the variants RC and RA are allowed, which specify renumbering of pages from one at the beginning of each chapter or appendix, respectively. The H parameter defines the new header. the entire header must be quoted. If there are any spaces in it, If you want the header to have the chapter number in it, Use the string \\\\n(ch. For example, to number appendixes A.1 etc, type .++ RA “\\\\n(ch.%". Each section (chapter, appendix, etc.) should be preceeded by the .+e¢ request. It should be mentioned that it is easier when using TROFF to put the front material at the end of the paper, so that the table of contents can be collected and output; this material can then be physically moved to the beginning of the paper. Ac T Begin chapter with title T. \n(ch. The chapter number is maintained in This register is incremented every time .+c is called with a parameter. The title and chapter number are printed by .$¢. header is moved to the footer on the first page of each chapter. 
The If T is omitted, .$¢ is not called; this is useful for doing your own “title page” at the beginning of papers without a title page proper. .$c calls .$C as a hook so that chapter titles can be inserted into a table of contents automatically. The footnote numbering is reset to one. $e T Print chapter number (from \n(ch) and 7. This macro can be redefined to your liking. It is defined by default to be acceptable for a PhD thesis at Berkeley. This macro calls $C, which can be defined to make index entries, or whatever. SCKNT This macro is called by .$c. It is normally undefined, but can be used to automatically insert index entries, or whatever. K is a keyword, either “Chapter” or “Appendix” (depending on the .++ mode); N is the chapter or appendix number, and T is the chapter or appendix title. ac AN This macro (short for .acm) sets up the NROFF environment for photo-ready papers as used by the ACM. and has no headers or footers. This format is 25% larger, The author’s name A is printed at the bottom of the page (but off the part which will be printed in the conference proceedings), together with the current page number and the total number of pages N. Additionally, this macro loads the file /usr/lib/me/acm.me, which may later be augmented with other macros useful for printing papers for ACM conferences. It should be noted that this macro will not work correctly in TROFF, since it sets the page length wider than the physical width of the phototypesetter roll. -me Reference Manual 5-47 12. Predefined Strings Footnote number, actually \*[\n($f\*]. ek This macro is incremented after each call to .)f. \4 Delayed text number. Actually [\n($d]. Superscript. | This string gives upward movement and a change to a smaller point size if possible, otherwise it gives the left bracket character (‘[’). Extra space is left above the line to allow room for the super- V] script. | Unsuperscript. Inverse to \¥[. 
For example, to produce a superscript you might type x\¥[2\*], which will produce x. Subscript. Defaults to ‘<’ if half-carriage motion not possible. Extra space is left below the line to allow for the subscript. \¥> Inverse to \*<. \(dw The day of the week, as a word. V¥(mo The month, as a word. \*(td Today’s date, directly printable. The date is of the form April 8, 1984. Other forms of the date can be used by using \n(dy (the day of the month; for example, 8), \*(mo (as noted above) or \n(mo (the same, but as an ordinal number; for example, April is 4), and \n(yr (the last two digits of the current year). \*(1q Left quote marks. V¥ (rq Right quote. \*..... % em dash in TROFF; two hyphens in NROFF. 13. Double quote in NROFF. Special Characters and Marks There are a number of special characters and diacritical marks (such as accents) avail- able through —me. To reference these characters, you must call the macro .sc to define the characters before using them. Define special characters and diacritical marks, as described in the SC remainder of this section. This macro must be stated before initializa- tion. The special characters available are listed below. Tilde Caret Cedilla Czech Circle There exists For all Example a\*’ e\*" \*\*» \*, \*y n\*" e\*” c\¥, eVtv \*: \* \*(ge \*(qa s O (Grave accent Umlat Usage \** \* u\*: A\*o l>0m<‘0 M®> = Acute accent /{ Name 5-48 -me Reference Manual Acknowledgments I would like to thank Bob Epstein, Bill Joy, and Larry Rowe for having the courage to use the —me macros to produce non-trivial papers during the development stages; Ricki Blau, Pamela Humphrey, and Jim Joyce for their help with the documentation phase; and the plethora of people who have contributed ideas and have given support for the project. Nroff/Troff Users Manual 5-49 NROFF/TROFF User’s Manual Joseph F. Ossanna Bell Laboratories Murray Hill, New Jersey 07974 Introduction NROFF and TROFF are text processors under the PDP-11 UNIX Time-Sharing System! 
that format text for typewriter-like terminals and for a Graphic Systems phototypesetter, respectively. They accept lines of text interspersed with lines of format control information and format the text into a printable, paginated document having a user-designed style. NROFF and TROFF offer unusual freedom in document styling, including: arbitrary style headers and footers; arbitrary style footnotes; multiple automatic sequence numbering for paragraphs, sections, etc; multiple column output; dynamic font and point-size control; arbitrary horizontal and vertical local motions at any point; and a family of automatic overstriking, bracket construction, and line drawing functions.

NROFF and TROFF are highly compatible with each other and it is almost always possible to prepare input acceptable to both. Conditional input is provided that enables the user to embed input expressly destined for either program. NROFF can prepare output directly for a variety of terminal types and is capable of utilizing the full resolution of each terminal.

Usage

The general form of invoking NROFF (or TROFF) at UNIX command level is

    nroff options files    (or troff options files)

where options represents any of a number of option arguments and files represents the list of files containing the document to be formatted. An argument consisting of a single minus (-) is taken to be a file name corresponding to the standard input. If no file names are given input is taken from the standard input. The options, which may appear in any order so long as they appear before the files, are:

Option    Effect

-olist    Print only pages whose page numbers appear in list, which consists of comma-separated numbers and number ranges. A number range has the form N-M and means pages N through M; an initial -N means from the beginning to page N, and a final N- means from N to the end.
-nN       Number first generated page N.
-sN       Stop every N pages. NROFF will halt prior to every N pages (default N=1) to allow paper loading or changing, and will resume upon receipt of a newline. TROFF will stop the phototypesetter every N pages, produce a trailer to allow changing cassettes, and will resume after the phototypesetter START button is pressed.
-mname    Prepends the macro file /usr/lib/tmac.name to the input files.
-raN      Register a (one-character) is set to N.
-i        Read standard input after the input files are exhausted.
-q        Invoke the simultaneous input-output mode of the rd request.

NROFF Only

-Tname    Specifies the name of the output terminal type. Currently defined names are 37 for the (default) Model 37 Teletype®, tn300 for the GE TermiNet 300 (or any terminal without half-line capabilities), 300S for the DASI-300S, 300 for the DASI-300, and 450 for the DASI-450 (Diablo Hyterm).
-e        Produce equally-spaced words in adjusted lines, using full terminal resolution.

TROFF Only

-t        Direct output to the standard output instead of the phototypesetter.
-f        Refrain from feeding out paper and stopping phototypesetter at the end of the run.
-w        Wait until phototypesetter is available, if currently busy.
-b        TROFF will report whether the phototypesetter is busy or available. No text processing is done.
-a        Send a printable (ASCII) approximation of the results to the standard output.
-pN       Print all characters in point size N while retaining all prescribed spacings and motions, to reduce phototypesetter elapsed time.
-g        Prepare output for the Murray Hill Computation Center phototypesetter and direct it to the standard output.

Each option is invoked as a separate argument; for example,

    nroff -o4,8-10 -T300S -mabc file1 file2

requests formatting of pages 4, 8, 9, and 10 of a document contained in the files named file1 and file2, specifies the output terminal as a DASI-300S, and invokes the macro package abc.

Various pre- and post-processors are available for use with NROFF and TROFF.
These include the equation preprocessors NEQN and EQN [2] (for NROFF and TROFF respectively), and the table-construction preprocessor TBL [3]. A reverse-line postprocessor COL [4] is available for multiple-column NROFF output on terminals without reverse-line ability; COL expects the Model 37 Teletype escape sequences that NROFF produces by default. TK [4] is a 37 Teletype simulator postprocessor for printing NROFF output on a Tektronix 4014. TCAT [4] is a phototypesetter-simulator postprocessor for TROFF that produces an approximation of phototypesetter output on a Tektronix 4014. For example, in

    tbl files | eqn | troff -t options | tcat

the first | indicates the piping of TBL's output to EQN's input; the second the piping of EQN's output to TROFF's input; and the third indicates the piping of TROFF's output to TCAT. GCAT [4] can be used to send TROFF (-g) output to the Murray Hill Computation Center.

The remainder of this manual consists of: a Summary and Index; a Reference Manual keyed to the index; and a set of Tutorial Examples. Another tutorial is [5].

Joseph F. Ossanna

References

[1] K. Thompson, D. M. Ritchie, UNIX Programmer's Manual, Sixth Edition (May 1975).
[2] B. W. Kernighan, L. L. Cherry, Typesetting Mathematics - User's Guide (Second Edition), Bell Laboratories internal memorandum.
[3] M. E. Lesk, Tbl - A Program to Format Tables, Bell Laboratories internal memorandum.
[4] Internal on-line documentation, on UNIX.
[5] B. W. Kernighan, A TROFF Tutorial, Bell Laboratories internal memorandum.

SUMMARY AND INDEX

Request         Initial        If No
Form            Value*         Argument    Notes#   Explanation

1. General Explanation

2. Font and Character Size Control

.ps ±N          10 point       previous    E        Point size; also \s±N.†
.ss N           12/36 em       ignored     E        Space-character size set to N/36 em.†
.cs F N M       off            -           P        Constant character space (width) mode (font F).†
.bd F N         off            -           P        Embolden font F by N-1 units.†
.bd S F N       off            -           P        Embolden Special Font when current font is F.†
.ft F           Roman          previous    E        Change to font F = x, xx, or 1-4. Also \fx, \f(xx, \fN.
.fp N F         R,I,B,S        ignored     -        Font named F mounted on physical position 1≤N≤4.

3. Page Control

.pl ±N          11 in          11 in       v        Page length.
.bp ±N          N=1            -           B‡,v     Eject current page; next page number N.
.pn ±N          N=1            ignored     -        Next page number N.
.po ±N          0; 26/27 in    previous    v        Page offset.
.ne N           -              N=1V        D,v      Need N vertical space (V = vertical spacing).
.mk R           none           internal    D        Mark current vertical place in register R.
.rt ±N          none           internal    D,v      Return (upward only) to marked vertical place.

4. Text Filling, Adjusting, and Centering

.br             -              -           B        Break.
.fi             fill           -           B,E      Fill output lines.
.nf             fill           -           B,E      No filling or adjusting of output lines.
.ad c           adj,both       adjust      E        Adjust output lines with mode c.
.na             adjust         -           E        No output line adjusting.
.ce N           off            N=1         B,E      Center following N input text lines.

5. Vertical Spacing

.vs N           1/6in;12pts    previous    E,p      Vertical base line spacing (V).
.ls N           N=1            previous    E        Output N-1 Vs after each text output line.
.sp N           -              N=1V        B,v      Space vertical distance N in either direction.
.sv N           -              N=1V        v        Save vertical distance N.
.os             -              -           -        Output saved vertical distance.
.ns             space          -           D        Turn no-space mode on.
.rs             -              -           D        Restore spacing; turn no-space mode off.

6. Line Length and Indenting

.ll ±N          6.5 in         previous    E,m      Line length.
.in ±N          N=0            previous    B,E,m    Indent.
.ti ±N          -              ignored     B,E,m    Temporary indent.

7. Macros, Strings, Diversion, and Position Traps

.de xx yy       -              .yy=..      -        Define or redefine macro xx; end at call of yy.
.am xx yy       -              .yy=..      -        Append to a macro.
.ds xx string   -              ignored     -        Define a string xx containing string.
.as xx string   -              ignored     -        Append string to string xx.

*Values separated by ";" are for NROFF and TROFF respectively.
#Notes are explained at the end of this Summary and Index.
†No effect in NROFF.
tThe use of ° ° " as controi character (instead of ".") suppresses the break function. 5-52 Nroff/Troff Users Manual Request Initial Form Value If No Argument JIm XX - ignored - Remove request, macro, or string. I XX Py - ignored - Rename request, macro, or string xx to yy. di xx . end D Divert output to macro xx. .da xx - end D Divert and append to xx. wh Vxx - . v Set location trap; negative is w.r.t. page bottom. Notes Explanation .ch x N - - Y Change trap location. dt N xx dt NV xx - off off D.v E Set a diversion trap. .em xx none none - End macro is xx. 8. Set an input-line count trap. Number Registers nr R=NM - u Define and set number register R; auto-increment by M. .af R ¢ arabic - - I R - - « Assign format to register R (c=1,1i, I, a, A). Remove register R. 9. Tabs, Leaders, and Fields ta Nt ... (s 0.8: 0.5in none R none E,m Tab settings; lef? type, unless t=R (right), C(centered). none E Tab repetition character. Jdec . none E Leader repetition character. feabd off off - Set field delimiter a and pad character b. 10. Input and Output Conventions and Character Translations &C C .80 \\ \ - Set escape character. on - - Turn off escape character mechanism. Jdg N -;on on - Ligature mode on if N>0. aul N cu N off off Ne= Ne= E E Continuous underline in NROFF; like ul in TROFF. uf F Italic [talic - Underline font set to F (to be switched to by ul). .ce ¢ . . E Set control character to c. F Yy &-Y- 4 Underline (italicize in TROFF) N input lines. .c2 ¢ ’ ’ E Set nobreak control character to ¢ tr abed none - 0O Translate a to b, etc. on output. 11. Local Horizontal and Vertical Motions, and the Width Function 12. Overstrike, Bracket, Line-drawing, and Zero-width Functions 13. Hyphenation. .nh hyphenate - E hy N hyphenate hyphenate E Hyphenate; N = mode. e ¢ \% \% E Hyphenation indicator character c. ignored - Exception words. Three part title. Page number character. w wordl ... 14. tl “left’ center’ right’ - - % Jt =N 6.5in off previous E.m Length of title. 
Output Line Numbering. am =NMS ] - .an N 16. _ Three Part Titles. .pc ¢ 15. No hyphenation. off N | E E Number mode on or off, set parameters. Do not number next N lines. Conditional Acceptance of Input if ¢ anything If condition ¢ true, accept anything as input, for multi-line use \{anvthing\). Nroff/Troff Users Manual 5-53 If No Initial Value Request Form Notes Explanation u Argument Af 'c anything - Af N anything . If condition c false, accept anything. Af ' N anything Af “stringl’string2’ anything Af ! stringl’ string2’ anything de ¢ anything - u .el anything - If expression N > 0, accept anything. If expression N < 0, accept anything. If stringl identical to string2, accept anything. If stringl not identical to string2, accept anything. If portion of if-else; all above forms (like if). Else portion of if-else. . Environment switched (push down). u - - 17. Environment Switching. .ev N N=() previous 18. Insertions from the Standard Input .rd prompt - prompt =BEL - 2x . - Read insertion. . Exit from NROFF/TROFF. 19. Input/Output File Switching .S0 filename . . .nx filename end-of-file - .pl program - . Switch source file (push down). Next file. Pipe output to program (NROFF only). 20. Miscellaneous mechN - off E.m Set margin character ¢ and separation . Lm string - newline . . - yy=., .pm ¢ - all . Print string on terminal (UNIX standard message output). Ignore till call of yy. Print macro names and sizes; 1 . - B if ¢ present, print only total of sizes. Flush output buffer. Ag yy 21. Output and Error Messages NotesB Request normally causes a break. D Mode or relevant parameters associated with current diversion level. E O Relevant parameters are a part of the current environment. Must stay in effect until logical output. P Mode must be still or again in effect at the time of physical output. v,p.m,u Default scale indicator; if not specified, scale indicators are ignored. 
Alphabetical Request and Section Number Cross Reference ad 4 cc 10 ds 7 fc 9 ie 16 n 6 nh 13 pi 19 tm 7 ta 9 vs § af 8 ce 4 dt 7 i 4 if 16 ls § nml$ pi 3 rr 8 tc 9 wh 7 am 7 ch 7 ec 10 1 20 ig 20 It 14 nn 1§ pm 20 re § 6 as 7 cs 2 el 16 fp 2 in 6 mc 20 nr 8§ pn 3 rt 3 u bd 2 cu 10 em 7 ft 2 it 7 mk 3 ng $§ po 3 so 19 tm 20 14 bp 3 da 7 eo 10 he 13 le 9 na 4 nx 19 ps 2 sp S tr 10 br 4 de 7 ev 17 hw 13 lg 10 ne 3 os S rd 18 ss 2 uf 10 c2 10 di 7 ex 18 hy 13 i 10 nf 4 pc 14 rm 7 sv § ul 10 5-54 Nroff/Troff Users Manual Escape Sequences for Characters, Indicators, and Functions Section Escape Reference Sequence Meaning \\ \ (to prevent or delay the interpretation of \) \' " (acute accent); equivalent to \(aa " (grave accent); equivalent to \(ga 10.1 SR e SN € SR [ \e [ \" — Minus sign in the current font \. Period (dot) (see de) el el [ L3 [ 9 \| \* [, PN \ - \(space) \0 10.6 10.7 7.3 Printable version of the current escape character. \& \! \n \SN Unpaddable space-size space character Digit width space 1/6 em narrow space character (zero width in NROFF) 1/12 em half-narrow space character (zero width in NROFF) Non-printing, zero width character Transparent line indicator Beginning of comment Interpolate argument 1 £.¥<9 13 \% Default optional hyphenation character 2.1 \(xx \ex, \s(xx Character named xx 7.1 9.1 12.3 4.2 11.1 2.2 \a \b“abe...” \¢ [nterpolate string x or xx Non-interpreted leader character Bracket building function Interrupt text processing Forward (down) 1/2em vertical motion (1/2 line in NROFF) \d \f NV Change to font named x or xx, or position N \f \xQe 12.1 \h’' V"’ \kx \l“Ne’ \L'NC' \nx,\n (e \o'abc...” Overstrike characters a, b, ¢, ... 
4.1 \p Break and spread output line 11.1 11.3 12.4 12.4 11.1 \r \sN\s=N \t \u 11.1 \'v'N’ 11.2 \w string’ 5.2 \ X'V’ \z¢ 11.1 2.3 9.1 12.2 16 16 10.7 \{ \}) \(newline) \X Local horizontal motion; move right NV (negative lef) Mark horizontal input place in register x Horizontal line drawing function (optionally with ¢) Vertical line drawing function (optionally with ¢) Interpolate number register x or xx Reverse | em vertical motion (reverse line in NROFF) Point-size change function Non-interpreted horizontal tab Reverse (up) 1/2em vertical motion (1/2 line in NROFF) Local vertical motion; move down N (negative up) Interpolate width of string Extra line-space function (negative before, positive after) Print ¢ with zero width (without spacing) Begin conditional input End conditional input Concealed (ignored) newline X, any character not listed above The escape sequences \\, \.. \", \S$, \«, \a, \n. \t. and \(newline) are interpreted in copy mode (§7.2). Nroff/Troff Users Manual 5-55 Predefined General Number Registers Register Name Description 3 % Current page number. 11.2 7.4 7.4 ct Section Reference dl dn Height (vertical size) of last completed diversion. Current day of the week (1-7). dw 11.3 Character type (set by width function). Width (maximum) of last completed diversion. dy hp Current day of the month (1-31). Current horizontal place on input line. In Output line number. mo Current month (1-12). 4.1 nl Vertical position of last printed text base-line. 11.2 sb 11.2 st 15 yr Depth of string below base line (generated by widrh function). Height of string above base line (generated by width function). Last two digits of current year. Predefined Read-Only Number Registers 7.3 11.1 11.1 5.2 ] S & =~ (s = b= L Lo [9] 7.4 Register Name N<hgeiErmbbbobbhannmadins o Section Reference Description Number of arguments available at the current macro level. Set to 1 in TROFF, if —a option used; always 1 in NROFF. Available horizontal resolution in basic units. 
Set to 1 in NROFF, if =T option used; always 0 in TROFF. Available vertical resolution in basic units. Post-line extra line-space most recently utilized using \x’ N Number of lines read from current input file. Current vertical place in current diversion; equal to nl, if no diversion. Current font as physical quadrant (1-4). Text base-line high-water mark on current page or diversion. Current indent. Current line length. Length of text portion on previous output hne Current page offset. Current page length. Current point size. Distance to the next trap. Equal to 1 in fill mode and 0 in nofill mode. Current vertical line spacing. Width of previous character. Reserved version-dependent register. Reserved version-dependent register. Name of current diversion. 5-56 Nroff/Troff Users Manual REFERENCE MANUAL 1. General Explanation 1.1. Form of input. Input consists of text lines, which are destined to be printed, interspersed with control lines, which set parameters or otherwise control subsequent processing. Control lines begin with a con- trol character—normaily . (period) or ° (acute accent) —followed by a one or two character name that specifies a basic request or the substitution of a user-defined macro in place of the control line. The control character * suppresses the break function—the forced output of a partially filled line —caused by The control character may be separated from the request/macro name by white space certain requests. (spaces and/or tabs) for esthetic reasons. Names must be followed by either space or newline. Control lines with unrecognized names are ignored. Various special functions may be introduced anywhere in the input by means of an escape character, normally \. For exampile, the function \nR causes the interpolation of the contents of the number regis- ter R in place of the function; here R is either a single character name as in \nx, or left-parenthesisintroduced, two-character name as in \n (e 1.2. Formauter and device resolution. 
TROFF internally uses 432 units/inch, corresponding to the Graphic Systems phototypesetter which has a horizontal resolution of 1/432 inch and a vertical resolution of 1/144 inch. NROFF internally uses 240 units/inch, corresponding to the least common multiple of the horizontal and vertical resolutions of various typewriter-like output devices. TROFF rounds horizontal/vertical numerical parameter input to the actual horizontal/vertical resolution of the Graphic Systems typesetter. NROFF similarly rounds numerical input to the actual resolution of the output device indicated by the -T option (default Model 37 Teletype).

1.3. Numerical parameter input. Both NROFF and TROFF accept numerical input with the appended scale indicators shown in the following table, where S is the current type size in points, V is the current vertical line spacing in basic units, and C is a nominal character width in basic units.

Scale                                 Number of basic units
Indicator   Meaning                   TROFF          NROFF

i           Inch                      432            240
c           Centimeter                432×50/127     240×50/127
P           Pica = 1/6 inch           72             240/6
m           Em = S points             6×S            C
n           En = Em/2                 3×S            C, same as Em
p           Point = 1/72 inch         6              240/72
u           Basic unit                1              1
v           Vertical line space       V              V
none        Default, see below

In NROFF, both the em and the en are taken to be equal to the C, which is output-device dependent; common values are 1/10 and 1/12 inch. Actual character widths in NROFF need not be all the same and constructed characters such as -> (→) are often extra wide. The default scaling is ems for the horizontally-oriented requests and functions ll, in, ti, ta, lt, po, mc, \h, and \l; Vs for the vertically-oriented requests and functions pl, wh, ch, dt, sp, sv, ne, rt, \v, \x, and \L; p for the vs request; and u for the requests nr, if, and ie. All other requests ignore any scale indicators.
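The unit table can be checked numerically. The following Python sketch is not part of NROFF or TROFF: the names scale_factor and evaluate are hypothetical, the NROFF character width C is assumed to be 1/10 inch (24 units), and only the arithmetic operators +, -, *, /, % and parentheses are modeled. It converts scaled numbers to basic units and applies the strict left-to-right evaluation described in §1.4.

```python
import re

UNITS_PER_INCH = {"troff": 432, "nroff": 240}  # basic units per inch (section 1.2)


def scale_factor(indicator, device="troff", point_size=10, vspace=None, char_width=24):
    """Return the number of basic units denoted by one scale indicator."""
    inch = UNITS_PER_INCH[device]
    if vspace is None:
        vspace = inch // 6            # initial vertical spacing: 1/6 inch (12 points)
    if device == "troff":
        em, en = 6 * point_size, 3 * point_size
    else:
        em = en = char_width          # assumption: C = 1/10 inch = 24 units
    return {
        "i": inch, "c": inch * 50 / 127, "P": inch / 6,
        "m": em, "n": en, "p": inch / 72, "u": 1, "v": vspace,
    }[indicator]


def evaluate(expr, default="u", **settings):
    """Evaluate an expression left to right, with no operator precedence.

    Every number without its own scale indicator gets `default`.
    Number registers must already be interpolated; unary signs are not handled.
    """
    tokens = re.findall(r"\d*\.?\d+[icPmnpuv]?|[-+*/%()]", expr)
    pos = 0

    def operand():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = sequence()
            pos += 1                  # skip the closing parenthesis
            return value
        number, indicator = re.fullmatch(r"([\d.]+)([icPmnpuv]?)", tok).groups()
        # Each number is rounded to a whole count of basic units.
        return round(float(number) * scale_factor(indicator or default, **settings))

    def sequence():
        nonlocal pos
        value = operand()
        while pos < len(tokens) and tokens[pos] != ")":
            op = tokens[pos]
            pos += 1
            rhs = operand()
            if op == "+":
                value += rhs
            elif op == "-":
                value -= rhs
            elif op == "*":
                value *= rhs
            elif op == "/":
                value = int(value / rhs)   # truncate toward zero
            else:
                value %= rhs
        return value

    return sequence()
```

With register x already interpolated as 2 and a point size of 10, evaluate("(4.25i+2P+3)/2u", default="m") yields 1080 basic units, which is 2.5 inches at 432 units/inch and matches the line-length example given in §1.4. Dropping the trailing u makes the divisor 2 ems (120 units) and the result 18, which is why the manual asks for an explicit scale indicator wherever the default is not wanted.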
When a number register containing an already appropriately scaled number is interpolated to provide numerical input, the unit scale indicator u may need to be appended to prevent an additional inappropriate default scaling. The number, N, may be specified in decimal-fraction form but the parameter finally stored is rounded to an integer number of basic units.

The absolute position indicator | may be prepended to a number N to generate the distance to the vertical or horizontal place N. For vertically-oriented requests and functions, |N becomes the distance in basic units from the current vertical place on the page or in a diversion (§7.4) to the vertical place N. For all other requests and functions, |N becomes the distance from the current horizontal place on the input line to the horizontal place N. For example,

    .sp |3.2c

will space in the required direction to 3.2 centimeters from the top of the page.

1.4. Numerical expressions. Wherever numerical input is expected an expression involving parentheses, the arithmetic operators +, -, /, *, % (mod), and the logical operators <, >, <=, >=, = (or ==), & (and), : (or) may be used. Except where controlled by parentheses, evaluation of expressions is left-to-right; there is no operator precedence. In the case of certain requests, an initial + or - is stripped and interpreted as an increment or decrement indicator respectively. In the presence of default scaling, the desired scale indicator must be attached to every number in an expression for which the desired and default scaling differ. For example, if the number register x contains 2 and the current point size is 10, then

    .ll (4.25i+\nxP+3)/2u

will set the line length to 1/2 the sum of 4.25 inches + 2 picas + 30 points.

1.5. Notation. Numerical parameters are indicated in this manual in two ways.
±N means that the argument may take the forms N, +N, or -N and that the corresponding effect is to set the affected parameter to N, to increment it by N, or to decrement it by N respectively. Plain N means that an initial algebraic sign is not an increment indicator, but merely the sign of N. Generally, unreasonable numerical input is either ignored or truncated to a reasonable value. For example, most requests expect to set parameters to non-negative values; exceptions are sp, wh, ch, nr, and if. The requests ps, ft, po, vs, ls, ll, in, and lt restore the previous parameter value in the absence of an argument.

Single character arguments are indicated by single lower case letters and one/two character arguments are indicated by a pair of lower case letters. Character string arguments are indicated by multi-character mnemonics.

2. Font and Character Size Control

2.1. Character set. The TROFF character set consists of the Graphic Systems Commercial II character set plus a Special Mathematical Font character set, each having 102 characters. These character sets are shown in the attached Table I. All ASCII characters are included, with some on the Special Font. With three exceptions, the ASCII characters are input as themselves, and non-ASCII characters are input in the form \(xx where xx is a two-character name given in the attached Table II. The three ASCII exceptions are mapped as follows:

ASCII Input                   Printed by TROFF
Character   Name              Character   Name

'           acute accent      ’           close quote
`           grave accent      ‘           open quote
-           minus             -           hyphen

The characters ´, `, and − may be input by \', \`, and \- respectively or by their names (Table II). The ASCII characters @, #, ", ', `, <, >, \, {, }, ~, ^, and _ exist only on the Special Font and are printed as a 1-em space if that Font is not mounted.
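The three ASCII exceptions described above amount to a small lookup. The sketch below is illustrative Python, not TROFF code; the function name printed_as and the descriptive strings are hypothetical labels for the glyph names used in this section.

```python
# Plain ASCII input characters that TROFF prints as a different glyph.
ASCII_EXCEPTIONS = {
    "'": "close quote",
    "`": "open quote",
    "-": "hyphen",
}

# Escape sequences that recover the glyphs the plain ASCII forms no longer give.
ESCAPES = {
    "\\'": "acute accent",
    "\\`": "grave accent",
    "\\-": "minus",
}


def printed_as(token):
    """Name the glyph printed for a one-character input or a two-character escape."""
    if token in ESCAPES:
        return ESCAPES[token]
    # Any other ASCII character is input as itself.
    return ASCII_EXCEPTIONS.get(token, token)
```

For instance, a bare - in the input is set as a hyphen, while \- is needed to obtain a true minus sign in the current font.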
NROFF understands the entire TROFF character set, but can in general print only ASCII characters, additional characters as may be available on the output device, such characters as may be able to be constructed by overstriking or other combination, and those that can reasonably be mapped into other printable characters. The exact behavior is determined by a driving table prepared for each device. The characters ´, `, and _ print as themselves.

2.2. Fonts. The default mounted fonts are Times Roman (R), Times Italic (I), Times Bold (B), and the Special Mathematical Font (S) on physical typesetter positions 1, 2, 3, and 4 respectively. These fonts are used in this document. The current font, initially Roman, may be changed (among the mounted fonts) by use of the ft request, or by imbedding at any desired point either \fx, \f(xx, or \fN where x and xx are the name of a mounted font and N is a numerical font position. It is not necessary to change to the Special font; characters on that font are automatically handled. A request for a named but not-mounted font is ignored. TROFF can be informed that any particular font is mounted by use of the fp request. The list of known fonts is installation dependent. In the subsequent discussion of font-related requests, F represents either a one/two-character font name or the numerical font position, 1-4. The current font is available (as numerical position) in the read-only number register .f.

NROFF understands font control and normally underlines Italic characters (see §10.5).

2.3. Character size. Character point sizes available on the Graphic Systems typesetter are 6, 7, 8, 9, 10, 11, 12, 14, 16, 18, 20, 22, 24, 28, and 36. This is a range of 1/12 inch to 1/2 inch. The ps request is used to change or restore the point size.
Alternatively the point size may be changed between any two characters by imbedding a \sN at the desired point to set the size to N, or a \s±N (1≤N≤9) to increment/decrement the size by N; \s0 restores the previous size. Requested point size values that are between two valid sizes yield the larger of the two. The current size is available in the .s register. NROFF ignores type size control.

Request       Initial      If No
Form          Value        Argument    Notes*   Explanation

*Notes are explained at the end of the Summary and Index above.

.ps ±N        10 point     previous    E
    Point size set to ±N. Alternatively imbed \sN or \s±N. Any positive size value may be requested; if invalid, the next larger valid size will result, with a maximum of 36. A paired sequence +N, -N will work because the previous requested value is also remembered. Ignored in NROFF.

.ss N         12/36 em     ignored     E
    Space-character size is set to N/36 ems. This size is the minimum word spacing in adjusted text. Ignored in NROFF.

.cs F N M     off          -           P
    Constant character space (width) mode is set on for font F (if mounted); the width of every character will be taken to be N/36 ems. If M is absent, the em is that of the character's point size; if M is given, the em is M-points. All affected characters are centered in this space, including those with an actual width larger than this space. Special Font characters occurring while the current font is F are also so treated. If N is absent, the mode is turned off. The mode must be still or again in effect when the characters are physically printed. Ignored in NROFF.

.bd F N       off          -           P
    The characters in font F will be artificially emboldened by printing each one twice, separated by N-1 basic units. A reasonable value for N is 3 when the character size is in the vicinity of 10 points. If N is missing the embolden mode is turned off. The column heads above were printed with .bd I 3. The mode must be still or again in effect when the characters are physically printed. Ignored in NROFF.

.bd S F N     off          -           P
    The characters in the Special Font will be emboldened whenever the current font is F. This manual was printed with .bd S B 3. The mode must be still or again in effect when the characters are physically printed.

.ft F         Roman        previous    E
    Font changed to F. Alternatively, imbed \fF. The font name P is reserved to mean the previous font.

.fp N F       R,I,B,S      ignored     -
    Font position. This is a statement that a font named F is mounted on position N (1-4). It is a fatal error if F is not known. The phototypesetter has four fonts physically mounted. Each font consists of a film strip which can be mounted on a numbered quadrant of a wheel. The default mounting sequence assumed by TROFF is R, I, B, and S on positions 1, 2, 3 and 4.

3. Page control

Top and bottom margins are not automatically provided; it is conventional to define two macros and to set traps for them at vertical positions 0 (top) and -N (N from the bottom). See §7 and Tutorial Examples §T2. A pseudo-page transition onto the first page occurs either when the first break occurs or when the first non-diverted text processing occurs. Arrangements for a trap to occur at the top of the first page must be completed before this transition. In the following, references to the current diversion (§7.4) mean that the mechanism being described works during both ordinary and diverted output (the former considered as the top diversion level).

The useable page width on the Graphic Systems phototypesetter is about 7.54 inches, beginning about 1/27 inch from the left edge of the 8 inch wide, continuous roll paper. The physical limitations on NROFF output are output-device dependent.

Request       Initial      If No
Form          Value        Argument    Notes    Explanation

.pl ±N        11 in        11 in       v
    Page length set to ±N. The internal limitation is about 75 inches in TROFF and about 136 inches in NROFF. The current page length is available in the .p register.
.bp ±N    N=1    -    B*,v
    Begin page. The current page is ejected and a new page is begun. If ±N is given, the new page number will be ±N. Also see request ns.

.pn ±N    N=1    ignored    -
    Page number. The next page (when it occurs) will have the page number ±N. A pn must occur before the initial pseudo-page transition to effect the page number of the first page. The current page number is in the % register.

.po ±N    0; 26/27 in†    previous    v
    Page offset. The current left margin is set to ±N. The TROFF initial value provides about 1 inch of paper margin including the physical typesetter margin of 1/27 inch. In TROFF the maximum (line-length)+(page-offset) is about 7.54 inches. See §6. The current page offset is available in the .o register.

.ne N    -    N=1V    D,v
    Need N vertical space. If the distance, D, to the next trap position (see §7.5) is less than N, a forward vertical space of size D occurs, which will spring the trap. If there are no remaining traps on the page, D is the distance to the bottom of the page. If D<V, another line could still be output and spring the trap. In a diversion, D is the distance to the diversion trap, if any, or is very large.

.mk R    none    internal    D
    Mark the current vertical place in an internal register (both associated with the current diversion level), or in register R, if given. See rt request.

.rt ±N    none    internal    D,v
    Return upward only to a marked vertical place in the current diversion. If ±N (w.r.t. current place) is given, the place is ±N from the top of the page or diversion or, if N is absent, to a place marked by a previous mk. Note that the sp request (§5.3) may be used in all cases instead of rt by spacing to the absolute place stored in an explicit register; e.g. using the sequence .mk R ... .sp |\nRu.

*The use of " ' " as control character (instead of ".") suppresses the break function.
†Values separated by ";" are for NROFF and TROFF respectively.

4.
Text Filling, Adjusting, and Centering

4.1. Filling and adjusting. Normally, words are collected from input text lines and assembled into an output text line until some word doesn't fit. An attempt is then made to hyphenate the word in an effort to assemble a part of it into the output line. The spaces between the words on the output line are then increased to spread out the line to the current line length minus any current indent. A word is any string of characters delimited by the space character or the beginning/end of the input line. Any adjacent pair of words that must be kept together (neither split across output lines nor spread apart in the adjustment process) can be tied together by separating them with the unpaddable space character "\ " (backslash-space). The adjusted word spacings are uniform in TROFF and the minimum interword spacing can be controlled with the ss request (§2). In NROFF, they are normally nonuniform because of quantization to character-size spaces; however, the command line option -e causes uniform spacing with full output device resolution. Filling, adjustment, and hyphenation (§13) can all be prevented or controlled. The text length on the last line output is available in the .n register, and text base-line position on the page for this line is in the nl register. The text base-line high-water mark (lowest place) on the current page is in the .h register.

An input text line ending with ., ?, or ! is taken to be the end of a sentence, and an additional space character is automatically provided during filling. Multiple inter-word space characters found in the input are retained, except for trailing spaces; initial spaces also cause a break.

When filling is in effect, a \p may be imbedded or attached to a word to cause a break at the end of the word and have the resulting output line spread out to fill the current line length.
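The fill-then-adjust sequence described in §4.1 can be modelled with a short Python sketch. This is only an illustration under stated assumptions, not how NROFF is implemented: widths are counted in whole character columns (as in NROFF without -e), hyphenation and the extra end-of-sentence space are omitted, runs of input spaces are collapsed, and the choice of which gaps receive the leftover columns is an assumption. The function names fill and adjust are hypothetical.

```python
def adjust(words, line_length):
    """Spread one filled line to exactly line_length columns."""
    gaps = len(words) - 1
    if gaps == 0:
        return words[0]  # a lone word cannot be spread
    # Columns left over after the words and one space per gap.
    extra = line_length - sum(len(w) for w in words) - gaps
    base, leftover = divmod(extra, gaps)
    out = []
    for i, word in enumerate(words[:-1]):
        # Assumption: leftover columns go to the leftmost gaps.
        out.append(word + " " * (1 + base + (1 if i < leftover else 0)))
    out.append(words[-1])
    return "".join(out)


def fill(text, line_length):
    """Collect words into output lines; adjust all but the last."""
    lines, current = [], []
    for word in text.split():
        # Width of the line if this word were added with single spaces.
        width = sum(len(w) for w in current) + len(current) + len(word)
        if current and width > line_length:
            lines.append(adjust(current, line_length))
            current = []
        current.append(word)
    if current:
        # The final partial line is output as by a break, unadjusted.
        lines.append(" ".join(current))
    return lines
```

For instance, fill("Normally words are collected from input text lines", 20) yields two lines that are exactly 20 columns wide, followed by the unadjusted remainder "text lines".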
A text input line that happens to begin with a control character can be made to not look like a control line by prefacing it with the non-printing, zero-width filler character \&. Still another way is to specify output translation of some convenient character into the control character using tr (§10.5).

4.2. Interrupted text. The copying of an input line in nofill (non-fill) mode can be interrupted by terminating the partial line with a \c. The next encountered input text line will be considered to be a continuation of the same line of input text. Similarly, a word within filled text may be interrupted by terminating the word (and line) with \c; the next encountered text will be taken as a continuation of the interrupted word. If the intervening control lines cause a break, any partial line will be forced out along with any partial word.

Request Form    Initial Value    If No Argument    Notes    Explanation

.br    -    -    B
    Break. The filling of the line currently being collected is stopped and the line is output without adjustment. Text lines beginning with space characters and empty text lines (blank lines) also cause a break.

.fi    fill on    -    B,E
    Fill subsequent output lines. The register .u is 1 in fill mode and 0 in nofill mode.

.nf    fill on    -    B,E
    Nofill. Subsequent output lines are neither filled nor adjusted. Input text lines are copied directly to output lines without regard for the current line length.

.ad c    adj,both    adjust    E
    Line adjustment is begun. If fill mode is not on, adjustment will be deferred until fill mode is back on. If the type indicator c is present, the adjustment type is changed as shown in the following table.

    Indicator    Adjust Type
    l            adjust left margin only
    r            adjust right margin only
    c            center
    b or n       adjust both margins
    absent       unchanged

.na    adjust    -    E
    Noadjust. Adjustment is turned off; the right margin will be ragged. The adjustment type for ad is not changed. Output line filling still occurs if fill mode is on.
.ce N    off    N=1    B,E
    Center the next N input text lines within the current (line-length minus indent). If N=0, any residual count is cleared. A break occurs after each of the N input lines. If the input line is too long, it will be left adjusted.

5. Vertical Spacing

5.1. Base-line spacing. The vertical spacing (V) between the base-lines of successive output lines can be set using the vs request with a resolution of 1/144 inch = 1/2 point in TROFF, and to the output device resolution in NROFF. V must be large enough to accommodate the character sizes on the affected output lines. For the common type sizes (9-12 points), usual typesetting practice is to set V to 2 points greater than the point size; TROFF default is 10-point type on a 12-point spacing (as in this document). The current V is available in the .v register. Multiple-V line separation (e.g. double spacing) may be requested with ls.

5.2. Extra line-space. If a word contains a vertically tall construct requiring the output line containing it to have extra vertical space before and/or after it, the extra-line-space function \x'N' can be imbedded in or attached to that word. In this and other functions having a pair of delimiters around their parameter (here '), the delimiter choice is arbitrary, except that it can't look like the continuation of a number expression for N. If N is negative, the output line containing the word will be preceded by N extra vertical space; if N is positive, the output line containing the word will be followed by N extra vertical space. If successive requests for extra space apply to the same line, the maximum values are used. The most recently utilized post-line extra line-space is available in the .a register.

5.3. Blocks of vertical space. A block of vertical space is ordinarily requested using sp, which honors the no-space mode and which does not space past a trap. A contiguous block of vertical space may be reserved using sv.
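The rule in §5.2 for several \x requests falling on the same output line can be illustrated with a small Python sketch. It assumes that every request has already been converted to basic units and reads "preceded by N extra vertical space" for a negative N as the magnitude of N; the function name extra_line_space is hypothetical and not part of NROFF or TROFF.

```python
def extra_line_space(requests):
    """Combine the \\x values that apply to one output line.

    requests: signed distances in basic units, one per \\x on the line.
    Returns (space_before, space_after) for that line.
    """
    # Negative values ask for space before the line; keep the largest.
    before = max((-n for n in requests if n < 0), default=0)
    # Positive values ask for space after the line; keep the largest.
    after = max((n for n in requests if n > 0), default=0)
    return before, after
```

With the requests -30, 60, -50 and 20 on one line, the line would be preceded by 50 units and followed by 60 units; the second value is what §5.2 says is recorded in the .a register.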
Request Form    Initial Value    If No Argument    Notes    Explanation

.vs N    1/6in;12pts    previous    E,p
    Set vertical base-line spacing size V. Transient extra vertical space available with \x'N' (see above).

.ls N    N=1    previous    E
    Line spacing set to ±N. N-1 Vs (blank lines) are appended to each output text line. Appended blank lines are omitted, if the text or previous appended blank line reached a trap position.

.sp N    -    N=1V    B,v
    Space vertically in either direction. If N is negative, the motion is backward (upward) and is limited to the distance to the top of the page. Forward (downward) motion is truncated to the distance to the nearest trap. If the no-space mode is on, no spacing occurs (see ns, and rs below).

.sv N    -    N=1V    v
    Save a contiguous vertical block of size N. If the distance to the next trap is greater than N, N vertical space is output. No-space mode has no effect. If this distance is less than N, no vertical space is immediately output, but N is remembered for later output (see os). Subsequent sv requests will overwrite any still remembered N.

.os    -    -    -
    Output saved vertical space. No-space mode has no effect. Used to finally output a block of vertical space requested by an earlier sv request.

.ns    space    -    D
    No-space mode turned on. When on, the no-space mode inhibits sp requests and bp requests without a next page number. The no-space mode is turned off when a line of output occurs, or with rs.

.rs    space    -    D
    Restore spacing. The no-space mode is turned off.

Blank text line.    -    B
    Causes a break and output of a blank line exactly like sp 1.

6. Line Length and Indenting

The maximum line length for fill mode may be set with ll. The indent may be set with in; an indent applicable to only the next output line may be set with ti. The line length includes indent space but not page offset space. The line-length minus the indent is the basis for centering with ce.
The effect of ll, in, or ti is delayed, if a partially collected line exists, until after that line is output. In fill mode the length of text on an output line is less than or equal to the line length minus the indent. The current line length and indent are available in registers .l and .i respectively. The length of three-part titles produced by tl (see §14) is independently set by lt.

Request Form    Initial Value    If No Argument    Notes    Explanation

.ll ±N    6.5 in    previous    E,m
    Line length is set to ±N. In TROFF the maximum (line-length)+(page-offset) is about 7.54 inches.

.in ±N    N=0    previous    B,E,m
    Indent is set to ±N. The indent is prepended to each output line.

.ti ±N    -    ignored    B,E,m
    Temporary indent. The next output text line will be indented a distance ±N with respect to the current indent. The resulting total indent may not be negative. The current indent is not changed.

7. Macros, Strings, Diversion, and Position Traps

7.1. Macros and strings. A macro is a named set of arbitrary lines that may be invoked by name or with a trap. A string is a named string of characters, not including a newline character, that may be interpolated by name at any point. Request, macro, and string names share the same name list. Macro and string names may be one or two characters long and may usurp previously defined request, macro, or string names. Any of these entities may be renamed with rn or removed with rm. Macros are created by de and di, and appended to by am and da; di and da cause normal output to be stored in a macro. Strings are created by ds and appended to by as. A macro is invoked in the same way as a request; a control line beginning .xx will interpolate the contents of macro xx. The remainder of the line may contain up to nine arguments. The strings x and xx are interpolated at any desired point with \*x and \*(xx respectively. String references and macro invocations may be nested.

7.2. Copy mode input interpretation.
During the definition and extension of strings and macros (not by diversion) the input is read in copy mode. The input is copied without interpretation except that:

• The contents of number registers indicated by \n are interpolated.
• Strings indicated by \* are interpolated.
• Arguments indicated by \$ are interpolated.
• Concealed newlines indicated by \(newline) are eliminated.
• Comments indicated by \" are eliminated.
• \t and \a are interpreted as ASCII horizontal tab and SOH respectively (§9).
• \\ is interpreted as \.
• \. is interpreted as ".".

These interpretations can be suppressed by prepending a \. For example, since \\ maps into a \, \\n will copy as \n which will be interpreted as a number register indicator when the macro or string is reread.

7.3. Arguments. When a macro is invoked by name, the remainder of the line is taken to contain up to nine arguments. The argument separator is the space character, and arguments may be surrounded by double-quotes to permit imbedded space characters. Pairs of double-quotes may be imbedded in double-quoted arguments to represent a single double-quote. If the desired arguments won't fit on a line, a concealed newline may be used to continue on the next line.

When a macro is invoked the input level is pushed down and any arguments available at the previous level become unavailable until the macro is completely read and the previous level is restored. A macro's own arguments can be interpolated at any point within the macro with \$N, which interpolates the Nth argument (1≤N≤9). If an invoked argument doesn't exist, a null string results. For example, the macro xx may be defined by

    .de xx \"begin definition
    Today is \\$1 the \\$2.
    .. \"end definition

and called by

    .xx Monday 14th

to produce the text

    Today is Monday the 14th.

Note that the \$ was concealed in the definition with a prepended \. The number of currently available arguments is in the .$ register.
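The argument rules of §7.3 (space-separated arguments, double-quoted arguments with doubled quotes standing for a single quote, at most nine arguments, and a null string for a missing argument) can be sketched in Python as follows. This is an illustrative model only: the function names are hypothetical, concealed newlines and copy-mode processing are not handled, and text that directly follows a closing quote is simply treated as the start of the next argument, which is an assumption.

```python
import re


def split_macro_args(rest, limit=9):
    """Split the remainder of a macro call line into at most `limit` arguments."""
    args, i, n = [], 0, len(rest)
    while i < n and len(args) < limit:
        while i < n and rest[i] == " ":      # skip separator spaces
            i += 1
        if i >= n:
            break
        if rest[i] == '"':                   # double-quoted argument
            i += 1
            buf = []
            while i < n:
                if rest[i] == '"':
                    if i + 1 < n and rest[i + 1] == '"':
                        buf.append('"')      # "" inside quotes means one "
                        i += 2
                        continue
                    i += 1                   # closing quote
                    break
                buf.append(rest[i])
                i += 1
            args.append("".join(buf))
        else:                                # bare argument ends at a space
            start = i
            while i < n and rest[i] != " ":
                i += 1
            args.append(rest[start:i])
    return args


def argument(args, k):
    """Return the k-th argument (1-based); a missing one is the null string."""
    return args[k - 1] if 1 <= k <= len(args) else ""


def interpolate(body, args):
    """Replace each \\$N (N = 1..9) in an already-stored macro body."""
    return re.sub(r"\\\$([1-9])",
                  lambda m: argument(args, int(m.group(1))), body)
```

Applied to the example above, interpolate(r"Today is \$1 the \$2.", split_macro_args("Monday 14th")) gives "Today is Monday the 14th.", and len(args) plays the role of the .$ register.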
No arguments are available at the top (non-macro) level in this implementation. Because string referencing is implemented as an input-level push down, no arguments are available from within a string. No arguments are available within a trap-invoked macro.

Arguments are copied in copy mode onto a stack where they are available for reference. The mechanism does not allow an argument to contain a direct reference to a long string (interpolated at copy time) and it is advisable to conceal string references (with an extra \) to delay interpolation until argument reference time.

7.4. Diversions. Processed output may be diverted into a macro for purposes such as footnote processing (see Tutorial §T5) or determining the horizontal and vertical size of some text for conditional changing of pages or columns. A single diversion trap may be set at a specified vertical position. The number registers dn and dl respectively contain the vertical and horizontal size of the most recently ended diversion. Processed text that is diverted into a macro retains the vertical size of each of its lines when reread in nofill mode regardless of the current V. Constant-spaced (cs) or emboldened (bd) text that is diverted can be reread correctly only if these modes are again or still in effect at reread time. One way to do this is to imbed in the diversion the appropriate cs or bd requests with the transparent mechanism described in §10.6.

Diversions may be nested and certain parameters and registers are associated with the current diversion level (the top non-diversion level may be thought of as the 0th diversion level). These are the diversion trap and associated macro, no-space mode, the internally-saved marked place (see mk and rt), the current vertical place (.d register), the current high-water text base-line (.h register), and the current diversion name (.z register).

7.5. Traps.
Three types of trap mechanisms are available: page traps, a diversion trap, and an input-line-count trap. Macro-invocation traps may be planted using wh at any page position including the top. This trap position may be changed using ch. Trap positions at or below the bottom of the page have no effect unless or until moved to within the page or rendered effective by an increase in page length. Two traps may be planted at the same position only by first planting them at different positions and then moving one of the traps; the first planted trap will conceal the second unless and until the first one is moved (see Tutorial Examples §T5). If the first one is moved back, it again conceals the second trap. The macro associated with a page trap is automatically invoked when a line of text is output whose vertical size reaches or sweeps past the trap position. Reaching the bottom of a page springs the top-of-page trap, if any, provided there is a next page. The distance to the next trap position is available in the .t register; if there are no traps between the current position and the bottom of the page, the distance returned is the distance to the page bottom.

A macro-invocation trap effective in the current diversion may be planted using dt. The .t register works in a diversion; if there is no subsequent trap a large distance is returned. For a description of input-line-count traps, see it below.

Request Form    Initial Value    If No Argument    Notes    Explanation

.de xx yy    -    .yy=..    -
    Define or redefine the macro xx. The contents of the macro begin on the next input line. Input lines are copied in copy mode until the definition is terminated by a line beginning with .yy, whereupon the macro yy is called. In the absence of yy, the definition is terminated by a line beginning with "..". A macro may contain de requests provided the terminating macros differ or the contained definition terminator is concealed. ".." can be concealed as \\.. which will copy as \..
and be reread as "..".

.am xx yy    -    .yy=..    -
    Append to macro (append version of de).

.ds xx string    -    ignored    -
    Define a string xx containing string. Any initial double-quote in string is stripped off to permit initial blanks.

.as xx string    -    ignored    -
    Append string to string xx (append version of ds).

.rm xx    -    ignored    -
    Remove request, macro, or string. The name xx is removed from the name list and any related storage space is freed. Subsequent references will have no effect.

.rn xx yy    -    ignored    -
    Rename request, macro, or string xx to yy. If yy exists, it is first removed.

.di xx    -    end    D
    Divert output to macro xx. Normal text processing occurs during diversion except that page offsetting is not done. The diversion ends when the request di or da is encountered without an argument; extraneous requests of this type should not appear when nested diversions are being used.

.da xx    -    end    D
    Divert, appending to xx (append version of di).

.wh N xx    -    -    v
    Install a trap to invoke xx at page position N; a negative N will be interpreted with respect to the page bottom. Any macro previously planted at N is replaced by xx. A zero N refers to the top of a page. In the absence of xx, the first found trap at N, if any, is removed.

.ch xx N    -    -    v
    Change the trap position for macro xx to be N. In the absence of N, the trap, if any, is removed.

.dt N xx    -    off    D,v
    Install a diversion trap at position N in the current diversion to invoke macro xx. Another dt will redefine the diversion trap. If no arguments are given, the diversion trap is removed.

.it N xx    -    off    E
    Set an input-line-count trap to invoke the macro xx after N lines of text input have been read (control or request lines don't count). The text may be in-line text or text interpolated by inline or trap-invoked macros.

.em xx    none    none    -
    The macro xx will be invoked when all input has ended. The effect is the same as if the contents of xx had been at the end of the last file processed.

8.
Number Registers

A variety of parameters are available to the user as predefined, named number registers (see Summary and Index, page 7). In addition, the user may define his own named registers. Register names are one or two characters long and do not conflict with request, macro, or string names. Except for certain predefined read-only registers, a number register can be read, written, automatically incremented or decremented, and interpolated into the input in a variety of formats. One common use of user-defined registers is to automatically number sections, paragraphs, lines, etc. A number register may be used any time numerical input is expected or desired and may be used in numerical expressions (§1.4).

Number registers are created and modified using nr, which specifies the name, numerical value, and the auto-increment size. Registers are also modified, if accessed with an auto-incrementing sequence. If the registers x and xx both contain N and have the auto-increment size M, the following access sequences have the effect shown:

    Sequence    Effect on Register       Value Interpolated
    \nx         none                     N
    \n(xx       none                     N
    \n+x        x incremented by M       N+M
    \n-x        x decremented by M       N-M
    \n+(xx      xx incremented by M      N+M
    \n-(xx      xx decremented by M      N-M

When interpolated, a number register is converted to decimal (default), decimal with leading zeros, lower-case Roman, upper-case Roman, lower-case sequential alphabetic, or upper-case sequential alphabetic according to the format specified by af.

Request Form    Initial Value    If No Argument    Notes    Explanation

.nr R ±N M    -    -    u
    The number register R is assigned the value ±N with respect to the previous value, if any. The increment for auto-incrementing is set to M.

.af R c    arabic    -    -
    Assign format c to register R. The available formats are:

    Format    Numbering Sequence
    1         0,1,2,3,4,5,...
    001       000,001,002,003,004,005,...
    i         0,i,ii,iii,iv,v,...
    I         0,I,II,III,IV,V,...
    a         0,a,b,c,...,z,aa,ab,...,zz,aaa,...
    A         0,A,B,C,...,Z,AA,AB,...,ZZ,AAA,...

    An arabic format having N digits specifies a field width of N digits (example 2 above). The read-only registers and the width function (§11.2) are always arabic.

.rr R    -    ignored    -
    Remove register R. If many registers are being created dynamically, it may become necessary to remove no longer used registers to recapture internal storage space for newer registers.

9. Tabs, Leaders, and Fields

9.1. Tabs and leaders. The ASCII horizontal tab character and the ASCII SOH (hereafter known as the leader character) can both be used to generate either horizontal motion or a string of repeated characters. The length of the generated entity is governed by internal tab stops specifiable with ta. The default difference is that tabs generate motion and leaders generate a string of periods; tc and lc offer the choice of repeated character or motion. There are three types of internal tab stops: left adjusting, right adjusting, and centering. In the following table: D is the distance from the current position on the input line (where a tab or leader was found) to the next tab stop; next-string consists of the input characters following the tab (or leader) up to the next tab (or leader) or end of line; and W is the width of next-string.

    Tab type    Length of motion or      Location of
                repeated characters      next-string
    Left        D                        Following D
    Right       D-W                      Right adjusted within D
    Centered    D-W/2                    Centered on right end of D

The length of generated motion is allowed to be negative, but that of a repeated character string cannot be. Repeated character strings contain an integer number of characters, and any residual distance is prepended as motion. Tabs or leaders found after the last tab stop are ignored, but may be used as next-string terminators.

Tabs and leaders are not interpreted in copy mode. \t and \a always generate a non-interpreted tab and leader respectively, and are equivalent to actual tabs and leaders in copy mode.

9.2. Fields.
A field is contained between a pair of field delimiter characters, and consists of sub-strings separated by padding indicator characters. The field length is the distance on the input line from the position where the field begins to the next tab stop. The difference between the total length of all the sub-strings and the field length is incorporated as horizontal padding space that is divided among the indicated padding places. The incorporated padding is allowed to be negative. For example, if the field delimiter is # and the padding indicator is ^, #^xxx^right# specifies a right-adjusted string with the string xxx centered in the remaining space.

Request Form    Initial Value    If No Argument    Notes    Explanation

.ta Nt ...    0.8; 0.5in    none    E,m
    Set tab stops and types. t=R, right adjusting; t=C, centering; t absent, left adjusting. TROFF tab stops are preset every 0.5in.; NROFF every 0.8in. The stop values are separated by spaces, and a value preceded by + is treated as an increment to the previous stop value.

.tc c    none    none    E
    The tab repetition character becomes c, or is removed specifying motion.

.lc c    .    none    E
    The leader repetition character becomes c, or is removed specifying motion.

.fc a b    off    off    -
    The field delimiter is set to a; the padding indicator is set to the space character or to b, if given. In the absence of arguments the field mechanism is turned off.

10. Input and Output Conventions and Character Translations

10.1. Input character translations. Ways of inputting the graphic character set were discussed in §2.1. The ASCII control characters horizontal tab (§9.1), SOH (§9.1), and backspace (§10.3) are discussed elsewhere. The newline delimits input lines. In addition, STX, ETX, ENQ, ACK, and BEL are accepted, and may be used as delimiters or translated into a graphic with tr (§10.5). All others are ignored.
The escape character \ introduces escape sequences, causing the following character to mean another character, or to indicate some function. A complete list of such sequences is given in the Summary and Index on page 6. \ should not be confused with the ASCII control character ESC of the same name. The escape character \ can be input with the sequence \\. The escape character can be changed with ec, and all that has been said about the default \ becomes true for the new escape character. \e can be used to print whatever the current escape character is. If necessary or convenient, the escape mechanism may be turned off with eo, and restored with ec.

Request Form    Initial Value    If No Argument    Notes    Explanation

.ec c    \    \    -
    Set escape character to \, or to c, if given.

.eo    on    -    -
    Turn escape mechanism off.

10.2. Ligatures. Five ligatures are available in the current TROFF character set: fi, fl, ff, ffi, and ffl. They may be input (even in NROFF) by \(fi, \(fl, \(ff, \(Fi, and \(Fl respectively. The ligature mode is normally on in TROFF, and automatically invokes ligatures during input.

Request Form    Initial Value    If No Argument    Notes    Explanation

.lg N    off; on    on    -
    Ligature mode is turned on if N is absent or non-zero, and turned off if N=0. If N=2, only the two-character ligatures are automatically invoked. Ligature mode is inhibited for request, macro, string, register, or file names, and in copy mode. No effect in NROFF.

10.3. Backspacing, underlining, overstriking, etc. Unless in copy mode, the ASCII backspace character is replaced by a backward horizontal motion having the width of the space character. Underlining as a form of line-drawing is discussed in §12.4. A generalized overstriking function is described in §12.1.

NROFF automatically underlines characters in the underline font, specifiable with uf, normally that on font position 2 (normally Times Italic, see §2.2). In addition to ft and \fF, the underline font may be selected by ul and cu. Underlining is restricted to an output-device-dependent subset of reasonable characters.

Request Form    Initial Value    If No Argument    Notes    Explanation

.ul N    off    N=1    E
    Underline in NROFF (italicize in TROFF) the next N input text lines. Actually, switch to underline font, saving the current font for later restoration; other font changes within the span of a ul will take effect, but the restoration will undo the last change. Output generated by tl (§14) is affected by the font change, but does not decrement N. If N>1, there is the risk that a trap interpolated macro may provide text lines within the span; environment switching can prevent this.

.cu N    off    N=1    E
    A variant of ul that causes every character to be underlined in NROFF. Identical to ul in TROFF.

.uf F    Italic    Italic    -
    Underline font set to F. In NROFF, F may not be on position 1 (initially Times Roman).

10.4. Control characters. Both the control character . and the no-break control character ' may be changed, if desired. Such a change must be compatible with the design of any macros used in the span of the change, and particularly of any trap-invoked macros.

Request Form    Initial Value    If No Argument    Notes    Explanation

.cc c    .    .    E
    The basic control character is set to c, or reset to ".".

.c2 c    '    '    E
    The nobreak control character is set to c, or reset to "'".

10.5. Output translation. One character can be made a stand-in for another character using tr. All text processing (e.g. character comparisons) takes place with the input (stand-in) character which appears to have the width of the final character. The graphic translation occurs at the moment of output (including diversion).

Request Form    Initial Value    If No Argument    Notes    Explanation

.tr abcd....    none    -    O
    Translate a into b, c into d, etc. If an odd number of characters is given, the last one will be mapped into the space character.
To be consistent, a particular translation must stay in effect from input to output time.

10.6. Transparent throughput. An input line beginning with a \! is read in copy mode and transparently output (without the initial \!); the text processor is otherwise unaware of the line's presence. This mechanism may be used to pass control information to a post-processor or to imbed control lines in a macro created by a diversion.

10.7. Comments and concealed newlines. An uncomfortably long input line that must stay one line (e.g. a string definition, or nofilled text) can be split into many physical lines by ending all but the last one with the escape \. The sequence \(newline) is always ignored, except in a comment. Comments may be imbedded at the end of any line by prefacing them with \". The newline at the end of a comment cannot be concealed. A line beginning with \" will appear as a blank line and behave like .sp 1; a comment can be on a line by itself by beginning the line with .\".

11. Local Horizontal and Vertical Motions, and the Width Function

11.1. Local Motions. The functions \v'N' and \h'N' can be used for local vertical and horizontal motion respectively. The distance N may be negative; the positive directions are rightward and downward. A local motion is one contained within a line. To avoid unexpected vertical dislocations, it is necessary that the net vertical local motion within a word in filled text and otherwise within a line balance to zero. The above and certain other escape sequences providing local motion are summarized in the following table.
Vertical Local Motion    Effect in TROFF    Effect in NROFF
\v'N'    Move distance N    Move distance N
\u    1/2 em up    1/2 line up
\d    1/2 em down    1/2 line down
\r    1 em up    1 line up

Horizontal Local Motion    Effect in TROFF    Effect in NROFF
\h'N'    Move distance N    Move distance N
\(space)    Unpaddable space-size space    Unpaddable space-size space
\0    Digit-size space    Digit-size space
\|    1/6 em space    ignored
\^    1/12 em space    ignored

As an example, E² could be generated by the sequence E\s-2\v'-0.4m'2\v'0.4m'\s+2; it should be noted in this example that the 0.4 em vertical motions are at the smaller size.

11.2. Width Function. The width function \w'string' generates the numerical width of string (in basic units). Size and font changes may be safely imbedded in string, and will not affect the current environment. For example, .ti -\w'1. 'u could be used to temporarily indent leftward a distance equal to the size of the string "1. ".

The width function also sets three number registers. The registers st and sb are set respectively to the highest and lowest extent of string relative to the baseline; then, for example, the total height of the string is \n(stu-\n(sbu. In TROFF the number register ct is set to a value between 0 and 3: 0 means that all of the characters in string were short lower case characters without descenders (like e); 1 means that at least one character has a descender (like y); 2 means that at least one character is tall (like H); and 3 means that both tall characters and characters with descenders are present.

11.3. Mark horizontal place. The escape sequence \kx will cause the current horizontal position in the input line to be stored in register x. As an example, the construction \kxword\h'|\nxu+2u'word will embolden word by backing up to almost its beginning and overprinting it, resulting in word.

12. Overstrike, Bracket, Line-drawing, and Zero-width Functions

12.1. Overstriking. Automatically centered overstriking of up to nine characters is provided by the overstrike function \o'string'.
The characters in string are overprinted with centers aligned; the total width is that of the widest character. string should not contain local vertical motion. As examples, \o'e\'' produces é, and \o'\(mo\(sl' produces ∉.

12.2. Zero-width characters. The function \zc will output c without spacing over it, and can be used to produce left-aligned overstruck combinations. As examples, \z\(ci\(pl will produce ⊕, and \(br\z\(rn\(ul\(br will produce the smallest possible constructed box □.

12.3. Large Brackets. The Special Mathematical Font contains a number of bracket construction pieces ( ⎧ ⎩ ⎫ ⎭ ⎨ ⎬ ⎪ ⌊ ⌋ ⌈ ⌉ ) that can be combined into various bracket styles. The function \b'string' may be used to pile up vertically the characters in string (the first character on top and the last at the bottom); the characters are vertically separated by 1 em and the total pile is centered 1/2 em above the current baseline (1/2 line in NROFF). For example, \b'\(lc\(lf'E\|\b'\(rc\(rf'\x'-0.5m'\x'0.5m' produces [E].

12.4. Line drawing. The function \l'Nc' will draw a string of repeated c's towards the right for a distance N. (\l is \(lower case L).) If c looks like a continuation of an expression for N, it may be insulated from N with a \&. If c is not specified, the _ (baseline rule) is used (underline character in NROFF). If N is negative, a backward horizontal motion of size N is made before drawing the string. Any space resulting from N/(size of c) having a remainder is put at the beginning (left end) of the string. In the case of characters that are designed to be connected such as baseline-rule _, underrule _, and root-en ‾, the remainder space is covered by over-lapping. If N is less than the width of c, a single c is centered on a distance N.
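As a minimal sketch of the forms described in §12.4, the following input draws three horizontal lines. The lengths are arbitrary illustrative values, and the last line assumes that a minus sign directly after the length would otherwise be read as a continuation of the expression for N, which is why it is insulated with \&.

```
.\" 2-inch line of the default character (baseline rule; underline in NROFF)
\l'2i'
.br
.\" 2-inch line built from repeated underrules
\l'2i\(ul'
.br
.\" 1-inch line of minus signs; \& separates the length from the character
\l'1i\&-'
.br
```

Each .br forces the drawn line out on its own output line; without it, in fill mode, the three lines would be collected into one.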
As an example, a macro to underscore a string can be written

.de us
\\$1\l'|0\(ul'
..

or one to draw a box around a string

.de bx
\(br\|\\$1\|\(br\l'|0\(rn'\l'|0\(ul'
..

such that

.us "underlined words"

and

.bx "words in a box"

yield underlined words and |words in a box|.

The function \L'Nc' will draw a vertical line consisting of the (optional) character c stacked vertically apart 1 em (1 line in NROFF), with the first two characters overlapped, if necessary, to form a continuous line. The default character is the box rule | (\(br); the other suitable character is the bold vertical | (\(bv). The line is begun without any initial motion relative to the current base line. A positive N specifies a line drawn downward and a negative N specifies a line drawn upward. After the line is drawn no compensating motions are made; the instantaneous baseline is at the end of the line.

The horizontal and vertical line drawing functions may be used in combination to produce large boxes. The zero-width box-rule and the 1/2-em wide underrule were designed to form corners when using 1-em vertical spacings. For example the macro

.de eb
.sp -1    \"compensate for next automatic base-line spacing
.nf    \"avoid possibly overflowing word buffer
\h'-.5n'\L'|\\nau-1'\l'\\n(.lu+1n\(ul'\L'-|\\nau+1'\l'|0u-.5n\(ul'    \"draw box
.fi
..

will draw a box around some text whose beginning vertical place was saved in number register a (e. g. using .mk a) as done for this paragraph.

13. Hyphenation.

The automatic hyphenation may be switched off and on. When switched on with hy, several variants may be set. A hyphenation indicator character may be imbedded in a word to specify desired hyphenation points, or may be prepended to suppress hyphenation. In addition, the user may specify a small exception word list. Only words that consist of a central alphabetic string surrounded by (usually null) non-alphabetic strings are considered candidates for automatic hyphenation.
Words that were input containing hyphens (minus), em-dashes (\(em), or hyphenation indicator characters, such as mother-in-law, are always subject to splitting after those characters, whether or not automatic hyphenation is on or off.

Request Form    Initial Value    If No Argument    Notes    Explanation
.nh    hyphenate    -    E    Automatic hyphenation is turned off.
.hy N    on, N=1    on, N=1    E    Automatic hyphenation is turned on for N≥1, or off for N=0. If N=2, last lines (ones that will cause a trap) are not hyphenated. For N=4 and 8, the last and first two characters respectively of a word are not split off. These values are additive; i.e. N=14 will invoke all three restrictions.
.hc c    \%    \%    E    Hyphenation indicator character is set to c or to the default \%. The indicator does not appear in the output.
.hw word1 ...    -    ignored    -    Specify hyphenation points in words with imbedded minus signs. Versions of a word with terminal s are implied; i. e. dig-it implies dig-its. This list is examined initially and after each suffix stripping. The space available is small (about 128 characters).

14. Three Part Titles.

The titling function tl provides for automatic placement of three fields at the left, center, and right of a line with a title-length specifiable with lt. tl may be used anywhere, and is independent of the normal text collecting process. A common use is in header and footer macros.

Request Form    Initial Value    If No Argument    Notes    Explanation
.tl 'left'center'right'    -    -    -    The strings left, center, and right are respectively left-adjusted, centered, and right-adjusted in the current title-length. Any of the strings may be empty, and overlapping is permitted. If the page-number character (initially %) is found within any of the fields it is replaced by the current page number having the format assigned to register %. Any character may be used as the string delimiter.
.pc c    %    off    -    The page number character is set to c, or removed.
The page-number register remains %.
.lt ±N    6.5in    previous    E,m    Length of title set to ±N. The line-length and the title-length are independent. Indents do not apply to titles; page-offsets do.

15. Output Line Numbering.

Automatic sequence numbering of output lines may be requested with nm. When in effect, a three-digit, arabic number plus a digit-space is prepended to output text lines. The text lines are thus offset by four digit-spaces, and otherwise retain their line length; a reduction in line length may be desired to keep the right margin aligned with an earlier margin. Blank lines, other vertical spaces, and lines generated by tl are not numbered. Numbering can be temporarily suspended with nn, or with an .nm followed by a later .nm +0. In addition, a line number indent I, and the number-text separation S may be specified in digit-spaces. Further, it can be specified that only those line numbers that are multiples of some number M are to be printed (the others will appear as blank number fields).

Request Form    Initial Value    If No Argument    Notes    Explanation
.nm ±N M S I    -    off    E    Line number mode. If ±N is given, line numbering is turned on, and the next output line numbered is numbered ±N. Default values are M=1, S=1, and I=0. Parameters corresponding to missing arguments are unaffected; a non-numeric argument is considered missing. In the absence of all arguments, numbering is turned off; the next line number is preserved for possible further use in number register ln.
.nn N    -    N=1    E    The next N text output lines are not numbered.

As an example, the paragraph portions of this section are numbered with M=3: .nm 1 3 was placed at the beginning; .nm was placed at the end of the first paragraph; and .nm +0 was placed in front of this paragraph; and .nm finally placed at the end. Line lengths were also changed (by \w'0000'u) to keep the right side aligned. Another example is .nm +5 5 x 3 which turns on numbering with the line number of the next line to be 5 greater than the last numbered line, with M=5, with spacing S untouched, and with the indent I set to 3.

16. Conditional Acceptance of Input

In the following, c is a one-character, built-in condition name, ! signifies not, N is a numerical expression, string1 and string2 are strings delimited by any non-blank, non-numeric character not in the strings, and anything represents what is conditionally accepted.

Request Form    Initial Value    If No Argument    Notes    Explanation
.if c anything    -    -    -    If condition c true, accept anything as input; in multi-line case use \{anything\}.
.if !c anything    -    -    -    If condition c false, accept anything.
.if N anything    -    -    u    If expression N > 0, accept anything.
.if !N anything    -    -    u    If expression N ≤ 0, accept anything.
.if 'string1'string2' anything    -    -    -    If string1 identical to string2, accept anything.
.if !'string1'string2' anything    -    -    -    If string1 not identical to string2, accept anything.
.ie c anything    -    -    u    If portion of if-else; all above forms (like if).
.el anything    -    -    -    Else portion of if-else.

The built-in condition names are:

Condition Name    True If
o    Current page number is odd
e    Current page number is even
t    Formatter is TROFF
n    Formatter is NROFF

If the condition c is true, or if the number N is greater than zero, or if the strings compare identically (including motions and character size and font), anything is accepted as input. If a ! precedes the condition, number, or string comparison, the sense of the acceptance is reversed.

Any spaces between the condition and the beginning of anything are skipped over. The anything can be either a single input line (text, macro, or whatever) or a number of input lines. In the multi-line case, the first line must begin with a left delimiter \{ and the last line must end with a right delimiter \}.
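The forms in the table above can be sketched as follows. This is an illustration only: the line length, point size, spacing, message texts, and the string name DS are assumptions chosen for the example, not values taken from this manual.

```
.\" single-line form: accepted only when the formatter is NROFF
.if n .ll 6i
.\" negated numeric form: accepted when the page number is not greater than 1
.if !\n%>1 .tm still on the first page
.\" multi-line form: the first line opens with \{ and the last line closes with \}
.if t \{\
.ps 10
.vs 12p \}
.\" string comparison; DS is a hypothetical user-defined string
.if '\*(DS'draft' .tm this copy is a draft
```

The trailing \ after \{ conceals the newline, so that the opening line contributes nothing to the accepted input beyond the delimiter itself.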
The request ie (if-else) is identical to if except that the acceptance state is remembered. A subsequent and matching el (else) request then uses the reverse sense of that state. ie - el pairs may be nested.

Some examples are:

.if e .tl 'Even Page %'''

which outputs a title if the page number is even; and

.ie \n%>1 \{\
'sp 0.5i
.tl 'Page %'''
'sp |1.2i \}
.el .sp |2.5i

which treats page 1 differently from other pages.

17. Environment Switching.

A number of the parameters that control the text processing are gathered together into an environment, which can be switched by the user. The environment parameters are those associated with requests noting E in their Notes column; in addition, partially collected lines and words are in the environment. Everything else is global; examples are page-oriented parameters, diversion-oriented parameters, number registers, and macro and string definitions. All environments are initialized with default parameter values.

Request Form    Initial Value    If No Argument    Notes    Explanation
.ev N    N=0    previous    -    Environment switched to environment 0≤N≤2. Switching is done in push-down fashion so that restoring a previous environment must be done with .ev rather than specific reference.

18. Insertions from the Standard Input

The input can be temporarily switched to the system standard input with rd, which will switch back when two newlines in a row are found (the extra blank line is not used). This mechanism is intended for insertions in form-letter-like documentation. On UNIX, the standard input can be the user's keyboard, a pipe, or a file.

Request Form    Initial Value    If No Argument    Notes    Explanation
.rd prompt    -    prompt=BEL    -    Read insertion from the standard input until two newlines in a row are found. If the standard input is the user's keyboard, prompt (or a BEL) is written onto the user's terminal. rd behaves like a macro, and arguments may be placed after prompt.
.ex    -    -    -    Exit from NROFF/TROFF.
Text processing is terminated exactly as if all input had ended.

If insertions are to be taken from the terminal keyboard while output is being printed on the terminal, the command line option -q will turn off the echoing of keyboard input and prompt only with BEL. The regular input and insertion input cannot simultaneously come from the standard input.

As an example, multiple copies of a form letter may be prepared by entering the insertions for all the copies in one file to be used as the standard input, and causing the file containing the letter to reinvoke itself using nx (§19); the process would ultimately be ended by an ex in the insertion file.

19. Input/Output File Switching

Request Form    Initial Value    If No Argument    Notes    Explanation
.so filename    -    -    -    Switch source file. The top input (file reading) level is switched to filename. The effect of an so encountered in a macro is not felt until the input level returns to the file level. When the new file ends, input is again taken from the original file. so's may be nested.
.nx filename    -    end-of-file    -    Next file is filename. The current file is considered ended, and the input is immediately switched to filename.
.pi program    -    -    -    Pipe output to program (NROFF only). This request must occur before any printing occurs. No arguments are transmitted to program.

20. Miscellaneous

Request Form    Initial Value    If No Argument    Notes    Explanation
.mc c N    -    off    E,m    Specifies that a margin character c appear a distance N to the right of the right margin after each non-empty text line (except those produced by tl). If the output line is too long (as can happen in nofill mode) the character will be appended to the line. If N is not given, the previous N is used; the initial N is 0.2 inches in NROFF and 1 em in TROFF. The margin character used with this paragraph was a 12-point box-rule.
.tm string    -    newline    -    After skipping initial blanks, string (rest of the line) is read in copy mode and written on the user's terminal.
.ig yy    -    .yy=..    -    Ignore input lines. ig behaves exactly like de (§7) except that the input is discarded. The input is read in copy mode, and any auto-incremented registers will be affected.
.pm t    -    all    -    Print macros. The names and sizes of all of the defined macros and strings are printed on the user's terminal; if t is given, only the total of the sizes is printed. The size is given in blocks of 128 characters.
.fl    -    -    B    Flush output buffer. Used in interactive debugging to force output.

21. Output and Error Messages.

The output from tm, pm, and the prompt from rd, as well as various error messages are written onto UNIX's standard message output. The latter is different from the standard output, where NROFF formatted output goes. By default, both are written onto the user's terminal, but they can be independently redirected.

Various error conditions may occur during the operation of NROFF and TROFF. Certain less serious errors having only local impact do not cause processing to terminate. Two examples are word overflow, caused by a word that is too large to fit into the word buffer (in fill mode), and line overflow, caused by an output line that grew too large to fit in the line buffer; in both cases, a message is printed, the offending excess is discarded, and the affected word or line is marked at the point of truncation with a * in NROFF and a ☜ in TROFF. The philosophy is to continue processing, if possible, on the grounds that output useful for debugging may be produced. If a serious error occurs, processing terminates, and an appropriate message is printed. Examples are the inability to create, read, or write files, and the exceeding of certain internal limits that make future output unlikely to be useful.

TUTORIAL EXAMPLES

T1. Introduction

Although NROFF and TROFF have by design a syntax reminiscent of earlier text processors* with the intent of easing their use, it is almost always necessary to prepare at least a small set of macro definitions to describe most documents. Such common formatting needs as page margins and footnotes are deliberately not built into NROFF and TROFF. Instead, the macro and string definition, number register, diversion, environment switching, page-position trap, and conditional input mechanisms provide the basis for user-defined implementations.

*For example: P. A. Crisman, Ed., The Compatible Time-Sharing System, MIT Press, 1965, Section AH9.01 (Description of RUNOFF program on MIT's CTSS system).

The examples to be discussed are intended to be useful and somewhat realistic, but won't necessarily cover all relevant contingencies. Explicit numerical parameters are used in the examples to make them easier to read and to illustrate typical values. In many cases, number registers would really be used to reduce the number of places where numerical information is kept, and to concentrate conditional parameter initialization like that which depends on whether TROFF or NROFF is being used.

T2. Page Margins

As discussed in §3, header and footer macros are usually defined to describe the top and bottom page margin areas respectively. A trap is planted at page position 0 for the header, and at -N (N from the page bottom) for the footer. The simplest such definitions might be

.de hd    \"define header
'sp 1i
..    \"end definition
.de fo    \"define footer
'bp
..    \"end definition
.wh 0 hd
.wh -1i fo

which provide blank 1 inch top and bottom margins. The header will occur on the first page, only if the definition and trap exist prior to the initial pseudo-page transition (§3). In fill mode, the output line that springs the footer trap was typically forced out because some part or whole word didn't fit on it. If anything in the footer and header that follows causes a break, that word or part word will be forced out. In this and other examples, requests like bp and sp that normally cause breaks are invoked using the no-break control character ' to avoid this. When the header/footer design contains material requiring independent text processing, the environment may be switched, avoiding most interaction with the running text.

A more realistic example would be

.de hd    \"header
.if t .tl '\(rn''\(rn'    \"troff cut mark
.if \\n%>1 \{\
'sp |0.5i-1    \"tl base at 0.5i
.tl ''- % -''    \"centered page number
.ps    \"restore size
.ft    \"restore font
.vs \}    \"restore vs
'sp |1.0i    \"space to 1.0i
.ns    \"turn on no-space mode
..
.de fo    \"footer
.ps 10    \"set footer/header size
.ft R    \"set font
.vs 12p    \"set base-line spacing
.if \\n%=1 \{\
'sp |\\n(.pu-0.5i-1    \"tl base 0.5i up
.tl ''- % -'' \}    \"first page number
'bp
..
.wh 0 hd
.wh -1i fo

which sets the size, font, and base-line spacing for the header/footer material, and ultimately restores them. The material in this case is a page number at the bottom of the first page and at the top of the remaining pages. If TROFF is used, a cut mark is drawn in the form of root-en's at each margin. The sp's refer to absolute positions to avoid dependence on the base-line spacing. Another reason for this in the footer is that the footer is invoked by printing a line whose vertical spacing swept past the trap position by possibly as much as the base-line spacing. The no-space mode is turned on at the end of hd to render ineffective accidental occurrences of sp at the top of the running text.

The above method of restoring size, font, etc. presupposes that such requests (that set previous value) are not used in the running text. A better scheme is to save and restore both the current and previous values as shown for size in the following:

.de fo
.nr s1 \\n(.s    \"current size
.ps
.nr s2 \\n(.s    \"previous size
. ---    \"rest of footer
..
.de hd
. ---    \"header stuff
.ps \\n(s2    \"restore previous size
.ps \\n(s1    \"restore current size
..

Page numbers may be printed in the bottom margin by a separate macro triggered during the footer's page ejection:

.de bn    \"bottom number
.tl ''- % -''    \"centered page number
..
.wh -0.5i-1v bn    \"tl base 0.5i up

T3. Paragraphs and Headings

The housekeeping associated with starting a new paragraph should be collected in a paragraph macro that, for example, does the desired preparagraph spacing, forces the correct font, size, base-line spacing, and indent, checks that enough space remains for more than one line, and requests a temporary indent.

.de pg    \"paragraph
.br    \"break
.ft R    \"force font,
.ps 10    \"size,
.vs 12p    \"spacing,
.in 0    \"and indent
.sp 0.4    \"prespace
.ne 1+\\n(.Vu    \"want more than 1 line
.ti 0.2i    \"temp indent
..

The first break in pg will force out any previous partial lines, and must occur before the vs. The forcing of font, etc. is partly a defense against prior error and partly to permit things like section heading macros to set parameters only once. The prespacing parameter is suitable for TROFF; a larger space, at least as big as the output device vertical resolution, would be more suitable in NROFF. The choice of remaining space to test for in the ne is the smallest amount greater than one line (the .V is the available vertical resolution).

A macro to automatically number section headings might look like:

.de sc    \"section
. ---    \"force font, etc.
.sp 0.4    \"prespace
.ne 2.4+\\n(.Vu    \"want 2.4+ lines
.fi
\\n+S.
..
.nr S 0 1    \"init S

The usage is .sc, followed by the section heading text, followed by .pg. The ne test value includes one line of heading, 0.4 line in the following pg, and one line of the paragraph text. A word consisting of the next section number and a period is produced to begin the heading line. The format of the number may be set by af (§8).

Another common form is the labeled, indented paragraph, where the label protrudes left into the indent space.

.de lp    \"labeled paragraph
.pg
.in 0.5i    \"paragraph indent
.ta 0.2i 0.5i    \"label, paragraph
.ti 0
\t\\$1\t\c    \"flow into paragraph
..

The intended usage is ".lp label"; label will begin at 0.2 inch, and cannot exceed a length of 0.3 inch without intruding into the paragraph. The label could be right adjusted against 0.4 inch by setting the tabs instead with .ta 0.4iR 0.5i. The last line of lp ends with \c so that it will become a part of the first line of the text that follows.

T4. Multiple Column Output

The production of multiple column pages requires the footer macro to decide whether it was invoked by other than the last column, so that it will begin a new column rather than produce the bottom margin. The header can initialize a column register that the footer will increment and test. The following is arranged for two columns, but is easily modified for more.
.de hd    \"header
. ---
.nr cl 0 1    \"init column count
.mk    \"mark top of text
..
.de fo    \"footer
.ie \\n+(cl<2 \{\
.po +3.4i    \"next column; 3.1+0.3
.rt    \"back to mark
.ns \}    \"no-space mode
.el \{\
.po \\nMu    \"restore left margin
. ---
'bp \}
..
.ll 3.1i    \"column width
.nr M \\n(.o    \"save left margin

Typically a portion of the top of the first page contains full width text; the request for the narrower line length, as well as another .mk would be made where the two column output was to begin.

T5. Footnote Processing

The footnote mechanism to be described is used by imbedding the footnotes in the input text at the point of reference, demarcated by an initial .fn and a terminal .ef:

.fn
Footnote text and control lines...
.ef

In the following, footnotes are processed in a separate environment and diverted for later printing in the space immediately prior to the bottom margin. There is provision for the case where the last collected footnote doesn't completely fit in the available space.

.de hd    \"header
. ---
.nr x 0 1    \"init footnote count
.nr y 0-\\nb    \"current footer place
.ch fo -\\nbu    \"reset footer trap
.if \\n(dn .fz    \"leftover footnote
..
.de fo    \"footer
.nr dn 0    \"zero last diversion size
.if \\nx \{\
.ev 1    \"expand footnotes in ev1
.nf    \"retain vertical size
.FN    \"footnotes
.rm FN    \"delete it
.if "\\n(.z"fy" .di    \"end overflow diversion
.nr x 0    \"disable fx
.ev \}    \"pop environment
. ---
'bp
..
.de fx    \"process footnote overflow
.if \\nx .di fy    \"divert overflow
..
.de fn    \"start footnote
.da FN    \"divert (append) footnote
.ev 1    \"in environment 1
.if \\n+x=1 .fs    \"if first, include separator
.fi    \"fill mode
..
.de ef    \"end footnote
.br    \"finish output
.nr z \\n(.v    \"save spacing
.ev    \"pop ev
.di    \"end diversion
.nr y -\\n(dn    \"new footer position,
.if \\nx=1 .nr y -(\\n(.v-\\nz) \
    \"uncertainty correction
.ch fo \\nyu    \"y is negative
.if (\\n(nl+1v)>(\\n(.p+\\ny) \
.ch fo \\n(nlu+1v    \"it didn't fit
..
.de fs    \"separator
\l'1i'    \"1 inch rule
.br
..
.de fz    \"get leftover footnote
.fn
.nf    \"retain vertical size
.fy    \"where fx put it
.ef
..
.nr b 1.0i    \"bottom margin size
.wh 0 hd    \"header trap
.wh 12i fo    \"footer trap, temp position
.wh -\\nbu fx    \"fx at footer position
.ch fo -\\nbu    \"conceal fx with fo

The header hd initializes a footnote count register x, and sets both the current footer trap position register y and the footer trap itself to a nominal position specified in register b. In addition, if the register dn indicates a leftover footnote, fz is invoked to reprocess it. The footnote start macro fn begins a diversion (append) in environment 1, and increments the count x; if the count is one, the footnote separator fs is interpolated. The separator is kept in a separate macro to permit user redefinition. The footnote end macro ef restores the previous environment and ends the diversion after saving the spacing size in register z. y is then decremented by the size of the footnote, available in dn; then on the first footnote, y is further decremented by the difference in vertical base-line spacings of the two environments, to prevent the late triggering of the footer trap from causing the last line of the combined footnotes to overflow. The footer trap is then set to the lower (on the page) of y or the current page position (nl) plus one line, to allow for printing the reference line. If indicated by x, the footer fo rereads the footnotes from FN in nofill mode in environment 1, and deletes FN. If the footnotes were too large to fit, the macro fx will be trap-invoked to redivert the overflow into fy, and the register dn will later indicate to the header whether fy is empty. Both fo and fx are planted in the nominal footer trap position in an order that causes fx to be concealed unless the fo trap is moved. The footer then terminates the overflow diversion, if necessary, and zeros x to disable fx, because the uncertainty correction together with a not-too-late triggering of the footer can result in the footnote rereading finishing before reaching the fx trap.

A good exercise for the student is to combine the multiple-column and footnote mechanisms.

T6. The Last Page

After the last input file has ended, NROFF and TROFF invoke the end macro (§7), if any, and when it finishes, eject the remainder of the page. During the eject, any traps encountered are processed normally. At the end of this last page, processing terminates unless a partial line, word, or partial word remains. If it is desired that another page be started, the end-macro

.de en    \"end-macro
\c
'bp
..
.em en

will deposit a null partial word, and effect another last page.

Table I

Font Style Examples

The following fonts are printed in 12-point, with a vertical spacing of 14-point, and with non-alphanumeric characters separated by 1/4 em space. The Special Mathematical Font was specially prepared for Bell Laboratories by Graphic Systems, Inc. of Hudson, New Hampshire. The Times Roman, Italic, and Bold are among the many standard fonts available from that company.

Times Roman
abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
1234567890
! $ % & ( ) ' ' * + - . , / : ; = ? [ ] |
• □ — - _ ¼ ½ ¾ fi fl ff ffi ffl ° † ′ ¢ ® ©

Times Italic
abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
1234567890
! $ % & ( ) ' ' * + - . , / : ; = ? [ ] |
• □ — - _ ¼ ½ ¾ fi fl ff ffi ffl ° † ′ ¢ ® ©

Times Bold
abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
1234567890
! $ % & ( ) ' ' * + - . , / : ; = ? [ ] |
• □ — - _ ¼ ½ ¾ fi fl ff ffi ffl ° † ′ ¢ ® ©

Special Mathematical Font
[Character samples of the Special Mathematical Font]

Table II

Input Naming Conventions for ', `, and - and for Non-ASCII Special Characters

Non-ASCII characters and minus on the standard fonts.
Char    Input Name    Character Name
'    '    close quote
`    `    open quote
—    \(em    3/4 Em dash
-    -    hyphen or
-    \(hy    hyphen
−    \-    current font minus
•    \(bu    bullet
□    \(sq    square
_    \(ru    rule
¼    \(14    1/4
½    \(12    1/2
¾    \(34    3/4
fi    \(fi    fi
fl    \(fl    fl
ff    \(ff    ff
ffi    \(Fi    ffi
ffl    \(Fl    ffl
°    \(de    degree
†    \(dg    dagger
′    \(fm    foot mark
¢    \(ct    cent sign
®    \(rg    registered
©    \(co    copyright

Non-ASCII characters and ', `, _, +, -, =, and * on the special font.

The ASCII characters @, #, ", ', `, <, >, \, {, }, ~, ^, and _ exist only on the special font and are printed as a 1-em space if that font is not mounted. The following characters exist only on the special font except for the upper case Greek letter names followed by † which are mapped into upper case English letters in whatever font is mounted on font position one (default Times Roman). The special math plus, minus, and equals are provided to insulate the appearance of equations from the choice of standard fonts.

Char    Input Name    Character Name
+    \(pl    math plus
−    \(mi    math minus
=    \(eq    math equals
*    \(**    math star
§    \(sc    section
´    \(aa    acute accent
`    \(ga    grave accent
_    \(ul    underrule
/    \(sl    slash (matching backslash)
α    \(*a    alpha
β    \(*b    beta
γ    \(*g    gamma
δ    \(*d    delta
ε    \(*e    epsilon
ζ    \(*z    zeta
η    \(*y    eta
θ    \(*h    theta
ι    \(*i    iota
κ    \(*k    kappa
λ    \(*l    lambda
μ    \(*m    mu
ν    \(*n    nu
ξ    \(*c    xi
ο    \(*o    omicron
π    \(*p    pi
ρ    \(*r    rho
σ    \(*s    sigma
ς    \(ts    terminal sigma
τ    \(*t    tau
υ    \(*u    upsilon
φ    \(*f    phi
χ    \(*x    chi
ψ    \(*q    psi
ω    \(*w    omega
Α    \(*A    Alpha†
Β    \(*B    Beta†
Γ    \(*G    Gamma
Δ    \(*D    Delta
Ε    \(*E    Epsilon†
Ζ    \(*Z    Zeta†
Η    \(*Y    Eta†
Θ    \(*H    Theta
Ι    \(*I    Iota†
Κ    \(*K    Kappa†
Λ    \(*L    Lambda
Μ    \(*M    Mu†
Ν    \(*N    Nu†
Ξ    \(*C    Xi
Ο    \(*O    Omicron†
Π    \(*P    Pi
Ρ    \(*R    Rho†
Σ    \(*S    Sigma
Τ    \(*T    Tau†
Υ    \(*U    Upsilon
Φ    \(*F    Phi
Χ    \(*X    Chi†
Ψ    \(*Q    Psi
Ω    \(*W    Omega
√    \(sr    square root
‾    \(rn    root en extender
≥    \(>=    >=
≤    \(<=    <=
≡    \(==    identically equal
≅    \(~=    approx =
∼    \(ap    approximates
≠    \(!=    not equal
→    \(->    right arrow
←    \(<-    left arrow
↑    \(ua    up arrow
↓    \(da    down arrow
×    \(mu    multiply
÷    \(di    divide
±    \(+-    plus-minus
∪    \(cu    cup (union)
∩    \(ca    cap (intersection)
⊂    \(sb    subset of
⊃    \(sp    superset of
⊆    \(ib    improper subset
⊇    \(ip    improper superset
∞    \(if    infinity
∂    \(pd    partial derivative
∇    \(gr    gradient
¬    \(no    not
∫    \(is    integral sign
∝    \(pt    proportional to
∅    \(es    empty set
∈    \(mo    member of
|    \(br    box vertical rule
‡    \(dd    double dagger
☞    \(rh    right hand
☜    \(lh    left hand
(logo)    \(bs    Bell System logo
|    \(or    or
○    \(ci    circle
⎧    \(lt    left top of big curly bracket
⎩    \(lb    left bottom
⎫    \(rt    right top
⎭    \(rb    right bot
⎨    \(lk    left center of big curly bracket
⎬    \(rk    right center of big curly bracket
⎪    \(bv    bold vertical
⌊    \(lf    left floor (left bottom of big square bracket)
⌋    \(rf    right floor (right bottom)
⌈    \(lc    left ceiling (left top)
⌉    \(rc    right ceiling (right top)

Summary of Changes to N/TROFF Since October 1976 Manual

Options

-h    (Nroff only) Output tabs used during horizontal spacing to speed output as well as reduce output byte count. Device tab settings assumed to be every 8 nominal character widths. The default settings of input (logical) tabs is also initialized to every 8 nominal character widths.
-z    Efficiently suppresses formatted output. Only message output will occur (from "tm"s and diagnostics).

Old Requests

.ad c    The adjustment type indicator "c" may now also be a number previously obtained from the ".j" register (see below).
.so name    The contents of file "name" will be interpolated at the point the "so" is encountered. Previously, the interpolation was done upon return to the file-reading input level.
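The extension to ad described above suggests the following sketch for saving and restoring the adjustment state around a passage. The register name aj is a hypothetical choice made for this illustration; the sketch also assumes the na (no-adjust) request described earlier in the manual.

```
.\" save the current adjustment mode and type (aj is a hypothetical register name)
.nr aj \n(.j
.\" turn adjustment off for the passage that follows
.na
Text that should be set without adjustment.
.br
.\" restore the saved mode and type
.ad \n(aj
```

Inside a macro definition the backslashes would be doubled (\\n(.j and \\n(aj) so that the registers are interpolated when the macro is invoked rather than when it is defined.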
New Request

.ab text    Prints "text" on the message output and terminates without further processing. If "text" is missing, "User Abort." is printed. Does not cause a break. The output buffer is flushed.

.fz F N    Forces font "F" to be in size N. N may have the form N, +N, or -N. For example,

    .fz 3 -2

will cause an implicit \s-2 every time font 3 is entered, and a corresponding \s+2 when it is left. Special font characters occurring during the reign of font F will have the same size modification. If special characters are to be treated differently,

    .fz S F N

may be used to specify the size treatment of special characters during font F. For example,

    .fz 3 -3
    .fz S 3 -0

will cause automatic reduction of font 3 by 3 points while the special characters would not be affected. Any ".fp" request specifying a font on some position must precede ".fz" requests relating to that position.

New Predefined Number Registers

.k    Read-only. Contains the horizontal size of the text portion (without indent) of the current partially collected output line, if any, in the current environment.

.j    Read-only. A number representing the current adjustment mode and type. Can be saved and later given to the "ad" request to restore a previous mode.

.P    Read-only. 1 if the current page is being printed, and zero otherwise.

.L    Read-only. Contains the current line-spacing parameter ("ls").

c.    General register access to the input line-number in the current input file. Contains the same value as the read-only ".c" register.

A TROFF Tutorial

Brian W. Kernighan
Bell Laboratories
Murray Hill, New Jersey 07974

1. Introduction

troff [1] is a text-formatting program, written by J. F. Ossanna, for producing high-quality printed output from the phototypesetter on the UNIX and GCOS operating systems. This document is an example of troff output.

The single most important rule of using troff is not to use it directly, but through some intermediary. In many ways, troff resembles an assembly language, a remarkably powerful and flexible one, but nonetheless such that many operations must be specified at a level of detail and in a form that is too hard for most people to use effectively.

For two special applications, there are programs that provide an interface to troff for the majority of users. eqn [2] provides an easy to learn language for typesetting mathematics; the eqn user need know no troff whatsoever to typeset mathematics. tbl [3] provides the same convenience for producing tables of arbitrary complexity.

For producing straight text (which may well contain mathematics or tables), there are a number of 'macro packages' that define formatting rules and operations for specific styles of documents, and reduce the amount of direct contact with troff. In particular, the '-ms' [4] and PWB/MM [5] packages for Bell Labs internal memoranda and external papers provide most of the facilities needed for a wide range of document preparation. (This memo was prepared with '-ms'.) There are also packages for viewgraphs, for simulating the older roff formatters on UNIX and GCOS, and for other special applications. Typically you will find these packages easier to use than troff once you get beyond the most trivial operations; you should always consider them first.

In the few cases where existing packages don't do the whole job, the solution is not to write an entirely new set of troff instructions from scratch, but to make small changes to adapt packages that already exist.

In accordance with this philosophy of letting someone else do the work, the part of troff described here is only a small part of the whole, although it tries to concentrate on the more useful parts. In any case, there is no attempt to be complete. Rather, the emphasis is on showing how to do simple things, and how to make incremental changes to what already exists. The contents of the remaining sections are:

     2. Point sizes and line spacing
     3. Fonts and special characters
     4. Indents and line length
     5. Tabs
     6. Local motions: Drawing lines and characters
     7. Strings
     8. Introduction to macros
     9. Titles, pages and numbering
    10. Number registers and arithmetic
    11. Macros with arguments
    12. Conditionals
    13. Environments
    14. Diversions
    Appendix: Typesetter character set

The troff described here is the C-language version running on UNIX at Murray Hill, as documented in [1].

To use troff you have to prepare not only the actual text you want printed, but some information that tells how you want it printed. (Readers who use roff will find the approach familiar.) For troff the text and the formatting information are often intertwined quite intimately. Most commands to troff are placed on a line separate from the text itself, beginning with a period (one command per line). For example,

    Some text.
    .ps 14
    Some more text.

will change the 'point size', that is, the size of the letters being printed, to '14 point' (one point is 1/72 inch) like this:

    Some text. Some more text.

Occasionally, though, something special occurs in the middle of a line. To produce

    Area = πr²

you have to type

    Area = \(*p\fIr\fR\|\s8\u2\d\s0

(which we will explain shortly). The backslash character \ is used to introduce troff commands and special characters within a line of text.

2. Point Sizes; Line Spacing

As mentioned above, the command .ps sets the point size. One point is 1/72 inch, so 6-point characters are at most 1/12 inch high, and 36-point characters are 1/2 inch. There are 15 point sizes, listed below.

     6 point: Pack my box with five dozen liquor jugs.
     7 point: Pack my box with five dozen liquor jugs.
     8 point: Pack my box with five dozen liquor jugs.
     9 point: Pack my box with five dozen liquor jugs.
    10 point: Pack my box with five dozen liquor
    11 point: Pack my box with five dozen
    12 point: Pack my box with five dozen
    14 point: Pack my box with five
    16 point
    18 point
    20 point
    22
    24
    28
    36

If the number after .ps is not one of these legal sizes, it is rounded up to the next valid value, with a maximum of 36. If no number follows .ps, troff reverts to the previous size, whatever it was. troff begins with point size 10, which is usually fine. This document is in 9 point.

The point size can also be changed in the middle of a line or even a word with the in-line command \s. To produce

    UNIX runs on a PDP-11/45

type

    \s8UNIX\s10 runs on a \s8PDP-\s1011/45

As above, \s should be followed by a legal point size, except that \s0 causes the size to revert to its previous value. Notice that \s1011 can be understood correctly as 'size 10, followed by an 11', if the size is legal, but not otherwise. Be cautious with similar constructions.

Relative size changes are also legal and useful:

    \s-2UNIX\s+2

temporarily decreases the size, whatever it is, by two points, then restores it. Relative size changes have the advantage that the size difference is independent of the starting size of the document. The amount of the relative change is restricted to a single digit.

The other parameter that determines what the type looks like is the spacing between lines, which is set independently of the point size. Vertical spacing is measured from the bottom of one line to the bottom of the next. The command to control vertical spacing is .vs. For running text, it is usually best to set the vertical spacing about 20% bigger than the character size. For example, so far in this document, we have used '9 on 11', that is,

    .ps 9
    .vs 11p

If we changed to

    .ps 9
    .vs 9p

the running text would look like this. After a few lines, you will agree it looks a little cramped. The right vertical spacing is partly a matter of taste, depending on how much text you want to squeeze into a given space, and partly a matter of traditional printing style. By default, troff uses 10 on 12.

Point size and vertical spacing make a substantial difference in the amount of text per square inch. This is 12 on 14.

Point size and vertical spacing make a substantial difference in the amount of text per square inch. For example, 10 on 12 uses about twice as much space as 7 on 8. This is 6 on 7, which is even smaller. It packs a lot more words per line, but you can go blind trying to read it.

When used without arguments, .ps and .vs revert to the previous size and vertical spacing respectively.

The command .sp is used to get extra vertical space. Unadorned, it gives you one extra blank line (one .vs, whatever that has been set to). Typically, that's more or less than you want, so .sp can be followed by information about how much space you want:

    .sp 2i

means 'two inches of vertical space'.

    .sp 2p

means 'two points of vertical space'; and

    .sp 2

means 'two vertical spaces', that is, two of whatever .vs is set to (this can also be made explicit with .sp 2v); troff also understands decimal fractions in most places, so

    .sp 1.5i

is a space of 1.5 inches. These same scale factors can be used after .vs to define line spacing, and in fact after most commands that deal with physical dimensions.

It should be noted that all size numbers are converted internally to 'machine units', which are 1/432 inch (1/6 point). For most purposes, this is enough resolution that you don't have to worry about the accuracy of the representation. The situation is not quite so good vertically, where resolution is 1/144 inch (1/2 point).

3. Fonts and Special Characters

troff and the typesetter allow four different fonts at any one time. Normally three fonts (Times roman, italic and bold) and one collection of special characters are permanently mounted.

    roman:   abcdefghijklmnopqrstuvwxyz 0123456789 ABCDEFGHIJKLMNOPQRSTUVWXYZ
    italic:  abcdefghijklmnopqrstuvwxyz 0123456789 ABCDEFGHIJKLMNOPQRSTUVWXYZ
    bold:    abcdefghijklmnopqrstuvwxyz 0123456789 ABCDEFGHIJKLMNOPQRSTUVWXYZ

The greek, mathematical symbols and miscellany of the special font are listed in Appendix A.

troff prints in roman unless told otherwise. To switch into bold, use the .ft command

    .ft B

and for italics,

    .ft I

To return to roman, use .ft R; to return to the previous font, whatever it was, use either .ft P or just .ft. The 'underline' command

    .ul

causes the next input line to print in italics. .ul can be followed by a count to indicate that more than one line is to be italicized.

Fonts can also be changed within a line or word with the in-line command \f:

    boldface text

is produced by

    \fBbold\fIface\fR text

If you want to do this so the previous font, whatever it was, is left undisturbed, insert extra \fP commands, like this:

    \fBbold\fP\fIface\fP\fR text\fP

Because only the immediately previous font is remembered, you have to restore the previous font after each change or you can lose it. The same is true of .ps and .vs when used without an argument.

There are other fonts available besides the standard set, although you can still use only four at any given time. The command .fp tells troff what fonts are physically mounted on the typesetter:

    .fp 3 H

says that the Helvetica font is mounted on position 3. (For a complete list of fonts and what they look like, see the troff manual.) Appropriate .fp commands should appear at the beginning of your document if you do not use the standard fonts.

It is possible to make a document relatively independent of the actual fonts used to print it by using font numbers instead of names; for example, \f3 and .ft 3 mean 'whatever font is mounted at position 3', and thus work for any setting. Normal settings are roman font on 1, italic on 2, bold on 3, and special on 4.

There is also a way to get 'synthetic' bold fonts by overstriking letters with a slight offset. Look at the .bd command in [1].

Special characters have four-character names beginning with \(, and they may be inserted anywhere. For example,

    1/4 + 1/2 = 3/4

is produced by

    \(14 + \(12 = \(34

In particular, greek letters are all of the form \(*-, where - is an upper or lower case roman letter reminiscent of the greek. Thus to get

    Σ(α×β) → ∞

in bare troff we have to type

    \(*S(\(*a\(mu\(*b) \(-> \(if

That line is unscrambled as follows:

    \(*S    Σ
    (       (
    \(*a    α
    \(mu    ×
    \(*b    β
    )       )
    \(->    →
    \(if    ∞

A complete list of these special names occurs in Appendix A.

In eqn [2] the same effect can be achieved with the input

    SIGMA ( alpha times beta ) -> inf

which is less concise, but clearer to the uninitiated.

Notice that each four-character name is a single character as far as troff is concerned; the 'translate' command

    .tr \(mi\(em

is perfectly clear: it translates the minus sign \(mi into the em dash \(em.

Some characters are automatically translated into others: grave ` and acute ' accents (apostrophes) become open and close single quotes; the combination of ``...'' is generally preferable to the double quotes "...". Similarly a typed minus sign becomes a hyphen. To print an explicit - sign, use \-. To get a backslash printed, use \e.

4. Indents and Line Lengths

troff starts with a line length of 6.5 inches, too wide for 8½×11 paper. To reset the line length, use the .ll command, as in

    .ll 6i

As with .sp, the actual length can be specified in several ways; inches are probably the most intuitive.

The maximum line length provided by the typesetter is 7.5 inches, by the way. To use the full width, you will have to reset the default physical left margin ('page offset'), which is normally slightly less than one inch from the left edge of the paper. This is done by the .po command.

    .po 0

sets the offset as far to the left as it will go.

The indent command .in causes the left margin to be indented by some specified amount from the page offset. If we use .in to move the left margin in, and .ll to move the right margin to the left, we can make offset blocks of text:

    .in 0.3i
    .ll -0.3i
    text to be set into a block
    .ll +0.3i
    .in -0.3i

will create a block that looks like this:

        Pater noster qui est in caelis sanctificetur nomen tuum;
        adveniat regnum tuum; fiat voluntas tua, sicut in caelo,
        et in terra. ... Amen.

Notice the use of '+' and '-' to specify the amount of change. These change the previous setting by the specified amount, rather than just overriding it. The distinction is quite important: .ll +1i makes lines one inch longer; .ll 1i makes them one inch long.

With .in, .ll and .po, the previous value is used if no argument is specified.

To indent a single line, use the 'temporary indent' command .ti. For example, all paragraphs in this memo effectively begin with the command

    .ti 3

Three of what? The default unit for .ti, as for most horizontally oriented commands (.ll, .in, .po), is ems; an em is roughly the width of the letter 'm' in the current point size. (Precisely, an em in size p is p points.) Although inches are usually clearer than ems to people who don't set type for a living, ems have a place: they are a measure of size that is proportional to the current point size. If you want to make text that keeps its proportions regardless of point size, you should use ems for all dimensions. Ems can be specified as scale factors directly, as in .ti 2.5m.

Lines can also be indented negatively if the indent is already positive:

    .ti -0.3i

causes the next line to be moved back three tenths of an inch. Thus to make a decorative initial capital, we indent the whole paragraph, then move the letter 'P' back with a .ti command:

    Pater noster qui est in caelis sanctificetur nomen tuum;
        adveniat regnum tuum; fiat voluntas tua, sicut in caelo,
        et in terra. ... Amen.

Of course, there is also some trickery to make the 'P' bigger (just a '\s36P\s0'), and to move it down from its normal position (see the section on local motions).

5. Tabs

Tabs (the ASCII 'horizontal tab' character) can be used to produce output in columns, or to set the horizontal position of output. Typically tabs are used only in unfilled text. Tab stops are set by default every half inch from the current indent, but can be changed by the .ta command. To set stops every inch, for example,

    .ta 1i 2i 3i 4i 5i 6i

Unfortunately the stops are left-justified only (as on a typewriter), so lining up columns of right-justified numbers can be painful. If you have many numbers, or if you need more complicated table layout, don't use troff directly; use the tbl program described in [3].

For a handful of numeric columns, you can do it this way: Precede every number by enough blanks to make it line up when typed.

    .nf
    .ta 1i 2i 3i
      1 tab   2 tab   3
     40 tab  50 tab  60
    700 tab 800 tab 900
    .fi

Then change each leading blank into the string \0. This is a character that does not print, but that has the same width as a digit. When printed, this will produce

      1        2        3
     40       50       60
    700      800      900

It is also possible to fill up tabbed-over space with some character other than blanks by setting the 'tab replacement character' with the .tc command:

    .ta 1.5i 2.5i
    .tc \(ru        (\(ru is "_")
    Name tab Age tab

produces

    Name ________ Age ________

To reset the tab replacement character to a blank, use .tc with no argument. (Lines can also be drawn with the \l command, described in Section 6.)

troff also provides a very general mechanism called 'fields' for setting up complicated columns. (This is used by tbl). We will not go into it in this paper.

6. Local Motions: Drawing lines and characters

Remember 'Area = πr²' and the big 'P' in the Paternoster. How are they done? troff provides a host of commands for placing characters of any size at any place. You can use them to draw special characters or to tune your output for a particular appearance. Most of these commands are straightforward, but messy to read and tough to type correctly.

If you won't use eqn, subscripts and superscripts are most easily done with the half-line local motions \u and \d. To go back up the page half a point-size, insert a \u at the desired place; to go down, insert a \d. (\u and \d should always be used in pairs, as explained below.) Thus

    Area = \(*pr\u2\d

produces

    Area = πr²

To make the '2' smaller, bracket it with \s-2...\s0. Since \u and \d refer to the current point size, be sure to put them either both inside or both outside the size changes, or you will get an unbalanced vertical motion.

Sometimes the space given by \u and \d isn't the right amount. The \v command can be used to request an arbitrary amount of vertical motion. The in-line command

    \v'(amount)'

causes motion up or down the page by the amount specified in '(amount)'. For example, to move the 'P' down, we used

    .in +0.6i       (move paragraph in)
    .ll -0.3i       (shorten lines)
    .ti -0.3i       (move P back)
    \v'2'\s36P\s0\v'-2'ater noster qui est in caelis ...

A minus sign causes upward motion, while no sign or a plus sign means down the page. Thus \v'-2' causes an upward vertical motion of two line spaces.

There are many other ways to specify the amount of motion:

    \v'0.1i'
    \v'3p'
    \v'-0.5m'

and so on are all legal. Notice that the scale specifier i or p or m goes inside the quotes. Any character can be used in place of the quotes; this is also true of all other troff commands described in this section.

Since troff does not take within-the-line vertical motions into account when figuring out where it is on the page, output lines can have unexpected positions if the left and right ends aren't at the same vertical position. Thus \v, like \u and \d, should always balance upward vertical motion in a line with the same amount in the downward direction.

Arbitrary horizontal motions are also available: \h is quite analogous to \v, except that the default scale factor is ems instead of line spaces. As an example,

    \h'-0.1i'
causes a backwards motion of a tenth of an inch. As a practical matter, consider printing the mathematical symbol '>>'. The default spacing is too wide, so eqn replaces this by

    >\h'-0.3m'>

to produce >>.

Frequently \h is used with the 'width function' \w to generate motions equal to the width of some character string. The construction

    \w'thing'

is a number equal to the width of 'thing' in machine units (1/432 inch). All troff computations are ultimately done in these units. To move horizontally the width of an 'x', we can say

    \h'\w'x'u'

As we mentioned above, the default scale factor for all horizontal dimensions is m, ems, so here we must have the u for machine units, or the motion produced will be far too large. troff is quite happy with the nested quotes, by the way, so long as you don't leave any out.

As a live example of this kind of construction, all of the command names in the text, like .sp, were done by overstriking with a slight offset. The commands for .sp are

    .sp\h'-\w'.sp'u'\h'1u'.sp

That is, put out '.sp', move left by the width of '.sp', move right 1 unit, and print '.sp' again. (Of course there is a way to avoid typing that much input for each command name, which we will discuss in Section 11.)

There are also several special-purpose troff commands for local motion. We have already seen \0, which is an unpaddable white space of the same width as a digit. 'Unpaddable' means that it will never be widened or split across a line by line justification and filling. There is also \(blank), which is an unpaddable character the width of a space, \|, which is half that width, \^, which is one quarter of the width of a space, and \&, which has zero width. (This last one is useful, for example, in entering a text line which would otherwise begin with a '.'.)

The command \o, used like

    \o'set of characters'

causes (up to 9) characters to be overstruck, centered on the widest. This is nice for accents, as in

    syst\o"e\(ga"me t\o"e\(aa"l\o"e\(aa"phonique

which makes

    système téléphonique

The accents are \(ga and \(aa, or \` and \'; remember that each is just one character to troff.

You can make your own overstrikes with another special convention, \z, the zero-motion command. \zx suppresses the normal horizontal motion after printing the single character x, so another character can be laid on top of it. Although sizes can be changed within \o, it centers the characters on the widest, and there can be no horizontal or vertical motions, so \z may be the only way to get what you want: a set of nested squares of increasing size is produced by

    .sp 2
    \s8\z\(sq\s14\z\(sq\s22\z\(sq\s36\(sq

The .sp is needed to leave room for the result.

As another example, an extra-heavy semicolon can be constructed with a big comma and a big period above it:

    \s+6\z,\v'-0.25m'.\v'0.25m'\s0

'0.25m' is an empirical constant.

A more ornate overstrike is given by the bracketing function \b, which piles up characters vertically, centered on the current baseline. Thus we can get big brackets, constructing them with piled-up smaller pieces:

    { [ x ] }

by typing in only this:

    .sp
    \b'\(lt\(lk\(lb' \b'\(lc\(lf' x \b'\(rc\(rf' \b'\(rt\(rk\(rb'

troff also provides a convenient facility for drawing horizontal and vertical lines of arbitrary length with arbitrary characters. \l'1i' draws a line one inch long, like this: ________. The length can be followed by the character to use if the _ isn't appropriate; \l'0.5i.' draws a half-inch line of dots: ............... The construction \L is entirely analogous, except that it draws a vertical line instead of horizontal.

7. Strings

Obviously if a paper contains a large number of occurrences of an acute accent over a letter 'e', typing \o"e\'" for each é would be a great nuisance.

Fortunately, troff provides a way in which you can store an arbitrary collection of text in a 'string', and thereafter use the string name as a shorthand for its contents. Strings are one of several troff mechanisms whose judicious use lets you type a document with less effort and organize it so that extensive format changes can be made with few editing changes.

A reference to a string is replaced by whatever text the string was defined as. Strings are defined with the command .ds. The line

    .ds e \o"e\'"

defines the string e to have the value \o"e\'"

String names may be either one or two characters long, and are referred to by \*x for one character names or \*(xy for two character names. Thus to get téléphone, given the definition of the string e as above, we can say t\*el\*ephone.

If a string must begin with blanks, define it as

    .ds xx "      text

The double quote signals the beginning of the definition. There is no trailing quote; the end of the line terminates the string.

A string may actually be several lines long; if troff encounters a \ at the end of any line, it is thrown away and the next line added to the current one. So you can make a long string simply by ending each line but the last with a backslash:

    .ds xx this \
    is a very \
    long string

Strings may be defined in terms of other strings, or even in terms of themselves; we will discuss some of these possibilities later.

8. Introduction to Macros

Before we can go much further in troff, we need to learn a bit about the macro facility. In its simplest form, a macro is just a shorthand notation quite similar to a string. Suppose we want every paragraph to start in exactly the same way, with a space and a temporary indent of two ems:

    .sp
    .ti +2m

Then to save typing, we would like to collapse these into one shorthand line, a troff 'command' like

    .PP

that would be treated by troff exactly as

    .sp
    .ti +2m

.PP is called a macro. The way we tell troff what .PP means is to define it with the .de command:

    .de PP
    .sp
    .ti +2m
    ..

The first line names the macro (we used '.PP' for 'paragraph', and upper case so it wouldn't conflict with any name that troff might already know about). The last line .. marks the end of the definition. In between is the text, which is simply inserted whenever troff sees the 'command' or macro call

    .PP

A macro can contain any mixture of text and formatting commands.

The definition of .PP has to precede its first use; undefined macros are simply ignored. Names are restricted to one or two characters.

Using macros for commonly occurring sequences of commands is critically important. Not only does it save typing, but it makes later changes much easier. Suppose we decide that the paragraph indent is too small, the vertical space is much too big, and roman font should be forced. Instead of changing the whole document, we need only change the definition of .PP to something like

    .de PP          \" paragraph macro
    .sp 2p
    .ti +3m
    .ft R
    ..

and the change takes effect everywhere we used .PP.

\" is a troff command that causes the rest of the line to be ignored. We use it here to add comments to the macro definition (a wise idea once definitions get complicated).

As another example of macros, consider these two which start and end a block of offset, unfilled text, like most of the examples in this paper:

    .de BS          \" start indented block
    .sp
    .nf
    .in +0.3i
    ..
    .de BE          \" end indented block
    .sp
    .fi
    .in -0.3i
    ..

Now we can surround text like

    Copy to
    John Doe
    Richard Roberts
    Stanley Smith

by the commands .BS and .BE, and it will come out as it did above. Notice that we indented by .in +0.3i instead of .in 0.3i. This way we can nest our uses of .BS and .BE to get blocks within blocks.

If later on we decide that the indent should be 0.5i, then it is only necessary to change the definitions of .BS and .BE, not the whole paper.

9. Titles, Pages and Numbering

This is an area where things get tougher, because nothing is done for you automatically. Of necessity, some of this section is a cookbook, to be copied literally until you get some experience.

Suppose you want a title at the top of each page, saying just

    left top            center top            right top

In roff, one can say

    .he 'left top'center top'right top'
    .fo 'left bottom'center bottom'right bottom'

to get headers and footers automatically on every page. Alas, this doesn't work in troff, a serious hardship for the novice. Instead you have to do a lot of specification.

You have to say what the actual title is (easy); when to print it (easy enough); and what to do at and around the title line (harder). Taking these in reverse order, first we define a macro .NP (for 'new page') to process titles and the like at the end of one page and the beginning of the next:

    .de NP
    'bp
    'sp 0.5i
    .tl 'left top'center top'right top'
    'sp 0.3i
    ..

To make sure we're at the top of a page, we issue a 'begin page' command 'bp, which causes a skip to top-of-page (we'll explain the ' shortly). Then we space down half an inch, print the title (the use of .tl should be self explanatory; later we will discuss parameterizing the titles), space another 0.3 inches, and we're done.

To ask for .NP at the bottom of each page, we have to say something like 'when the text is within an inch of the bottom of the page, start the processing for a new page.' This is done with a 'when' command .wh:

    .wh -1i NP

(No '.' is used before NP; this is simply the name of a macro, not a macro call.) The minus sign means 'measure up from the bottom of the page', so '-1i' means 'one inch from the bottom'.

The .wh command appears in the input outside the definition of .NP; typically the input would be

    .de NP
    ...
    ..
    .wh -1i NP

Now what happens? As text is actually being output, troff keeps track of its vertical position on the page, and after a line is printed within one inch from the bottom, the .NP macro is activated. (In the jargon, the .wh command sets a trap at the specified place, which is 'sprung' when that point is passed.) .NP causes a skip to the top of the next page (that's what the 'bp was for), then prints the title with the appropriate margins.

Why 'bp and 'sp instead of .bp and .sp? The answer is that .sp and .bp, like several other commands, cause a break to take place. That is, all the input text collected but not yet printed is flushed out as soon as possible, and the next input line is guaranteed to start a new line of output. If we had used .sp or .bp in the .NP macro, this would cause a break in the middle of the current output line when a new page is started. The effect would be to print the left-over part of that line at the top of the page, followed by the next input line on a new output line. This is not what we want. Using ' instead of . for a command tells troff that no break is to take place: the output line currently being filled should not be forced out before the space or new page.

The list of commands that cause a break is short and natural:

    .bp  .br  .ce  .fi  .nf  .sp  .in  .ti

All others cause no break, regardless of whether you use a . or a '. If you really need a break, add a .br command at the appropriate place.

One other thing to beware of: if you're changing fonts or point sizes a lot, you may find that if you cross a page boundary in an unexpected font or size, your titles come out in that size and font instead of what you intended. Furthermore, the length of a title is independent of the current line length, so titles will come out at the default length of 6.5 inches unless you change it, which is done with the .lt command.

There are several ways to fix the problems of point sizes and fonts in titles. For the simplest applications, we can change .NP to set the proper size and font for the title, then restore the previous values, like this:

    .de NP
    'bp
    'sp 0.5i
    .ft R           \" set title font to roman
    .ps 10          \" and size to 10 point
    .lt 6i          \" and length to 6 inches
    .tl 'left'center'right'
    .ps             \" revert to previous size
    .ft P           \" and to previous font
    'sp 0.3i
    ..

This version of .NP does not work if the fields in the .tl command contain size or font changes. To cope with that requires troff's 'environment' mechanism, which we will discuss in Section 13.

To get a footer at the bottom of a page, you can modify .NP so it does some processing before the 'bp command, or split the job into a footer macro invoked at the bottom margin and a header macro invoked at the top of the page. These variations are left as exercises.

Output page numbers are computed automatically as each page is produced (starting at 1), but no numbers are printed unless you ask for them explicitly. To get page numbers printed, include the character % in the .tl line at the position where you want the number to appear. For example

    .tl ''- % -''

centers the page number inside hyphens, as on this page. You can set the page number at any time with either .bp n, which immediately starts a new page numbered n, or with .pn n, which sets the page number for the next page but doesn't cause a skip to the new page. Again, .bp +n sets the page number to n more than its current value; .bp means .bp +1.

10. Number Registers and Arithmetic

troff has a facility for doing arithmetic, and for defining and using variables with numeric values, called number registers. Number registers, like strings and macros, can be useful in setting up a document so it is easy to change later. And of course they serve for any sort of arithmetic computation.

Like strings, number registers have one or two character names. They are set by the .nr command, and are referenced anywhere by \nx (one character name) or \n(xy (two character name).

There are quite a few pre-defined number registers maintained by troff, among them % for the current page number; nl for the current vertical position on the page; dy, mo and yr for the current day, month and year; and .s and .f for the current size and font. (The font is a number from 1 to 4.) Any of these can be used in computations like any other register, but some, like .s and .f, cannot be changed with .nr.

As an example of the use of number registers, in the -ms macro package [4], most significant parameters are defined in terms of the values of a handful of number registers. These include the point size for text, the vertical spacing, and the line and title lengths. To set the point size and vertical spacing for the following paragraphs, for example, a user may say

    .nr PS 9
    .nr VS 11

The paragraph macro .PP is defined (roughly) as follows:

    .de PP
    .ps \\n(PS      \" reset size
    .vs \\n(VSp     \" spacing
    .ft R           \" font
    .sp 0.5v        \" half a line
    .ti +3m
    ..

This sets the font to Roman and the point size and line spacing to whatever values are stored in the number registers PS and VS.

Why are there two backslashes? This is the eternal problem of how to quote a quote. When troff originally reads the macro definition, it peels off one backslash to see what's coming next. To ensure that another is left in the definition when the macro is used, we have to put in two backslashes in the definition. If only one backslash is used, point size and vertical spacing will be frozen at the time the macro is defined, not when it is used.

Protecting by an extra layer of backslashes is only needed for \n, \*, \$ (which we haven't come to yet), and \ itself. Things like \s, \f, \h, \v, and so on do not need an extra backslash, since they are converted by troff to an internal code immediately upon being seen.

Arithmetic expressions can appear anywhere that a number is expected. As a trivial example,

    .nr PS \\n(PS-2

decrements PS by 2. Expressions can use the arithmetic operators +, -, *, /, % (mod), the relational operators >, >=, <, <=, =, and != (not equal), and parentheses.

Although the arithmetic we have done so far has been straightforward, more complicated things are somewhat tricky. First, number registers hold only integers. troff arithmetic uses truncating integer division, just like Fortran. Second, in the absence of parentheses, evaluation is done left-to-right without any operator precedence (including relational operators). Thus

    7*-4+3/13

becomes '-1'. Number registers can occur anywhere in an expression, and so can scale indicators like p, i, m, and so on (but no spaces). Although integer division causes truncation, each number and its scale indicator is converted to machine units (1/432 inch) before any arithmetic is done, so 1i/2u evaluates to 0.5i correctly.

The scale indicator u often has to appear when you wouldn't expect it, in particular when arithmetic is being done in a context that implies horizontal or vertical dimensions. For example,

    .nr ll 7i/2
    .ll \\n(llu

does just what you want, so long as you don't forget the u on the .ll command.

11. Macros with arguments

The next step is to define macros that can change from one use to the next according to parameters supplied as arguments. To make this work, we need two things: first, when we define the macro, we have to indicate that some parts of it will be provided as arguments when the macro is called. Then when the macro is called we have to provide actual arguments to be plugged into the definition.

Let us illustrate by defining a macro .SM that will print its argument two points smaller than the surrounding text. That is, the macro call

    .SM TROFF

will produce TROFF.

The definition of .SM is

    .de SM
    \s-2\\$1\s+2
    ..

Within a macro definition, the symbol \\$n refers to the nth argument that the macro was called with. Thus \\$1 is the string to be placed in a smaller point size when .SM is called.
As a slightly more complicated version, the following definition of .SM permits optional second and third arguments that will be printed in the normal size: .de SM WSA\s=2\S1\s +2\\S2 M T7/2 would seem obvious enough - 3% inches. Sorry. Remember that the defauit units for horizontal parameters like .1l are ems. That's really *T ems / 2 inches’. and when t(ranslated into machine units, it becomes zero. How about Sorry, still no good - the ‘2" is ‘2 ems’. so It/ <" 18 small, although not zero. the macro is SM TROFF ), produces TROFF), while SM TROFF ). ( A T2 oMo paBQ Arguments not provided when called are treated as empty, so You muse use A 7i/ 24 So again. a safe rule is to attach a scale indicator o every number, even constants. For arithmetic done within a2 .nr command, there is no implication of horizontal or vertical dimension, so the default units are ‘units’, and 7i/2 and 71/2u mean the same thing. Thus produces (TROFF). [t is convenient 0 reverse the order of arguments because trailing punciuation is much more common than leading. By the way, the number of arguments that a macro was called with is available 11 number register .3. The following macro .BD is the one used to make the ‘bold roman’ we have beea using for troff command names in text [t combines horizontal motions. width computations. and argument rearrangement A Troff Tutorial 5-93 .de BD with something like VEMNSINCINSIVR =\ w \ ST u+ 1 \\SI\[P\\$2 The \h and \w commands need no ds CT -% - exira backslash, as we discussed above. The \& is there in case the argument begins with a period. Two backslashes are needed with the \\Sa commands, though. to protect one of them when the macro is being defined. example will make this Perhaps a second clearer. Consider a macro called SH which produces section headings rather like those in this paper, with the sec- tions numbered automatically, and the title in bold in a smaller size. 
The use is to give just the page number betwesa hvpnens (as on the top of this page). but a user could supply private definitions for any of the strings. 12. Conditionals Suppose we want the .SH macro to leave two extra inches of space just before section 1, but nowhere else. The cleanest way to do that is to test inside the .SH macro whether the section number is |. and add some space if it is. The .if command provides the conditional test that we can add just before the heading line is output: A \\n(SH=| _sp 2i SH “Section title ... (If the argument to a macro is to conuin blanks, thea it must be surrounded by double quotes, unlike a string, where only one leading quote is permitted.) Here is the definition of the .SH macro: e SH O .de SH \" initialize section number \" first section oaly The condition after the .if can be any arithmetic or logical expression. If the condition is logically true, or arithmetically greater than zero. the rest of the line is treated as if it were lext — here a2 command. [f the condition is false, or zero or negative, the rest of the line is skipped. It is possibie to do more than one com- 3p 0.3i ft'B a0 SH\\a(SH+1 .ps \\a(PS—1 Wa(SH. \\St -ps \\n(PS .$p 0.3i JtR \” increment number \* decrease PS \" number. title mand if a condition is wrue. Suppose several operations are to be done before section 1. One possibility is (0 define 2 macro Sl and invoke it if we are about to do section | (as determined by an .if). \" restore PS .de S1 -- The section number is kept in number register SH, which is incremented each time just before it is used. (A number register may have the same processing for seczion | - :&e SH if\\n(SH=1 St name as a macro without conflict but a string may not.) We used \\a(SH instead of \n(SH and Wa(PS instead of \a(PS. If we had used \n(SH, we would get the value of the register at the time the macro was defined, not at the time it was used. [f that's what you want, fine, but not here. Similarly, by using \\a(PS. 
we get the point size at the time the macro is called An alternate way is to use the extended form of the .if, like this: A \\n(SH =1 \{-- processing foe section | —-\} The braces \{ and \} must occur in the positions shown or you will get unexpecied extra lines in troff aiso provides an ‘if-else’ con- your output. As an example that does not invoive aumbers, recall our .NP macro which had a .t “left’center’right’ We could make these into parameters by using insiead A N\(LT\\«(CT\\«(RT so the title comes {rom three strings cailed LT, CT and RT. If these are empty. then the tisle will be a blank line. Normaily CT would be set struction, which we will not go into here. A condition can be negated by precading it with !, we get the same effect as 1bove (but less clearly) by using A N\n(SH>1 S1 There are a handful of other conditions that can be tested with .if. For exampie. is the current page even or odd? 5-94 A Troff Tutorial version shown keeps all the processing in one if e .t “even page title” if 0 .Ul “odd page title” place gives facing pages different titles when used inside an appropriate new page macro. Two other coaditions are t and a. which tell you whether the formatter is troff or nroff. does °‘stufl” if sering! is the same as sring2. The character separating the strings can be anything reasonable that is not contained in either string. The strings themselves can reference strings with \*, arguments with \$, and so on. (0o understand and 14. Diversions There are numerous occasions in page layout when it is necessary to store some text for a Foot- the footnote usually appears in the input well before the piace on the page where it is (0 be printed is reached. In fact, the place where it is implies that there must be a way to process the footnote at least enough to decide its size without printing it. troff provides 3 mechanism called a diversion for doing this processing. 
Any part of the output may be diverted into 2 macro instead of being printed, and then at some convenient time 13. Eavironments there is a potential problem when going across a page boundary: parameters like size and font for a page title may well be different (rom those in effect in the text when the page boundary occurs. troff provides a very general way to deal with this and similas situations. There are three ‘eavironments’, each of which has indeperidently settable versions of many of the parameters associated with processing, including size, font,-line and title lengths, fill/nofill mode, b stops. and even partiaily collected lines. easier output normally depends on how big it is, which if “stringlstring2” souff mentioned, thus notes are the most obvious example: the text of Finally, string comparisons may be made in an .if° we is period of time without actually printing it. Af t troff stuff ... Aif n aroff stuff ... As and change. Thus the titling problem may be readily solved by processing the main text in one environment and titles in a separate one with its own suitable parameters. The command .ev n shifls to environment n; 8 must be 0, | or 2. The command .ev with noe argument returns (0 the previous egviron- ment. Environment names are maintained in a stack, so calls for different environments may be nested and unwound consistently. Suppose we say that the main text is processed in eavironment 0. which is where troff begins by defauit. Then we can modify the new page macro .NP (o process titles in eavironment l like this: the macro may be put back into the input. The command .di xy begins a diversion = all subsequent output is collected into the macro Xy until the command .di with no arguments is encountered. This terminates the diversion. The processed text is available at any time thereafter, simply by giving the command XY The vertical size of the last finished diversion is contained in the built-in number register dn. As a simple example. 
suppose we want (0 implement a ‘keep-release’ operation. so that text between the commands .KS and .KE will not be split across a page boundary (as for a figure or table). Clearly, when a KS is encountered, we have to begin diverting the output so we can find out how big it is. Then when a .KE is seen, we decide whether the diverted text will fit on the current page, and print it either there if it fits, or at the top of the next page if it doesn’t. So: .de KS .br ev | A di XX \° start keep \" start fresh line \* collect in new eavironment \" make it filled text \" collect in XX .de NP ev i At 6i Lt R .ps 10 \® shift t0 new eavironment \" set parameters here ... anly other processing ... v \" return (o previous enviroament .de KE \" end keep br \" get last partial line di \" end diver<inn Af\\n(dn> =\\n(.t .bp \" bp if doesn’t tit af \" bring it back in no-{ill XX \" text eV \" return to normal eavironment [t is also possible to initialize the parameters for an eaviroament outside the NP macro, but the Recall that number register nl is the current A Troff Tutorial 5-95 position on the output page. Since output was being diverted. this remains at its value when the diversion started. dn is the amount of text in the diversion; .t (another built-in register) is the distance (o the next trap. which we assume (s at the bottom margin of the page. [f the diversion is large enough (0 go past the trap, the .if is satisfied, and a .bp is issued. [n either case, the divested output is then brought back with .XX. [t is essential to bring it back in no-fill mode so troff will do no further processiag on it. This is not the most general keep-release, nor is it robust in the face of all conceivable inputs, but it would require more space than we have here to write it in {ull generality. This section is not intended to teach everything about diversions., but to sketch out eaough that you can read axisting macro packages with some comprehensiorn. 
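The arithmetic rules of Section 10 (integers only, truncating division, strict left-to-right evaluation with no operator precedence) are easy to get wrong by habit, so a small model can help check an expression before it goes into a macro. The sketch below is written in Python rather than troff so that it can be run on its own; it is an illustration of the stated rules, not troff's actual evaluator. The names `evaluate` and `tokenize` are made up for this example. It deliberately leaves out parentheses, relational operators, number registers and scale indicators, and it assumes that division truncates toward zero and that % follows the same convention for negative operands.

```python
import re

# Tokens: unsigned integers and the five arithmetic operators.
_TOKEN = re.compile(r"\s*(\d+|[-+*/%])")


def _trunc_div(a, b):
    """Integer division truncating toward zero (Python's // floors instead)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def _trunc_mod(a, b):
    """Remainder consistent with _trunc_div (an assumption for negatives)."""
    return a - b * _trunc_div(a, b)


_OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": _trunc_div,
    "%": _trunc_mod,
}


def tokenize(expr):
    """Split an expression into number and operator tokens."""
    expr = expr.strip()
    tokens, pos = [], 0
    while pos < len(expr):
        m = _TOKEN.match(expr, pos)
        if m is None:
            raise ValueError(f"cannot parse {expr[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def evaluate(expr):
    """Evaluate strictly left to right, ignoring conventional precedence."""
    tokens = tokenize(expr)
    pos = 0

    def operand():
        # An operand is an integer with any number of leading signs.
        nonlocal pos
        sign = 1
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            if tokens[pos] == "-":
                sign = -sign
            pos += 1
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise ValueError("expected a number")
        value = int(tokens[pos])
        pos += 1
        return sign * value

    result = operand()
    while pos < len(tokens):
        op = tokens[pos]
        pos += 1
        if op not in _OPS:
            raise ValueError(f"expected an operator, got {op!r}")
        # Apply each operator to the running result as soon as it is read.
        result = _OPS[op](result, operand())
    return result


print(evaluate("7*-4+3/13"))  # -1: ((7 * -4) + 3) / 13, truncated
print(evaluate("7+3*2"))      # 20, not the 13 that precedence would give
```

Under these rules 7*-4+3/13 comes out as -1, and an expression such as 7+3*2 gives 20, which is why attaching parentheses (and, in real troff, scale indicators) to anything non-trivial is the safe habit.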
Acknowledgements

I am deeply indebted to J. F. Ossanna, the author of troff, for his repeated patient explanations of fine points, and for his continuing willingness to adapt troff to make other uses easier. I am also grateful to Jim Blinn, Ted Dolotta, Doug McIlroy, Mike Lesk and Joel Sturman for helpful comments on this paper.

References

[1] J. F. Ossanna, NROFF/TROFF User's Manual, Bell Laboratories Computing Science Technical Report 54, 1976.
[2] B. W. Kernighan, A System for Typesetting Mathematics - User's Guide (Second Edition), Bell Laboratories Computing Science Technical Report 17, 1977.
[3] M. E. Lesk, TBL - A Program to Format Tables, Bell Laboratories Computing Science Technical Report 49, 1976.
[4] M. E. Lesk, Typing Documents on UNIX, Bell Laboratories, 1978.
[5] J. R. Mashey and D. W. Smith, PWB/MM - Programmer's Workbench Memorandum Macros, Bell Laboratories internal memorandum.

Appendix A: Phototypesetter Character Set

These characters exist in roman, italic, and bold. To get the one on the left, type the four-character name on the right.

     ff  \(ff      fi  \(fi      fl  \(fl      ffi \(Fi      ffl \(Fl
     _   \(ru      (em dash) \(em   ¼  \(14    ½   \(12      ¾   \(34
     ©   \(co      °   \(de      †   \(dg      ′   \(fm      ¢   \(ct
     ®   \(rg      •   \(bu      □   \(sq      -   \(hy
     (In bold, \(sq is ■.)

The following are special-font characters:

     +   \(pl      −   \(mi      ×   \(mu      ÷   \(di
     =   \(eq      ≡   \(==      ≥   \(>=      ≤   \(<=
     ≠   \(!=      ±   \(+-      ¬   \(no      /   \(sl
     ∼   \(ap      ≃   \(~=      ∝   \(pt      ∇   \(gr
     →   \(->      ←   \(<-      ↑   \(ua      ↓   \(da
     ∫   \(is      ∂   \(pd      ∞   \(if      √   \(sr
     ⊂   \(sb      ⊃   \(sp      ∪   \(cu      ∩   \(ca
     ⊆   \(ib      ⊇   \(ip      ∈   \(mo      ∅   \(es
     ´   \(aa      `   \(ga      ○   \(ci      (Bell System logo) \(bs
     §   \(sc      ‡   \(dd      ☜   \(lh      ☞   \(rh
     ⎧   \(lt      ⎫   \(rt      ⌈   \(lc      ⌉   \(rc
     ⎩   \(lb      ⎭   \(rb      ⌊   \(lf      ⌋   \(rf
     ⎨   \(lk      ⎬   \(rk      ⎪   \(bv      ς   \(ts
     |   \(br      |   \(or      _   \(ul      ‾   \(rn
     *   \(**

These four characters also have two-character names. The ´ is the apostrophe on terminals; the ` is the other quote mark.
     ´   \'        `   \`        −   \-        _   \_

These characters exist only on the special font, but they do not have four-character names:

     "   {   }   <   >   ~   ^   \   #   @

For greek, precede the roman letter by \(* to get the corresponding greek; for example, \(*a is α.

     a b g d e z y h i k l m n c o p r s t u f x q w
     α β γ δ ε ζ η θ ι κ λ μ ν ξ ο π ρ σ τ υ φ χ ψ ω

     A B G D E Z Y H I K L M N C O P R S T U F X Q W
     Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο Π Ρ Σ Τ Υ Φ Χ Ψ Ω

A System for Typesetting Mathematics

Brian W. Kernighan and Lorinda L. Cherry
Bell Laboratories
Murray Hill, New Jersey 07974

ABSTRACT

This paper describes the design and implementation of a system for typesetting mathematics. The language has been designed to be easy to learn and to use by people (for example, secretaries and mathematical typists) who know neither mathematics nor typesetting. Experience indicates that the language can be learned in an hour or so, for it has few rules and fewer exceptions. For typical expressions, the size and font changes, positioning, line drawing, and the like necessary to print according to mathematical conventions are all done automatically. For example, the input

     sum from i=0 to infinity x sub i = pi over 2

produces

$$\sum_{i=0}^{\infty} x_i = \frac{\pi}{2}$$

The syntax of the language is specified by a small context-free grammar; a compiler-compiler is used to make a compiler that translates this language into typesetting commands. Output may be produced on either a phototypesetter or on a terminal with forward and reverse half-line motions. The system interfaces directly with text formatting programs, so mixtures of text and mathematics may be handled simply.

This paper is a revision of a paper originally published in CACM, March, 1975.

1. Introduction

"Mathematics is known in the trade as difficult, or penalty, copy because it is slower, more difficult, and more expensive to set in type than any other kind of copy normally occurring in books and journals." [1]

One difficulty with mathematical text is the multiplicity of characters, sizes, and fonts. An expression such as

$$\lim_{x \to \pi/2} (\tan x)^{\sin 2x} = 1$$

requires an intimate mixture of roman, italic and greek letters, in three sizes, and a special character or two. ("Requires" is perhaps the wrong word, but mathematics has its own typographical conventions which are quite different from those of ordinary text.) Typesetting such an expression by traditional methods is still an essentially manual operation.

A second difficulty is the two dimensional character of mathematics, which the superscript and limits in the preceding example showed in its simplest form. This is carried further by

$$a_0 + \frac{b_1}{a_1 + \frac{b_2}{a_2 + \frac{b_3}{a_3 + \cdots}}}$$

and still further by

$$\int \frac{dx}{a e^{mx} - b e^{-mx}} = \begin{cases} \dfrac{1}{2m\sqrt{ab}} \log \dfrac{\sqrt{a}\,e^{mx} - \sqrt{b}}{\sqrt{a}\,e^{mx} + \sqrt{b}} \\ \dfrac{1}{m\sqrt{ab}} \tanh^{-1}\left(\dfrac{\sqrt{a}}{\sqrt{b}}\,e^{mx}\right) \\ \dfrac{-1}{m\sqrt{ab}} \coth^{-1}\left(\dfrac{\sqrt{a}}{\sqrt{b}}\,e^{mx}\right) \end{cases}$$

These examples also show line-drawing, built-up characters like braces and radicals, and a spectrum of positioning problems. (Section 6 shows what a user has to type to produce these on our system.)

2. Photocomposition

Photocomposition techniques can be used to solve some of the problems of typesetting mathematics. A phototypesetter is a device which exposes a piece of photographic paper or film, placing characters wherever they are wanted. The Graphic Systems phototypesetter [2] on the UNIX operating system [3] works by shining light through a character stencil. The character is made the right size by lenses, and the light beam directed by fiber optics to the desired place on a piece of photographic paper. The exposed paper is developed and typically used in some form of photo-offset reproduction.

On UNIX, the phototypesetter is driven by a formatting program called TROFF [4]. TROFF was designed for setting running text. It also provides all of the facilities that one needs for doing mathematics, such as arbitrary horizontal and vertical motions, line-drawing, size changing, but the syntax for describing these special operations is difficult to learn, and difficult even for experienced users to type correctly.

For this reason we decided to use TROFF as an "assembly language," by designing a language for describing mathematical expressions, and compiling it into TROFF.

3. Language Design

The fundamental principle upon which we based our language design is that the language should be easy to use by people (for example, secretaries) who know neither mathematics nor typesetting.

This principle implies several things. First, "normal" mathematical conventions about operator precedence, parentheses, and the like cannot be used, for to give special meaning to such characters means that the user has to understand what he or she is typing. Thus the language should not assume, for instance, that parentheses are always balanced, for they are not in the half-open interval $(a,b]$. Nor should it assume that $\sqrt{a+b}$ can be replaced by $(a+b)^{1/2}$, or that $1/(1-x)$ is better written as $\frac{1}{1-x}$ (or vice versa).

Second, there should be relatively few rules, keywords, special symbols and operators, and the like. This keeps the language easy to learn and remember. Furthermore, there should be few exceptions to the rules that do exist: if something works in one situation, it should work everywhere. If a variable can have a subscript, then a subscript can have a subscript, and so on without limit.

Third, "standard" things should happen automatically. Someone who types "x=y+z+1" should get "$x=y+z+1$". Subscripts and superscripts should automatically be printed in an appropriately smaller size, with no special intervention. Fraction bars have to be made the right length and positioned at the right height. And so on. Indeed a mechanism for overriding default actions has to exist, but its application is the exception, not the rule.

We assume that the typist has a reasonable picture (a two-dimensional representation) of the desired final form, as might be handwritten by the author of a paper. We also assume that the input is typed on a computer terminal much like an ordinary typewriter. This implies an input alphabet of perhaps 100 characters, none of them special.

A secondary, but still important, goal in our design was that the system should be easy to implement, since neither of the authors had any desire to make a long-term project of it. Since our design was not firm, it was also necessary that the program be easy to change at any time.

To make the program easy to build and to change, and to guarantee regularity ("it should work everywhere"), the language is defined by a context-free grammar, described in Section 5. The compiler for the language was built using a compiler-compiler.

A priori, the grammar/compiler-compiler approach seemed the right thing to do. Our subsequent experience leads us to believe that any other course would have been folly. The original language was designed in a few days. Construction of a working system sufficient to try significant examples required perhaps a person-month. Since then, we have spent a modest amount of additional time over several years tuning, adding facilities, and occasionally changing the language as users make criticisms and suggestions.

We also decided quite early that we would let TROFF do our work for us whenever possible. TROFF is quite a powerful program, with a macro facility, text and arithmetic variables, numerical computation and testing, and conditional branching. Thus we have been able to avoid writing a lot of mundane but tricky software. For example, we store no text strings, but simply pass them on to TROFF. Thus we avoid having to write a storage management package. Furthermore, we have been able to isolate ourselves from most details of the particular device and character set currently in use. For example, we let TROFF compute the widths of all strings of characters; we need know nothing about them.

A third design goal is special to our environment. Since our program is only useful for typesetting mathematics, it is necessary that it interface cleanly with the underlying typesetting language for the benefit of users who want to set intermingled mathematics and text (the usual case). The standard mode of operation is that when a document is typed, mathematical expressions are input as part of the text, but marked by user settable delimiters. The program reads this input and treats as comments those things which are not mathematics, simply passing them through untouched. At the same time it converts the mathematical input into the necessary TROFF commands. The resulting output is passed directly to TROFF where the comments and the mathematical parts both become text and/or TROFF commands.

4. The Language

We will not try to describe the language precisely here; interested readers may refer to the appendix for more details. Throughout this section, we will write expressions exactly as they are handed to the typesetting program (hereinafter called "EQN"), except that we won't show the delimiters that the user types to mark the beginning and end of the expression. The interface between EQN and TROFF is described at the end of this section.

As we said, typing x=y+z+1 should produce $x=y+z+1$, and indeed it does. Variables are made italic, operators and digits become roman, and normal spacings between letters and operators are altered slightly to give a more pleasing appearance.

Input is free-form. Spaces and new lines in the input are used by EQN to separate pieces of the input; they are not used to create space in the output. Thus

     x = y
        + z + 1

also gives $x=y+z+1$. Free-form input is easier to type initially; subsequent editing is also easier, for an expression may be typed as many short lines.

Extra white space can be forced into the output by several characters of various sizes. A tilde "~" gives a space equal to the normal word spacing in text; a circumflex gives half this much, and a tab character spaces to the next tab stop.

Spaces (or tildes, etc.) also serve to delimit pieces of the input. For example, to get

$$f(t) = 2\pi \int \sin(\omega t)\,dt$$

we write

     f(t) = 2 pi int sin ( omega t )dt

Here spaces are necessary in the input to indicate that sin, pi, int, and omega are special, and potentially worth special treatment. EQN looks up each such string of characters in a table, and if appropriate gives it a translation. In this case, pi and omega become their greek equivalents, int becomes the integral sign (which must be moved down and enlarged so it looks "right"), and sin is made roman, following conventional mathematical practice. Parentheses, digits and operators are automatically made roman wherever found.

Fractions are specified with the keyword over:

     a+b over c+d+e = 1

produces

$$\frac{a+b}{c+d+e} = 1$$

Similarly, subscripts and superscripts are introduced by the keywords sub and sup:

$$x^2 + y^2 = z^2$$

is produced by

     x sup 2 + y sup 2 = z sup 2

The spaces after the 2's are necessary to mark the end of the superscripts; similarly the keyword sup has to be marked off by spaces or some equivalent delimiter. The return to the proper baseline is automatic. Multiple levels of subscripts or superscripts are of course allowed: "x sup y sup z" is $x^{y^z}$. The construct "something sub something sup something" is recognized as a special case, so "x sub i sup 2" is $x_i^2$ instead of ${x_i}^2$.

More complicated expressions can now be formed with these primitives:

$$\frac{\partial^2 f}{\partial x^2} = \frac{x^2}{a^2} + \frac{y^2}{b^2}$$

is produced by

     {partial sup 2 f} over {partial x sup 2} =
     x sup 2 over a sup 2 + y sup 2 over b sup 2

Braces {} are used to group objects together; in this case they indicate unambiguously what goes over what on the left-hand side of the expression. The language defines the precedence of sup to be higher than that of over, so no braces are needed to get the correct association on the right side. Braces can always be used when in doubt about precedence.

The braces convention is an example of the power of using a recursive grammar to define the language. It is part of the language that if a construct can appear in some context, then any expression in braces can also occur in that context.

There is a sqrt operator for making square roots of the appropriate size: "sqrt a+b" produces $\sqrt{a+b}$, and

     x = {-b +- sqrt{b sup 2 -4ac}} over 2a

is

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

Since large radicals look poor on our typesetter, sqrt is not useful for tall expressions.

Limits on summations, integrals and similar constructions are specified with the keywords from and to. To get

$$\sum_{i=0}^{\infty} x_i \to 0$$

we need only type

     sum from i=0 to inf x sub i -> 0

Centering and making the $\Sigma$ big enough and the limits smaller are all automatic. The from and to parts are both optional, and the central part (e.g., the $\Sigma$) can in fact be anything:

     lim from {x -> pi /2} ( tan~x) = inf

is

$$\lim_{x \to \pi/2} (\tan x) = \infty$$

Again, the braces indicate just what goes into the from part.

There is a facility for making braces, brackets, parentheses, and vertical bars of the right height, using the keywords left and right:

     left [ x+y over 2a right ]~=~1

makes

$$\left[ \frac{x+y}{2a} \right] = 1$$

A left need not have a corresponding right, as we shall see in the next example. Any characters may follow left and right, but generally only various parentheses and bars are meaningful.

Big brackets, etc., are often used with another facility, called piles, which make vertical piles of objects. For example, to get

$$sign(x) \equiv \left\{ \begin{array}{rl} 1 & \text{if } x>0 \\ 0 & \text{if } x=0 \\ -1 & \text{if } x<0 \end{array} \right.$$

we can type

     sign (x) ~==~ left {
        rpile {1 above 0 above -1}
        ~~lpile {if above if above if}
        ~~lpile {x>0 above x=0 above x<0}

The construction "left {" makes a left brace big enough to enclose the "rpile {...}", which is a right-justified pile of "above ... above ...". "lpile" makes a left-justified pile. There are also centered piles. Because of the recursive language definition, a pile can contain any number of elements; any element of a pile can of course contain piles.

Although EQN makes a valiant attempt to use the right sizes and fonts, there are times when the default assumptions are simply not what is wanted. For instance the italic sign in the previous example would conventionally be in roman. Slides and transparencies often require larger characters than normal text. Thus we also provide size and font changing commands: "size 12 bold {A~x~=~y}" will produce $\mathbf{A\ x\ =\ y}$ in 12 point bold. Size is followed by a number representing a character size in points. (One point is 1/72 inch; this paper is set in 9 point type.)

If necessary, an input string can be quoted in "...", which turns off grammatical significance, and any font or spacing changes that might otherwise be done on it. Thus we can say

     lim~ roman "sup" ~x sub n = 0

to ensure that the supremum doesn't become a superscript:

$$\lim\ \mathrm{sup}\ x_n = 0$$

Diacritical marks, long a problem in traditional typesetting, are straightforward:

$$\underline{\dot{x}} + \hat{x} + \tilde{y} + \hat{X} + \ddot{Y} = z + \bar{Z}$$

is made by typing

     x dot under + x hat + y tilde
     + X hat + Y dotdot = z+Z bar

There are also facilities for globally changing default sizes and fonts, for example for making viewgraphs or for setting chemical equations. The language allows for matrices, and for lining up equations at the same horizontal position.

Finally, there is a definition facility, so a user can say

     define name "..."

at any time in the document; henceforth, any occurrence of the token "name" in an expression will be expanded into whatever was inside the double quotes in its definition. This lets users tailor the language to their own specifications, for it is quite possible to redefine keywords like sup or over. Section 6 shows an example of definitions.

The EQN preprocessor reads intermixed text and equations, and passes its output to TROFF. Since TROFF uses lines beginning with a period as control words (e.g., ".ce" means "center the next output line"), EQN uses the sequence ".EQ" to mark the beginning of an equation and ".EN" to mark the end. The ".EQ" and ".EN" are passed through to TROFF untouched, so they can also be used by a knowledgeable user to center equations, number them automatically, etc. By default, however, ".EQ" and ".EN" are simply ignored by TROFF, so by default equations are printed in-line.

".EQ" and ".EN" can be supplemented by TROFF commands as desired; for example, a centered display equation can be produced with the input:

     .ce
     .EQ
     x sub i = y sub i ...
     .EN

Since it is tedious to type ".EQ" and ".EN" around very short expressions (single letters, for instance), the user can also define two characters to serve as the left and right delimiters of expressions. These characters are recognized anywhere in subsequent text. For example if the left and right delimiters have both been set to "#", the input:

     Let #x sub i#, #y# and #alpha# be positive

produces:

     Let $x_i$, $y$ and $\alpha$ be positive

Running a preprocessor is strikingly easy on UNIX. To typeset text stored in file "f", one issues the command:

     eqn f | troff

The vertical bar connects the output of one process (EQN) to the input of another (TROFF).

5. Language Theory

The basic structure of the language is not a particularly original one. Equations are pictured as a set of "boxes," pieced together in various ways. For example, something with a subscript is just a box followed by another box moved downward and shrunk by an appropriate amount. A fraction is just a box centered above another box, at the right altitude, with a line of correct length drawn between them.

The grammar for the language is shown below. For purposes of exposition, we have collapsed some productions. In the original grammar, there are about 70 productions, but many of these are simple ones used only to guarantee that some keyword is recognized early enough in the parsing process. Symbols in capital letters are terminal symbols; lower case symbols are non-terminals, i.e., syntactic categories. The vertical bar | indicates an alternative; the brackets [ ] indicate optional material. A TEXT is a string of non-blank characters or any string inside double quotes; the other terminal symbols represent literal occurrences of the corresponding keyword.

     eqn  : box | eqn box
     box  : text
          | { eqn }
          | box OVER box
          | SQRT box
          | box SUB box | box SUP box
          | [ L | C | R ]PILE { list }
          | LEFT text eqn [ RIGHT text ]
          | box [ FROM box ] [ TO box ]
          | SIZE text box
          | [ROMAN | BOLD | ITALIC] box
          | box [HAT | BAR | DOT | DOTDOT | TILDE]
          | DEFINE text text
     list : eqn | list ABOVE eqn
     text : TEXT

The grammar makes it obvious why there are few exceptions. For example, the observation that something can be replaced by a more complicated something in braces is implicit in the productions:

     eqn : box | eqn box
     box : text | { eqn }

Anywhere a single character could be used, any legal construction can be used.

Clearly, our grammar is highly ambiguous. What, for instance, do we do with the input

     a over b over c ?

Is it

     {a over b} over c

or is it

     a over {b over c} ?

To answer questions like this, the grammar is supplemented with a small set of rules that describe the precedence and associativity of operators. In particular, we specify (more or less arbitrarily) that over associates to the left, so the first alternative above is the one chosen. On the other hand, sub and sup bind to the right, because this is closer to standard mathematical practice. That is, we assume $x^{a^b}$ is $x^{(a^b)}$, not $(x^a)^b$.

The precedence rules resolve the ambiguity in a construction like

     a sup 2 over b

We define sup to have a higher precedence than over, so this construction is parsed as $\frac{a^2}{b}$ instead of $a^{\frac{2}{b}}$.

Naturally, a user can always force a particular parsing by placing braces around expressions.

The ambiguous grammar approach seems to be quite useful. The grammar we use is small enough to be easily understood, for it contains none of the productions that would be normally used for resolving ambiguity. Instead the supplemental information about precedence and associativity (also small enough to be understood) provides the compiler-compiler with the information it needs to make a fast, deterministic parser for the specific language we want. When the language is supplemented by the disambiguating rules, it is in fact LR(1) and thus easy to parse [5].

The output code is generated as the input is scanned. Any time a production of the grammar is recognized, (potentially) some TROFF commands are output.

     Width of output box =
          slightly more than largest input width
     Height of output box =
          slightly more than sum of input heights
     Base of output box =
          slightly more than height of bottom input box
     String describing output box =
          move down;
          move right enough to center bottom box;
          draw bottom box (i.e., copy string for bottom box);
          move up; move left enough to center top box;
          draw top box (i.e., copy string for top box);
          move down and left; draw line full width;
          return to proper base line.

Most of the other productions have equally simple semantic actions. Picturing the output as a set of properly placed boxes makes the right sequence of positioning commands quite obvious. The main difficulty is in finding the right numbers to use for esthetically pleasing positioning.

With a grammar, it is usually clear how to extend the language. For instance, one of our users suggested a TENSOR operator, to make constructions like
For example, when the lexical analyzer reports that it has found a TEXT (i.e., a string of contiguous characters), we have recognized the production: text : TEXT The translation of this is simple. We generate a local name for the string, then hand the name and the string to TROFF, and let TROFF perform the storage management. All we save is the name of the string, its height, and its baseline. As another example, the translation associated with the production box :box OVER box Grammatically, this is easy: it is sufficient to add a production like box : TENSOR { list | Semantically, we need only juggle the boxes to the right places. 6. | Experience There are really interest—how well three aspects of EQN sets mathematics, how well it satisfies its goal of being '‘easy to use.” and how easyv it was (o build. The first question is easily addressed. This entire paper has been set by the program. Readers can judge for themselves whether it is good enough for their purposes. One of our users commented that although the output is not as good as the best hand-set material, it i1s stll better than average, and much better than the worst. In any case. who cares? Printed books cannot compete with the birds and flowers of illuminated manuscripts on either, they some but have esthetic clear grounds, economic advantages. Some of the deficiencies in the output could be cleaned up with more work on our part. For example, we sometimes leave too much space between a roman letter and an italic one. If we were willing to keep track of the fonts involved, we could do this better more of the A System for Typesetting Mathematics 5-103 asub 0 + bsub | over time. {asubl + bsub 2 over {a sub 2 + b sub 3 over {asub3 + ... }}] Some other weaknesses are inherent in our output device. It is hard, for instance, to draw a line of an arbitrary length without getting a perceptible overstrike at one end. 
This is the input for the large integral of Section As to ease of use, at the time of writing, the system has been used by two distinct groups. One user population consists of mathematicians, chemists, physicists, and computer scientists. Their typical reaction has been something like: (1) 1It’s easy to write, although [ make the following mistakes... (2) How do Ido...? (3) It botches the following things.... don’t you fix them? (4) You really need the following features... Why 1, notice the use of definitions: define emx “{e sup mx|" define mab "{m sqrt ab}” define sa "{sqrt a}” define sb "{sqrt b}" int dx over {a emx — be sup —mx| "=~ left { Ipile { 1 over {2 mab) “log~ [sa emx — sb} over {sa emx + sb] above 1 over mab ~ tanh sup —1 ( sa over sb emx ) above -] over mab "~ coth sup =1 ( sa over sb emx ) The learning time is short. A few minutes gives the general flavor, and typing a page or two of a paper generally uncovers most of the misconceptions about how it works. The second user group is much larger, the secretaries and mathematical typists who were the original target of the system. They tend to be enthusiastic converts. They find the language As to ease of construction, we have already mentioned that there are really only a few person-months invested. Much of this time has gone into two things-fine-tuning (what is the most between esthetically pleasing space to use the numerator and denominator of a easy to learn (most are largely self-taught), and fraction?), and changing things found deficient have little trouble producing the output they want. They are of course less critical of the esthetics of their output than users trained in mathematics. After a transition period, most find using a computer more interesting than a small, essentially unconnected modules for code regular (ypewriter. 
miscellany associated with The main difficulty that users have seems to be remembering that a blank is a delimiter; even experienced users use blanks where they shouldn’t and omit them when they are needed. A common instance is typing by our users (shouldn't a tilde be a delimiter?). The macro facility. instead of number of input files and the The program is now about 1600 lines of C (6], a high-level language reminiscent About 20 percent of these lines are of BCPL. “print’’ statements, generating the output code. The semantic routines that generate the TROFF devices. S (x,) of a parser which we did not have to write, and some accommodate which produces consists generation, a simple lexical analyzer, a canned actual f(x sub i) program commands other can formatting be changed languages to and For example, in less than 24 hours, one of us changed the entire semantic package to drive NROFF, a variant of TROFF, for typesetting mathematics on teletypewriter devices capable of f{x) reverse line motions. Since many potential users do not have access to a typesetter, but still have Since the EQN language knows no mathermatics, it cannot deduce that the right parenthesis is not part of the subscript. to type mathematics, this provides a way to get a The language is somewhat prolix, but this doesn’t seem excessive considering how much is even for ultimate use. typed version of the final output which is close enough for debugging purposes, and sometimes being done, and it is certainly more compact than 7. the corresponding TROFF commands. For example, here is the source for the continued fraction Conclusions to do acceptably good typesetting of mathematics expression in Section 1 of this paper: on a phototypesetter, with an input language that We think we have shown that it is possible iIs easy to learn and use and that satisfies many users’ demands. 
Such a package can be implemented in short order, given a compiler-compiler and a decent typesetting program underneath.

Defining a language, and building a compiler for it with a compiler-compiler seems like the only sensible way to do business. Our experience with the use of a grammar and a compiler-compiler has been uniformly favorable. If we had written everything into code directly, we would have been locked into our original design. Furthermore, we would have never been sure where the exceptions and special cases were. But because we have a grammar, we can change our minds readily and still be reasonably sure that if a construction works in one place it will work everywhere.

Acknowledgements

We are deeply indebted to J. F. Ossanna, the author of TROFF, for his willingness to modify TROFF to make our task easier and for his continuous assistance during the development of our program. We are also grateful to A. V. Aho for help with language theory, to S. C. Johnson for aid with the compiler-compiler, and to our early users A. V. Aho, S. I. Feldman, S. C. Johnson, R. W. Hamming, and M. D. McIlroy for their constructive criticisms.

References

[1] A Manual of Style, 12th Edition. University of Chicago Press, 1969. p 295.
[2] Model C/A/T Phototypesetter. Graphic Systems, Inc., Hudson, N. H.
[3] Ritchie, D. M., and Thompson, K. L., "The UNIX time-sharing system." Comm. ACM 17, 7 (July 1974), 365-375.
[4] Ossanna, J. F., TROFF User's Manual. Bell Laboratories Computing Science Technical Report 54, 1977.
[5] Aho, A. V., and Johnson, S. C., "LR Parsing." Comp. Surv. 6, 2 (June 1974), 99-124.
[6] B. W. Kernighan and D. M. Ritchie, The C Programming Language. Prentice-Hall, Inc., 1978.


Typesetting Mathematics — User's Guide
(Second Edition)

Brian W. Kernighan and Lorinda L. Cherry
Bell Laboratories
Murray Hill, New Jersey 07974

1. Introduction

EQN is a program for typesetting mathematics on the Graphics Systems phototypesetters on UNIX and GCOS. The EQN language was designed to be easy to use by people who know neither mathematics nor typesetting. Thus EQN knows relatively little about mathematics. In particular, mathematical symbols like +, -, ×, parentheses, and so on have no special meanings. EQN is quite happy to set garbage (but it will look good).

EQN works as a preprocessor for the typesetter formatter, TROFF[1], so the normal mode of operation is to prepare a document with both mathematics and ordinary text interspersed, and let EQN set the mathematics while TROFF does the body of the text.

On UNIX, EQN will also produce mathematics on DASI and GSI terminals and on Model 37 teletypes. The input is identical, but you have to use the programs NEQN and NROFF instead of EQN and TROFF. Of course, some things won't look as good because terminals don't provide the variety of characters, sizes and fonts that a typesetter does, but the output is usually adequate for proofreading.

To use EQN on UNIX,

    eqn files | troff

GCOS use is discussed in section 26.

2. Displayed Equations

To tell EQN where a mathematical expression begins and ends, we mark it with lines beginning .EQ and .EN. Thus if you type the lines

    .EQ
    x=y+z
    .EN

your output will look like

    $x = y + z$

The .EQ and .EN are copied through untouched; they are not otherwise processed by EQN. This means that you have to take care of things like centering, numbering, and so on yourself. The most common way is to use the TROFF and NROFF macro package '-ms' developed by M. E. Lesk[3], which allows you to center, indent, left-justify and number equations.

With the '-ms' package, equations are centered by default. To left-justify an equation, use .EQ L instead of .EQ. To indent it, use .EQ I. Any of these can be followed by an arbitrary "equation number" which will be placed at the right margin. For example, the input

    .EQ I (3.1a)
    x = f(y/2) + y/2
    .EN

produces the output

    $x = f(y/2) + y/2$        (3.1a)

There is also a shorthand notation so in-line expressions like $\pi_i^2$ can be entered without .EQ and .EN. We will talk about it in section 19.

3. Input spaces

Spaces and newlines within an expression are thrown away by EQN. (Normal text is left absolutely alone.) Thus between .EQ and .EN,

    x=y+z

and

    x = y + z

and

    x = y
       + z

and so on all produce the same output

    $x = y + z$

You should use spaces and newlines freely to make your input equations readable and easy to edit. In particular, very long lines are a bad idea, since they are often hard to fix if you make a mistake.

4. Output spaces

To force extra spaces into the output, use a tilde "~" for each space you want:

    x~=~y~+~z

gives

    $x\ =\ y\ +\ z$

You can also use a circumflex "^", which gives a space half the width of a tilde. It is mainly useful for fine-tuning. Tabs may also be used to position pieces of an expression, but the tab stops must be set by TROFF commands.

5. Symbols, Special Names, Greek

EQN knows some mathematical symbols, some mathematical names, and the Greek alphabet. For example,

    x=2 pi int sin ( omega t)dt

produces

    $x = 2 \pi \int \sin(\omega t)\,dt$

Here the spaces in the input are necessary to tell EQN that int, pi, sin and omega are separate entities that should get special treatment. The sin, digit 2, and parentheses are set in roman type instead of italic; pi and omega are made Greek; and int becomes the integral sign.

When in doubt, leave spaces around separate parts of the input. A very common error is to type f(pi) without leaving spaces on both sides of the pi. As a result, EQN does not recognize pi as a special word, and it appears as $f(pi)$ instead of $f(\pi)$.

A complete list of EQN names appears in section 23. Knowledgeable users can also use TROFF four-character names for anything EQN doesn't know about, like \(bs for the Bell System sign.

6. Spaces, Again

The only way EQN can deduce that some sequence of letters might be special is if that sequence is separated from the letters on either side of it. This can be done by surrounding a special word by ordinary spaces (or tabs or newlines), as we did in the previous section.

You can also make special words stand out by surrounding them with tildes or circumflexes:

    x~=~2~pi~int~sin~(~omega~t~)~dt

is much the same as the last example, except that the tildes not only separate the magic words like sin, omega, and so on, but also add extra spaces, one space per tilde:

    $x\ =\ 2\ \pi\ \int\ \sin\ (\ \omega\ t\ )\ dt$

Special words can also be separated by braces { } and double quotes "...", which have special meanings that we will see soon.

7. Subscripts and Superscripts

Subscripts and superscripts are obtained with the words sub and sup.

    x sup 2 + y sub k

gives

    $x^2 + y_k$

EQN takes care of all the size changes and vertical motions needed to make the output look right. The words sub and sup must be surrounded by spaces; x sub2 will give you $xsub2$ instead of $x_2$. Furthermore, don't forget to leave a space (or a tilde, etc.) to mark the end of a subscript or superscript. A common error is to say something like

    y = (x sup 2)+1

which causes

    $y = (x^{2)+1}$

instead of the intended

    $y = (x^2)+1$

Subscripted subscripts and superscripted superscripts also work:

    x sub i sub 1

is

    $x_{i_1}$

A subscript and superscript on the same thing are printed one above the other if the subscript comes first:

    x sub i sup 2

is

    $x_i^2$

Other than this special case, sub and sup group to the right, so x sup y sub z means $x^{y_z}$, not ${x^y}_z$.

8. Braces for Grouping

Normally, the end of a subscript or superscript is marked simply by a blank (or tab or tilde, etc.) What if the subscript or superscript is something that has to be typed with blanks in it? In that case, you can use the braces { and } to mark the beginning and end of the subscript or superscript:

    e sup {i omega t}

is

    $e^{i \omega t}$

Rule: Braces can always be used to force EQN to treat something as a unit, or just to make your intent perfectly clear. Thus:

    x sub {i sub 1} sup 2

is

    $x_{i_1}^2$

with braces, but

    x sub i sub 1 sup 2

is

    $x_{i_1^2}$

which is rather different.

Braces can occur within braces if necessary:

    e sup {i pi sup {rho +1}}

is

    $e^{i \pi^{\rho+1}}$

The general rule is that anywhere you could use some single thing like x, you can use an arbitrarily complicated thing if you enclose it in braces. EQN will look after all the details of positioning it and making it the right size.

In all cases, make sure you have the right number of braces. Leaving one out or adding an extra will cause EQN to complain bitterly.

Occasionally you will have to print braces. To do this, enclose them in double quotes, like "{". Quoting is discussed in more detail in section 14.

9. Fractions

To make a fraction, use the word over:

    a+b over 2c =1

gives

    $\frac{a+b}{2c} = 1$

The line is made the right length and positioned automatically. Braces can be used to make clear what goes over what:

    {alpha + beta} over {sin (x)}

is

    $\frac{\alpha + \beta}{\sin(x)}$

What happens when there is both an over and a sup in the same expression? In such an apparently ambiguous case, EQN does the sup before the over, so

    -b sup 2 over pi

is $\frac{-b^2}{\pi}$ instead of $-b^{\frac{2}{\pi}}$. The rules which decide which operation is done first in cases like this are summarized in section 23. When in doubt, however, use braces to make clear what goes with what.

10. Square Roots

To draw a square root, use sqrt:

    sqrt a+b + 1 over sqrt {ax sup 2 +bx+c}

is

    $\sqrt{a+b} + \frac{1}{\sqrt{ax^2+bx+c}}$

Warning: square roots of tall quantities look lousy, because a root-sign big enough to cover the quantity is too dark and heavy:

    sqrt {a sup 2 over b sub 2}

is

    $\sqrt{\frac{a^2}{b_2}}$

Big square roots are generally better written as something to the power ½:

    $(a^2/b_2)^{1/2}$

which is

    (a sup 2 /b sub 2 ) sup half

11. Summation, Integral, Etc.

Summations, integrals, and similar constructions are easy:

    sum from i=0 to {i= inf} x sup i

produces

    $\sum_{i=0}^{i=\infty} x^i$

Notice that we used braces to indicate where the upper part $i=\infty$ begins and ends. No braces were necessary for the lower part $i=0$ because it contained no blanks. The braces will never hurt, and if the from and to parts contain any blanks, you must use braces around them.

The from and to parts are both optional, but if both are used, they have to occur in that order.

Other useful characters can replace the sum in our example:

    int   prod   union   inter

become, respectively,

    $\int \quad \prod \quad \cup \quad \cap$

Since the thing before the from can be anything, even something in braces, from-to can often be used in unexpected ways:

    lim from {n -> inf} x sub n =0

is

    $\lim_{n \to \infty} x_n = 0$

12. Size and Font Changes

By default, equations are set in 10-point type (the same size as this guide), with standard mathematical conventions to determine what characters are in roman and what in italic. Although EQN makes a valiant attempt to use esthetically pleasing sizes and fonts, it is not perfect. To change sizes and fonts, use size n and roman, italic, bold and fat. Like sub and sup, size and font changes affect only the thing that follows them, and revert to the normal situation at the end of it. Thus

    bold x y

is

    $\mathbf{x} y$

and

    size 14 bold x = y + size 14 {alpha + beta}

gives

    $\mathbf{x} = y + \alpha + \beta$    (with the bold x and the $\alpha + \beta$ in 14-point type)

As always, you can use braces if you want to affect something more complicated than a single letter. For example, you can change the size of an entire equation by

    size 12 { ... }

Legal sizes which may follow size are 6, 7, 8, 9, 10, 11, 12, 14, 16, 18, 20, 22, 24, 28, 36. You can also change the size by a given amount; for example, you can say size +2 to make the size two points bigger, or size -3 to make it three points smaller. This has the advantage that you don't have to know what the current size is.

If you are using fonts other than roman, italic and bold, you can say font X where X is a one character TROFF name or number for the font. Since EQN is tuned for roman, italic and bold, other fonts may not give quite as good an appearance.

The fat operation takes the current font and widens it by overstriking: fat grad is $\nabla$ and fat {x sub i} is $x_i$, each printed with heavier strokes.

If an entire document is to be in a non-standard size or font, it is a severe nuisance to have to write out a size and font change for each equation. Accordingly, you can set a "global" size or font which thereafter affects all equations. At the beginning of any equation, you might say, for instance,

    .EQ
    gsize 16
    gfont R
    .EN

to set the size to 16 and the font to roman thereafter. In place of R, you can use any of the TROFF font names. The size after gsize can be a relative change with + or -.

Generally, gsize and gfont will appear at the beginning of a document but they can also appear throughout a document: the global font and size can be changed as often as needed. For example, in a footnote† you will typically want the size of equations to match the size of the footnote text, which is two points smaller than the main text. Don't forget to reset the global size at the end of the footnote.

† Like this one, in which we have a few random expressions like $x_i$ and $\pi^2$. The sizes for these were set by the command gsize -2.

13. Diacritical Marks

To get funny marks on top of letters, there are several words:

    x dot       $\dot{x}$
    x dotdot    $\ddot{x}$
    x hat       $\hat{x}$
    x tilde     $\tilde{x}$
    x vec       $\vec{x}$
    x dyad      $\overleftrightarrow{x}$
    x bar       $\bar{x}$
    x under     $\underline{x}$

The diacritical mark is placed at the right height. The bar and under are made the right length for the entire construct, as in $\overline{x+y+z}$; other marks are centered.

14. Quoted Text

Any input entirely within quotes ("...") is not subject to any of the font changes and spacing adjustments normally done by the equation setter. This provides a way to do your own spacing and adjusting if needed:

    italic "sin(x)" + sin (x)

is

    $\mathit{sin(x)} + \sin(x)$

Quotes are also used to get braces and other EQN keywords printed:

    "{ size alpha }"

is

    { size alpha }

and

    roman "{ size alpha }"

is the same text set in roman.

The construction "" is often used as a place-holder when grammatically EQN needs something, but you don't actually want anything in your output. For example, to make ${}^2\mathrm{He}$, you can't just type sup 2 roman He because a sup has to be a superscript on something. Thus you must say

    "" sup 2 roman He

To get a literal quote use "\"". TROFF characters like \(bs can appear unquoted, but more complicated things like horizontal and vertical motions with \h and \v should always be quoted. (If you've never heard of \h and \v, ignore this section.)

15. Lining Up Equations

Sometimes it's necessary to line up a series of equations at some horizontal position, often at an equals sign. This is done with two operations called mark and lineup.

The word mark may appear once at any place in an equation. It remembers the horizontal position where it appeared. Successive equations can contain one occurrence of the word lineup. The place where lineup appears is made to line up with the place marked by the previous mark if at all possible. Thus, for example, you can say

    .EQ I
    x+y mark = z
    .EN
    .EQ I
    x lineup = 1
    .EN

to produce

    $\begin{aligned} x+y &= z \\ x &= 1 \end{aligned}$

For reasons too complicated to talk about, when you use EQN and '-ms', use either .EQ I or .EQ L. mark and lineup don't work with centered equations. Also bear in mind that mark doesn't look ahead;

    x mark =1
    ...
    x+y lineup =z

isn't going to work, because there isn't room for the x+y part after the mark remembers where the x is.

16. Big Brackets, Etc.

To get big brackets [ ], braces { }, parentheses ( ), and bars | | around things, use the left and right commands:

    left { a over b + 1 right }
     ~=~ left ( c over d right )
     + left [ e right ]

is

    $\left\{ \frac{a}{b} + 1 \right\} = \left( \frac{c}{d} \right) + \left[ e \right]$

The resulting brackets are made big enough to cover whatever they enclose. Other characters can be used besides these, but they are not likely to look very good. One exception is the floor and ceiling characters:

    left floor x over y right floor
    <= left ceiling a over b right ceiling

produces

    $\left\lfloor \frac{x}{y} \right\rfloor \le \left\lceil \frac{a}{b} \right\rceil$

Several warnings about brackets are in order. First, braces are typically bigger than brackets and parentheses, because they are made up of three, five, seven, etc., pieces, while brackets can be made up of two, three, etc. Second, big left and right parentheses often look poor, because the character set is poorly designed.

The right part may be omitted: a "left something" need not have a corresponding "right something". If the right part is omitted, put braces around the thing you want the left bracket to encompass. Otherwise, the resulting brackets may be too large.

If you want to omit the left part, things are more complicated, because technically you can't have a right without a corresponding left. Instead you have to say

    left "" ... right )

for example. The left "" means a "left nothing". This satisfies the rules without hurting your output.

17. Piles

There is a general facility for making vertical piles of things; it comes in several flavors. For example:

    A ~=~ left [
      pile { a above b above c }
      ~~ pile { x above y above z }
    right ]

will make

    $A = \left[ \begin{matrix} a \\ b \\ c \end{matrix} \;\; \begin{matrix} x \\ y \\ z \end{matrix} \right]$

The elements of the pile (there can be as many as you want) are centered one above another, at the right height for most purposes. The keyword above is used to separate the pieces; braces are used around the entire list. The elements of a pile can be as complicated as needed, even containing more piles.

Three other forms of pile exist: lpile makes a pile with the elements left-justified; rpile makes a right-justified pile; and cpile makes a centered pile, just like pile. The vertical spacing between the pieces is somewhat larger for l-, r- and cpiles than it is for ordinary piles.

    roman sign (x)~=~
    left {
      lpile {1 above 0 above -1}
      ~~ lpile
      {if~x>0 above if~x=0 above if~x<0}

makes

    $\mathrm{sign}(x) = \left\{ \begin{array}{ll} 1 & \text{if } x>0 \\ 0 & \text{if } x=0 \\ -1 & \text{if } x<0 \end{array} \right.$

Notice the left brace without a matching right one.

18. Matrices

It is also possible to make matrices. For example, to make a neat array like

    $\begin{matrix} x_i & x^2 \\ y_i & y^2 \end{matrix}$

you have to type

    matrix {
      ccol { x sub i above y sub i }
      ccol { x sup 2 above y sup 2 }
    }

This produces a matrix with two centered columns. The elements of the columns are then listed just as for a pile, each element separated by the word above. You can also use lcol or rcol to left or right adjust columns. Each column can be separately adjusted, and there can be as many columns as you like.

The reason for using a matrix instead of two adjacent piles, by the way, is that if the elements of the piles don't all have the same height, they won't line up properly. A matrix forces them to line up, because it looks at the entire structure before deciding what spacing to use.

A word of warning about matrices: each column must have the same number of elements in it. The world will end if you get this wrong.

19. Shorthand for In-line Equations

In a mathematical document, it is necessary to follow mathematical conventions not just in display equations, but also in the body of the text, for example by making variable names like x italic. Although this could be done by surrounding the appropriate parts with .EQ and .EN, the continual repetition of .EQ and .EN is a nuisance. Furthermore, with '-ms', .EQ and .EN imply a displayed equation.

EQN provides a shorthand for short in-line expressions. You can define two characters to mark the left and right ends of an in-line equation, and then type expressions right in the middle of text lines. To set both the left and right characters to dollar signs, for example, add to the beginning of your document the three lines

    .EQ
    delim $$
    .EN

Having done this, you can then say things like

    Let $alpha sub i$ be the primary variable, and let $beta$ be zero. Then we can show that $x sub 1$ is $>=0$.

This works as you might expect: spaces, newlines, and so on are significant in the text, but not in the equation part itself. Multiple equations can occur in a single input line.

Enough room is left before and after a line that contains in-line expressions that something like $\sum_{i=1}^{n} x_i$ does not interfere with the lines surrounding it.

To turn off the delimiters,

    .EQ
    delim off
    .EN

Warning: don't use braces, tildes, circumflexes, or double quotes as delimiters; chaos will result.

20. Definitions

EQN provides a facility so you can give a frequently-used string of characters a name, and thereafter just type the name instead of the whole string. For example, if the sequence

    x sub i sub 1 + y sub i sub 1

appears repeatedly throughout a paper, you can save re-typing it each time by defining it like this:

    define xy 'x sub i sub 1 + y sub i sub 1'

This makes xy a shorthand for whatever characters occur between the single quotes in the definition. You can use any character instead of quote to mark the ends of the definition, so long as it doesn't appear inside the definition.

Now you can use xy like this:

    .EQ
    f(x) = xy ...
    .EN

and so on. Each occurrence of xy will expand into what it was defined as. Be careful to leave spaces or their equivalent around the name when you actually use it, so EQN will be able to identify it as special.

There are several things to watch out for. First, although definitions can use previous definitions, as in

    .EQ
    define xi ' x sub i '
    define xi1 ' xi sub 1 '
    .EN

don't define something in terms of itself. A favorite error is to say

    define X ' roman X '

This is a guaranteed disaster, since X is now defined in terms of itself. If you say

    define X ' roman "X" '

however, the quotes protect the second X, and everything works fine.

EQN keywords can be redefined. You can make / mean over by saying

    define / ' over '

or redefine over as / with

    define over ' / '

If you need different things to print on a terminal and on the typesetter, it is sometimes worth defining a symbol differently in NEQN and EQN. This can be done with ndefine and tdefine. A definition made with ndefine only takes effect if you are running NEQN; if you use tdefine, the definition only applies for EQN. Names defined with plain define apply to both EQN and NEQN.

21. Local Motions

Although EQN tries to get most things at the right place on the paper, it isn't perfect, and occasionally you will need to tune the output to make it just right. Small extra horizontal spaces can be obtained with tilde and circumflex. You can also say back n and fwd n to move small amounts horizontally. n is how far to move in 1/100's of an em (an em is about the width of the letter "m".) Thus back 50 moves back about half the width of an m. Similarly you can move things up or down with up n and down n.

As with sub or sup, the local motions affect the next thing in the input, and this can be something arbitrarily complicated if it is enclosed in braces.

22. A Large Example

Here is the complete source for the three display equations in the abstract of this guide.

    .EQ I
    G(z)~mark =~ e sup { ln ~ G(z) }
    ~=~ exp left (
    sum from k>=1 {S sub k z sup k} over k right )
    ~=~ prod from k>=1 e sup {S sub k z sup k /k}
    .EN
    .EQ I
    lineup = left ( 1 + S sub 1 z +
    { S sub 1 sup 2 z sup 2 } over 2! + ... right )
    left ( 1+ { S sub 2 z sup 2 } over 2
    + { S sub 2 sup 2 z sup 4 } over { 2 sup 2 cdot 2! }
    + ... right ) ...
    .EN
    .EQ I
    lineup = sum from m>=0 left (
    sum from
    pile { k sub 1 k sub 2 ... k sub m >=0
    above
    k sub 1 +2k sub 2 + ... +mk sub m =m}
    { S sub 1 sup {k sub 1} } over {1 sup k sub 1 k sub 1 ! } ~
    { S sub 2 sup {k sub 2} } over {2 sup k sub 2 k sub 2 ! } ~
    { S sub m sup {k sub m} } over {m sup k sub m k sub m ! }
    right ) z sup m
    .EN

23. Keywords, Precedences, Etc.

If you don't use braces, EQN will do operations in the order shown in this list.

    dyad vec under bar tilde hat dot dotdot
    fwd back down up
    fat roman italic bold size
    sub sup sqrt over
    from to

These operations group to the left:

    over sqrt left right

All others group to the right.
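The ordering and grouping rules above can be tried out on a small scale. The following is a minimal sketch, assuming a three-operator subset (sub, sup, over) and binding strengths chosen only for illustration; the names `OPS`, `tokenize`, `parse`, `parse_primary` and `parse_expr` are hypothetical and are not part of EQN, which builds its parser with a compiler-compiler rather than by hand.

```python
# Binding strength (higher binds tighter) and grouping direction for a
# three-operator subset of the list above. The numbers are illustrative.
OPS = {
    "sub": (2, "right"),
    "sup": (2, "right"),
    "over": (1, "left"),
}


def tokenize(src):
    # Braces are tokens of their own; everything else splits on blanks.
    return src.replace("{", " { ").replace("}", " } ").split()


def parse(src):
    """Return a nested tuple (operator, left, right) for the input."""
    tree, rest = parse_expr(tokenize(src), 0)
    if rest:
        raise SyntaxError("unexpected token: " + rest[0])
    return tree


def parse_primary(tokens):
    if not tokens:
        raise SyntaxError("operand expected")
    head, rest = tokens[0], tokens[1:]
    if head == "{":
        tree, rest = parse_expr(rest, 0)
        if not rest or rest[0] != "}":
            raise SyntaxError("missing }")
        return tree, rest[1:]
    if head == "}" or head in OPS:
        raise SyntaxError("operand expected before " + head)
    return head, rest


def parse_expr(tokens, min_prec):
    # Precedence climbing: keep absorbing operators that bind at least
    # as tightly as min_prec.
    left, tokens = parse_primary(tokens)
    while tokens and tokens[0] in OPS and OPS[tokens[0]][0] >= min_prec:
        op = tokens[0]
        prec, assoc = OPS[op]
        # A left-grouping operator refuses an equal-strength operator on
        # its right-hand side; a right-grouping one accepts it.
        next_min = prec + 1 if assoc == "left" else prec
        right, tokens = parse_expr(tokens[1:], next_min)
        left = (op, left, right)
    return left, tokens


print(parse("a over b over c"))   # over groups to the left
print(parse("x sup y sub z"))     # sub and sup group to the right
print(parse("a sup 2 over b"))    # sup is done before over
```

The sketch does not model the special case from section 7, where a sub followed by a sup on the same thing is printed one above the other, and it ignores juxtaposition and every other keyword.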
Digits, parentheses, brackets, punctuation marks, and these mathematical words are converted to Roman font when encountered:

    sin cos tan sinh cosh tanh arc
    max min lim log ln exp
    Re Im and if for det

These character sequences are recognized and translated as shown.

    >=        ≥
    <=        ≤
    ==        ≡
    !=        ≠
    +-        ±
    ->        →
    <-        ←
    <<        ≪
    >>        ≫
    inf       ∞
    partial   ∂
    half      ½
    prime     ′
    approx    ≈
    nothing   (no output)
    cdot      ·
    times     ×
    del       ∇
    grad      ∇
    ...       ⋯
    ,...,     , ..., 
    sum       Σ
    int       ∫
    prod      Π
    union     ∪
    inter     ∩

To obtain Greek letters, simply spell them out in whatever case you want:

    DELTA    Δ        iota     ι
    GAMMA    Γ        kappa    κ
    LAMBDA   Λ        lambda   λ
    OMEGA    Ω        mu       μ
    PHI      Φ        nu       ν
    PI       Π        omega    ω
    PSI      Ψ        omicron  ο
    SIGMA    Σ        phi      φ
    THETA    Θ        pi       π
    UPSILON  Υ        psi      ψ
    XI       Ξ        rho      ρ
    alpha    α        sigma    σ
    beta     β        tau      τ
    chi      χ        theta    θ
    delta    δ        upsilon  υ
    epsilon  ε        xi       ξ
    eta      η        zeta     ζ
    gamma    γ

These are all the words known to EQN (except for characters with names), together with the section where they are discussed.

    above    17, 18      lpile    17
    back     21          mark     15
    bar      13          matrix   18
    bold     12          ndefine  20
    ccol     18          over     9
    col      18          pile     17
    cpile    17          rcol     18
    define   20          right    16
    delim    19          roman    12
    dot      13          rpile    17
    dotdot   13          size     12
    down     21          sqrt     10
    dyad     13          sub      7
    fat      12          sup      7
    font     12          tdefine  20
    from     11          tilde    13
    fwd      21          to       11
    gfont    12          under    13
    gsize    12          up       21
    hat      13          vec      13
    italic   12          ~, ^     4, 6
    lcol     18          { }      8
    left     16          "..."    8, 14
    lineup   15

24. Troubleshooting

If you make a mistake in an equation, like leaving out a brace (very common) or having one too many (very common) or having a sup with nothing before it (common), EQN will tell you with the message

    syntax error between lines x and y, file z

where x and y are approximately the lines between which the trouble occurred, and z is the name of the file in question. The line numbers are approximate; look nearby as well. There are also self-explanatory messages that arise if you leave out a quote or try to run EQN on a non-existent file.

If you want to check a document before actually printing it (on UNIX only),

    eqn files >/dev/null

will throw away the output but print the messages.

If you use something like dollar signs as delimiters, it is easy to leave one out. This causes very strange troubles. The program checkeq (on GCOS, use ./checkeq instead) checks for misplaced or missing dollar signs and similar troubles.

In-line equations can only be so big because of an internal buffer in TROFF. If you get a message “word overflow”, you have exceeded this limit. If you print the equation as a displayed equation this message will usually go away. The message “line overflow” indicates you have exceeded an even bigger buffer. The only cure for this is to break the equation into two separate ones.

On a related topic, EQN does not break equations by itself; you must split long equations up across multiple lines by yourself, marking each by a separate .EQ ... .EN sequence. EQN does warn about equations that are too long to fit on one line.

25. Use on UNIX

To print a document that contains mathematics on the UNIX typesetter,

    eqn files | troff

If there are any TROFF options, they go after the TROFF part of the command. For example,

    eqn files | troff -ms

To run the same document on the GCOS typesetter, use

    eqn files | troff -g (other options) | gcat

A compatible version of EQN can be used on devices like teletypes and DASI and GSI terminals which have half-line forward and reverse capabilities. To print equations on a Model 37 teletype, for example, use

    neqn files | nroff

The language for equations recognized by NEQN is identical to that of EQN, although of course the output is more restricted.

To use a GSI or DASI terminal as the output device,

    neqn files | nroff -Tx

where x is the terminal type you are using, such as 300 or 300S.

EQN and NEQN can be used with the TBL program [2] for setting tables that contain mathematics. Use TBL before (N)EQN, like this:

    tbl files | eqn | troff
    tbl files | neqn | nroff

26. Acknowledgments

We are deeply indebted to J. F. Ossanna, the author of TROFF, for his willingness to extend TROFF to make our task easier, and for his continuous assistance during the development and evolution of EQN. We are also grateful to A. V. Aho for advice on language design, to S. C. Johnson for assistance with the YACC compiler-compiler, and to all the EQN users who have made helpful suggestions and criticisms.

References

[1] J. F. Ossanna, “NROFF/TROFF User's Manual”, Bell Laboratories Computing Science Technical Report #54, 1976.
[2] M. E. Lesk, “Typing Documents on UNIX”, Bell Laboratories, 1976.
[3] M. E. Lesk, “TBL — A Program for Setting Tables”, Bell Laboratories Computing Science Technical Report #49, 1976.

Tbl — A Program to Format Tables

M. E. Lesk
Bell Laboratories
Murray Hill, New Jersey 07974

Introduction.

Tbl turns a simple description of a table into a troff or nroff [1] program (list of commands) that prints the table. Tbl may be used on the PDP-11 UNIX [2] system and on the Honeywell 6000 GCOS system. It attempts to isolate a portion of a job that it can successfully handle and leave the remainder for other programs. Thus tbl may be used with the equation formatting program eqn [3] or various layout macro packages [4,5,6], but does not duplicate their functions.

This memorandum is divided into two parts. First we give the rules for preparing tbl input; then some examples are shown. The description of rules is precise but technical, and the beginning user may prefer to read the examples first, as they show some common table arrangements. A section explaining how to invoke tbl precedes the examples. To avoid repetition, henceforth read troff as “troff or nroff.”

The input to tbl is text for a document, with tables preceded by a “.TS” (table start) command and followed by a “.TE” (table end) command. Tbl processes the tables, generating troff formatting commands, and leaves the remainder of the text unchanged. The “.TS” and “.TE” lines are copied, too, so that troff page layout macros (such as the memo formatting macros [4]) can use these lines to delimit and place tables as they see fit. In particular, any arguments on the “.TS” or “.TE” lines are copied but otherwise ignored, and may be used by document layout macro commands.

The format of the input is as follows:

    text
    .TS
    table
    .TE
    text
    .TS
    table
    .TE
    text

where the format of each table is as follows:

    .TS
    options ;
    format .
    data
    .TE

Each table is independent, and must contain formatting information followed by the data to be entered in the table. The formatting information, which describes the individual columns and rows of the table, may be preceded by a few options that affect the entire table. A detailed description of tables is given in the next section.

Input commands.

As indicated above, a table contains, first, global options, then a format section describing the layout of the table entries, and then the data to be printed. The format and data are always required, but not the options. The various parts of the table are entered as follows:

1) OPTIONS. There may be a single line of options affecting the whole table. If present, this line must follow the .TS line immediately and must contain a list of option names separated by spaces, tabs, or commas, and must be terminated by a semicolon. The allowable options are:

    center — center the table (default is left-adjust);
    expand — make the table as wide as the current line length;
    box — enclose the table in a box;
    allbox — enclose each item in the table in a box;
    doublebox — enclose the table in two boxes;
    tab (x) — use x instead of tab to separate data items;
    linesize (n) — set lines or rules (e.g. from box) in n point type;
    delim (xy) — recognize x and y as the eqn delimiters.

The tbl program tries to keep boxed tables on one page by issuing appropriate “need” (.ne) commands. These requests are calculated from the number of lines in the tables, and if there are spacing commands embedded in the input, these requests may be inaccurate; use normal troff procedures, such as keep-release macros, in that case. The user who must have a multi-page boxed table should use macros designed for this purpose, as explained below under ‘Usage.’

2) FORMAT. The format section of the table specifies the layout of the columns. Each line in this section corresponds to one line of the table (except that the last line corresponds to all following lines up to the next .T&, if any; see below), and each line contains a key-letter for each column of the table. It is good practice to separate the key-letters for each column by spaces or tabs. Each key-letter is one of the following:

    L or l   to indicate a left-adjusted column entry;
    R or r   to indicate a right-adjusted column entry;
    C or c   to indicate a centered column entry;
    N or n   to indicate a numerical column entry, to be aligned with other numerical entries so that the units digits of numbers line up;
    A or a   to indicate an alphabetic subcolumn; all corresponding entries are aligned on the left, and positioned so that the widest is centered within the column (see example on page 12);
    S or s   to indicate a spanned heading, i.e. to indicate that the entry from the previous column continues across this column (not allowed for the first column, obviously); or
    ^        to indicate a vertically spanned heading, i.e. to indicate that the entry from the previous row continues down through this row. (Not allowed for the first row of the table, obviously.)

When numerical alignment is specified, a location for the decimal point is sought. The rightmost dot (.)
adjacent to a digit is used as a decimal point; if there is no dot adjoining a digit, the rightmost digit is used as a units digit; if no alignment is indicated, the item is centered in the column. However, the special non-printing character string \& may be used to override unconditionally dots and digits, or to align alphabetic data; this string lines up where a dot normally would, and then disappears from the final output. In the example below, the items shown at the left will be aligned (in a numerical column) as shown on the right:

    13              13
    4.2              4.2
    26.4.12       26.4.12
    abc             abc
    abc\&          abc
    43\&3.22        433.22
    749.12         749.12

Note: If numerical data are used in the same column with wider L or r type table entries, the widest number is centered relative to the wider L or r items (L is used instead of l for readability; they have the same meaning as key-letters). Alignment within the numerical items is preserved. This is similar to the behavior of a type data, as explained above. However, alphabetic subcolumns (requested by the a key-letter) are always slightly indented relative to L items; if necessary, the column width is increased to force this. This is not true for n type entries.

Warning: the n and a items should not be used in the same column.

For readability, the key-letters describing each column should be separated by spaces. The end of the format section is indicated by a period. The layout of the key-letters in the format section resembles the layout of the actual data in the table. Thus a simple format might appear as:

    c s s
    l n n .

which specifies a table of three columns. The first line of the table contains a heading centered across all three columns; each remaining line contains a left-adjusted item in the first column followed by two columns of numerical data.
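The rule for finding the decimal point in a numerical column, given earlier in this section, can be sketched in a few lines. The Python below is a rough illustration under stated assumptions, not tbl's actual code: the names `split_at_alignment` and `align_column` are hypothetical, widths are counted in characters (tbl itself works in troff units with the real font widths), and an entry with nothing to align on is centered in the numeric field rather than in the full column.

```python
import re


def split_at_alignment(item):
    r"""Split an entry into (left, right) around its alignment point,
    or return None if there is nothing to align on.
    Hypothetical helper; illustrates the rule, not tbl's implementation."""
    if r"\&" in item:
        # \& overrides dots and digits unconditionally and then disappears.
        left, right = item.split(r"\&", 1)
        return left, right
    # Rightmost dot that has a digit on either side.
    dots = [
        m.start()
        for m in re.finditer(r"\.", item)
        if (m.start() > 0 and item[m.start() - 1].isdigit())
        or (m.start() + 1 < len(item) and item[m.start() + 1].isdigit())
    ]
    if dots:
        return item[: dots[-1]], item[dots[-1]:]
    # No such dot: the rightmost digit is taken as the units digit.
    digits = [i for i, ch in enumerate(item) if ch.isdigit()]
    if digits:
        return item[: digits[-1] + 1], item[digits[-1] + 1:]
    return None


def align_column(items):
    """Return the entries padded so that their alignment points line up."""
    parts = [split_at_alignment(s) for s in items]
    left_w = max((len(p[0]) for p in parts if p), default=0)
    right_w = max((len(p[1]) for p in parts if p), default=0)
    out = []
    for s, p in zip(items, parts):
        if p is None:
            out.append(s.center(left_w + right_w))  # nothing to align: center
        else:
            out.append(p[0].rjust(left_w) + p[1].ljust(right_w))
    return out


if __name__ == "__main__":
    sample = ["13", "4.2", "26.4.12", "abc", r"abc\&", r"43\&3.22", "749.12"]
    for line in align_column(sample):
        print(line.rstrip())
```

Run on the seven items of the example above, the sketch lines up 13, 4.2, 26.4.12 and 749.12 on the decimal point, places abc\& so that it ends at the alignment point, and splits 43\&3.22 between the 43 and the 3.22.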
A sample table in this format (c s s / l n n) might be:

              Overall title
    Item-a          34.22     9.1
    Item-b          12.65      .02
    Items: c,d,e    23        5.8
    Total           69.87    14.92

There are some additional features of the key-letter system:

Horizontal lines — A key-letter may be replaced by ‘_’ (underscore) to indicate a horizontal line in place of the corresponding column entry, or by ‘=’ to indicate a double horizontal line. If an adjacent column contains a horizontal line, or if there are vertical lines adjoining this column, this horizontal line is extended to meet the nearby lines. If any data entry is provided for this column, it is ignored and a warning message is printed.

Vertical lines — A vertical bar may be placed between column key-letters. This will cause a vertical line between the corresponding columns of the table. A vertical bar to the left of the first key-letter or to the right of the last one produces a line at the edge of the table. If two vertical bars appear between key-letters, a double vertical line is drawn.

Space between columns — A number may follow the key-letter. This indicates the amount of separation between this column and the next column. The number normally specifies the separation in ens (one en is about the width of the letter ‘n’).* If the “expand” option is used, then these numbers are multiplied by a constant such that the table is as wide as the current line length. The default column separation number is 3. If the separation is changed the worst case (largest space requested) governs.

* More precisely, an en is a number of points (1 point = 1/72 inch) equal to half the current type size.

Vertical spanning — Normally, vertically spanned items extending over several rows of the table are centered in their vertical range. If a key-letter is followed by t or T, any corresponding vertically spanned item will begin at the top line of its range.

Font changes — A key-letter may be followed by a string containing a font name or number preceded by the letter f or F.
This indicates that the corresponding column should be in a different font from the default font (usually Roman). All font names are one or two letters; a one-letter font name should be separated from whatever follows by a space or tab. The single letters B, b, I, and i are shorter synonyms for fB and fI. Font change commands given with the table entries override these specifications.

Point size changes — A key-letter may be followed by the letter p or P and a number to indicate the point size of the corresponding table entries. The number may be a signed digit, in which case it is taken as an increment or decrement from the current point size. If both a point size and a column separation value are given, one or more blanks must separate them.

Vertical spacing changes — A key-letter may be followed by the letter v or V and a number to indicate the vertical line spacing to be used within a multi-line corresponding table entry. The number may be a signed digit, in which case it is taken as an increment or decrement from the current vertical spacing. A column separation value must be separated by blanks or some other specification from a vertical spacing request. This request has no effect unless the corresponding table entry is a text block (see below).

Column width indication — A key-letter may be followed by the letter w or W and a width value in parentheses. This width is used as a minimum column width. If the largest element in the column is not as wide as the width value given after the w, the largest element is assumed to be that wide. If the largest element in the column is wider than the specified value, its width is used. The width is also used as a default line length for included text blocks. Normal troff units can be used to scale the width value; if none are used, the default is ens. If the width specification is a unitless integer the parentheses may be omitted. If the width value is changed in a column, the last one given controls.

Equal width columns — A key-letter may be followed by the letter e or E to indicate equal width columns. All columns whose key-letters are followed by e or E are made the same width. This permits the user to get a group of regularly spaced columns.

Note: The order of the above features is immaterial; they need not be separated by spaces, except as indicated above to avoid ambiguities involving point size and font changes. Thus a numerical column entry in italic font and 12 point type with a minimum width of 2.5 inches and separated by 6 ens from the next column could be specified as

    np12w(2.5i)fI 6

Alternative notation — Instead of listing the format of successive lines of a table on consecutive lines of the format section, successive line formats may be given on the same line, separated by commas, so that the format for the example above might have been written:

    c s s, l n n .

Default — Column descriptors missing from the end of a format line are assumed to be L. The longest line in the format section, however, defines the number of columns in the table; extra columns in the data are ignored silently.

3) DATA. The data for the table are typed after the format. Normally, each table line is typed as one line of data. Very long input lines can be broken: any line whose last character is \ is combined with the following line (and the \ vanishes). The data for different columns (the table entries) are separated by tabs, or by whatever character has been specified in the tab option. There are a few special cases:

Troff commands within tables — An input line beginning with a ‘.’ followed by anything but a number is assumed to be a command to troff and is passed through unchanged, retaining its position in the table. So, for example, space within a table may be produced by “.sp” commands in the data.
Full width horizontal lines — An input line containing only the character _ (underscore) or = (equal sign) is taken to be a single or double line, respectively, extending the full width of the table.

Single column horizontal lines — An input table entry containing only the character _ or = is taken to be a single or double line extending the full width of the column. Such lines are extended to meet horizontal or vertical lines adjoining this column. To obtain these characters explicitly in a column, either precede them by \& or follow them by a space before the usual tab or newline.

Short horizontal lines — An input table entry containing only the string \_ is taken to be a single line as wide as the contents of the column. It is not extended to meet adjoining lines.

Repeated characters — An input table entry containing only a string of the form \Rx where x is any character is replaced by repetitions of the character x as wide as the data in the column. The sequence of x's is not extended to meet adjoining columns.

Vertically spanned items — An input table entry containing only the character string \^ indicates that the table entry immediately above spans downward over this row. It is equivalent to a table format key-letter of ‘^’.

Text blocks — In order to include a block of text as a table entry, precede it by T{ and follow it by T}. Thus the sequence

    ... T{
    block of text
    T} ...

is the way to enter, as a single entry in the table, something that cannot conveniently be typed as a simple string between tabs. Note that the T} end delimiter must begin a line; additional columns of data may follow after a tab on the same line. See the example on page 10 for an illustration of included text blocks in a table.
If more than twenty or thirty text blocks are used in a table, various limits in the troff program are likely to be exceeded, producing diagnostics such as ‘too many string/macro names’ or ‘too many number registers.’

Text blocks are pulled out from the table, processed separately by troff, and replaced in the table as a solid block. If no line length is specified in the block of text itself, or in the table format, the default is to use L×C/(N+1) where L is the current line length, C is the number of table columns spanned by the text, and N is the total number of columns in the table. The other parameters (point size, font, etc.) used in setting the block of text are those in effect at the beginning of the table (including the effect of the “.TS” macro) and any table format specifications of size, spacing and font, using the p, v and f modifiers to the column key-letters. Commands within the text block itself are also recognized, of course. However, troff commands within the table data but not within the text block do not affect that block.

Warnings: Although any number of lines may be present in a table, only the first 200 lines are used in calculating the widths of the various columns. A multi-page table, of course, may be arranged as several single-page tables if this proves to be a problem. Other difficulties with formatting may arise because, in the calculation of column widths all table entries are assumed to be in the font and size being used when the “.TS” command was encountered, except for font and size changes indicated (a) in the table format section and (b) within the table data (as in the entry \s+3\fIdata\fP\s0). Therefore, although arbitrary troff requests may be sprinkled in a table, care must be taken to avoid confusing the width calculations; use requests such as ‘.ps’ with care.

4) ADDITIONAL COMMAND LINES.
If the format of a table must be changed after many similar lines, as with sub-headings or summarizations, the “.T&” (table continue) command can be used to change column parameters. The outline of such a table input is:

    .TS
    options ;
    format .
    data
    .T&
    format .
    data
    .T&
    format .
    data
    .TE

as in the examples on pages 10 and 12. Using this procedure, each table line can be close to its corresponding format line. Warning: it is not possible to change the number of columns, the space between columns, the global options such as box, or the selection of columns to be made equal width.

Usage.

On UNIX, tbl can be run on a simple table with the command

    tbl input-file | troff

but for more complicated use, where there are several input files, and they contain equations and ms memorandum layout commands as well as tables, the normal command would be

    tbl file-1 file-2 . . . | eqn | troff -ms

and, of course, the usual options may be used on the troff and eqn commands. The usage for nroff is similar to that for troff, but only TELETYPE® Model 37 and Diablo-mechanism (DASI or GSI) terminals can print boxed tables directly.

For the convenience of users employing line printers without adequate driving tables or post-filters, there is a special -TX command line option to tbl which produces output that does not have fractional line motions in it. The only other command line options recognized by tbl are -ms and -mm which are turned into commands to fetch the corresponding macro files; usually it is more convenient to place these arguments on the troff part of the command line, but they are accepted by tbl as well.

Note that when eqn and tbl are used together on the same file tbl should be used first. If there are no equations within tables, either order works, but it is usually faster to run tbl first, since eqn normally produces a larger expansion of the input than tbl. However,
if there are equations within tables (using the delim mechanism in eqn), tbl must be first or the output will be scrambled. Users must also beware of using equations in n-style columns; this is nearly always wrong, since tbl attempts to split numerical format items into two parts and this is not possible with equations. The user can defend against this by giving the delim(xx) table option; this prevents splitting of numerical columns within the delimiters. For example, if the eqn delimiters are $$, giving delim($$) a numerical column such as “1245 $+- 16$” will be divided after 1245, not after 16.

Tbl limits tables to twenty columns; however, use of more than 16 numerical columns may fail because of limits in troff, producing the ‘too many number registers’ message. Troff number registers used by tbl must be avoided by the user within tables; these include two-digit names from 31 to 99, and names of the forms #x, x+, x|, ^x, and x-, where x is any lower case letter. The names ##, #-, and #^ are also used in certain circumstances. To conserve number register names, the n and a formats share a register; hence the restriction above that they may not be used in the same column.

For aid in writing layout macros, tbl defines a number register TW which is the table width; it is defined by the time that the “.TE” macro is invoked and may be used in the expansion of that macro. More importantly, to assist in laying out multi-page boxed tables the macro T# is defined to produce the bottom lines and side lines of a boxed table, and then invoked at its end. By use of this macro in the page footer a multi-page table can be boxed. In particular, the ms macros can be used to print a multi-page boxed table with a repeated heading by giving the argument H to the “.TS” macro. If the table start macro is written

    .TS H

a line of the form

    .TH

must be given in the table after any table heading (or at the start if none).
Material up to the “.TH” is placed at the top of each page of table; the remaining lines in the table are placed on several pages as required. Note that this is not a feature of tbl, but of the ms layout macros.

Examples.

Here are some examples illustrating features of tbl. The symbol @ in the input represents a tab character.

Input:

    .TS
    box;
    c c c
    l l l.
    Language@Authors@Runs on
    Fortran@Many@Almost anything
    PL/1@IBM@360/370
    C@BTL@11/45,H6000,370
    BLISS@Carnegie-Mellon@PDP-10,11
    IDS@Honeywell@H6000
    Pascal@Stanford@370
    .TE

Output:

    Language   Authors           Runs on
    Fortran    Many              Almost anything
    PL/1       IBM               360/370
    C          BTL               11/45,H6000,370
    BLISS      Carnegie-Mellon   PDP-10,11
    IDS        Honeywell         H6000
    Pascal     Stanford          370

Input:

    .TS
    allbox;
    c s s
    c c c
    n n n.
    AT&T Common Stock
    Year@Price@Dividend
    1971@41-54@$2.60
    2@41-54@2.70
    3@46-55@2.87
    4@40-53@3.24
    5@45-52@3.40
    6@51-59@.95*
    .TE
    * (first quarter only)

Output:

         AT&T Common Stock
    Year | Price | Dividend
    1971 | 41-54 | $2.60
       2 | 41-54 |  2.70
       3 | 46-55 |  2.87
       4 | 40-53 |  3.24
       5 | 45-52 |  3.40
       6 | 51-59 |   .95*
    * (first quarter only)

Input:

    .TS
    box;
    c s s
    c | c | c
    l | l | n.
    Major New York Bridges
    =
    Bridge@Designer@Length
    _
    Brooklyn@J. A. Roebling@1595
    Manhattan@G. Lindenthal@1470
    Williamsburg@L. L. Buck@1600
    _
    Queensborough@Palmer &@1182
    @ Hornbostel
    _
    @@1380
    Triborough@O. H. Ammann@_
    @@383
    _
    Bronx Whitestone@O. H. Ammann@2300
    Throgs Neck@O. H. Ammann@1800
    _
    George Washington@O. H. Ammann@3500
    .TE

Output:

             Major New York Bridges
    Bridge            | Designer       | Length
    Brooklyn          | J. A. Roebling | 1595
    Manhattan         | G. Lindenthal  | 1470
    Williamsburg      | L. L. Buck     | 1600
    Queensborough     | Palmer &       | 1182
                      |  Hornbostel    |
    Triborough        | O. H. Ammann   | 1380
                      |                |  383
    Bronx Whitestone  | O. H. Ammann   | 2300
    Throgs Neck       | O. H. Ammann   | 1800
    George Washington | O. H.
Ammann | 3500 Thl 5-123 Output: B S 15 6.5 A 23 s e Stack 2.1 QOutput: january january @ february @ march april @ may june @july @ Months august @september october @ november .TE @december | february april may june july august september october november march Months december 0-124 Tbl Input: Output: 1S Composition of Foods :i?;:s . Percent by Weight Composition of Foods “I& cless Food Protein | Fat Apples 4 5 Halibut 18.4 5.2 Carbo- hydrate 13.0 ¢cless Lima beans 7.5 ¢ .8 22.0 Milk 3.3 4.0 5.0 Mushrooms 3.3 4 lele le. Food @ Percent by Weight \"® Rye bread 9.0 .6 6.0 52.7 \" @ Protein @ Fat ® Carbo- \"@\" @\ " @hydrate T& | In In |n. Apples®.4®.5@13.0 Halibut®18.4®5.29. . . Lima beans®7.5®.8®22.0 Milk®3.3®4.0®5.0 Mushrooms ®3.5®@ .4®6.0 Rye bread @9.0® .6®52.7 .TE" Input: Output: 1S a;llbox; g z;w(sm ew(1i) Era | New York Area Rocks Formation Precambrian | Reading Prong 1p9 1p9 1p9. Paleozoic Manhattan Prong New York Area Rocks Mesozoic Ncwark Basin, Era @ Formatio ® Age n (years) incl. Stockton, Paleozoic @ Manhattan Prong @400 million sg’t?jxfz é‘;" Precamb @ rian Reading Prong ® > 1 billion Newark Basin, incl. Stockton, Lockatong, and Brunswick formations: also Watchungs | ’ T) ®200 million Cenozoic @ Coastal Plain @ T{ On Long Island 30,000 years: Cretaceous sediments redeposited by recent glaciation. .ad T} .TE | 400 million 200 million Lockatong, and fi:sozmc@'r( ' Age (years) > 1 billion Watchungs and , Cenozoic Palisades. — , Coastal Plain ?; Oolbong [sland Cretaceous sedi ment depo- sited by recone glaciation. Thl 5-125 Input: Output: -EQ Name Definition Gamma I"(:)-_I; r:=te~"dr Sine sin(x)--il-,,-(e“-e"") Error erf (:)--:/-z-_z _[; :e"'zdr delim SS .EN IS doublebox: cc 1. 
Name @ Definition Bessel | e -/o(Z)"‘f;L cos(zsind)de Zeta {(s)=3 k~* - .Sp (Res>1) (=) VS +2p Gam @ SGAMMA ma (z) = int sub 0 sup inf t sup {z-1} e sup -t dtS Sine @Ssin (x) = | over 2i (e sup ix - & sup -ix )S Error @S roman erf (z) = 2 over sqrt pi int sub 0 sup z e sup {-t sup 2} dt$S Bessel ®S J sub 0 (z) = 1 over pi int sub O sup pi cos ( z sin theta ) d theta S Zeta®S zeta (s) = sum from k=1 to inf k sup -s —~( Re"s > 1)$ VS <2p .TE Input: Output: TS box, tab(:): cbssss cp-2sssS clelelele r2lln2|n2|n2|n. " Readability of Text l:ne Width and Leading for 10-Point Type Line:Set: 1-Point: 2-Point : 4-Point Width : Solid: Leading : Leading : Leading 9 Pica:\<9.3:\-6.0:\-5.3:\-7.1 14 Pica:\-4.5:\-0.6:\-0.3:\-1.7 19 Pica:\-5.0:\-5.1 31 Pica:\-3.7:\-3.8:\-2.4:\-3.6 43 Pica:\-9.1:\-9.0:\ .TE Readability of Text Line Width und Leading for 10-Point Type Line || Set | 1-Point | 2-Point | 4-Point Width Il Solid | Leading | Leuding | Laading 9Picall =9.5| =60 | =53 | =71 14 Pical| -4.5| -=0.6 -0.3 -1.7 19 Pica || =5.0] =5.1 0.0 -2.0 ) 31 Pica || =3.7| 43 Pica ll =91 =38 —90 -4 —-59 -36 8.8 5-126 Thl Output: Input: TS Some London Transport Statistics cS (Year 1964) cip-2'§ I n an. Some London Transport Statistics (Year 1964) Railway route miles @ 244 Tube @ 66 Sub-surface @ 22 Surface @ 156 <D .Tp&. .S l’ c ar. Passenger traffic \- railway Journeys @ 674 million Average length®4.55 miles Passenger miles @ 3,066 million T& 'a'r Pas.senger traffic \-Cree road Journeys @ 2,252 million Average length @ 2.26 miles Passenger miles ® ' 5,094 million T& Railway route miles Tube Sub-surface Surface 244 66 22 156 . Pasjsenger traffic — railway ourneys Average length Passenger miles Passenger traffic - road Journeys Average length Passenger miles Vehicles 2,905 1,269 4,174 8,347 Staff Adr.nimsx.rauv?, etc. Civil engineering 73.739 3,582 5.134 Electrical en . Mech. eng. — railway Mech. eng. 
Vehicles ⓣ 12,521
Railway motor cars ⓣ 2,905
Railway trailer cars ⓣ 1,269
Total railway ⓣ 4,174
Omnibuses ⓣ 8,347
.T&
l n.
.sp .5
Staff ⓣ 73,739
Administrative, etc. ⓣ 5,582
Civil engineering ⓣ 5,134
Electrical eng. ⓣ 1,714
Mech. eng. \- railway ⓣ 4,310
Mech. eng. \- road ⓣ 9,152
Railway operations ⓣ 8,930
Road operations ⓣ 35,946
Other ⓣ 2,971
.TE

    Passenger traffic - railway
      Journeys                  674 million
      Average length            4.55 miles
      Passenger miles           3,066 million
    Passenger traffic - road
      Journeys                  2,252 million
      Average length            2.26 miles
      Passenger miles           5,094 million
    Vehicles                    12,521
      Railway motor cars         2,905
      Railway trailer cars       1,269
      Total railway              4,174
      Omnibuses                  8,347
    Staff                       73,739
      Administrative, etc.       5,582
      Civil engineering          5,134
      Electrical eng.            1,714
      Mech. eng. - railway       4,310
      Mech. eng. - road          9,152
      Railway operations         8,930
      Road operations           35,946
      Other                      2,971

Input:

.ps 8
.vs 10p
.TS
center box;
c s s
ci s s
c c c
lB l n.
New Jersey Representatives
(Democrats)
.sp .5
Name ⓣ Office address ⓣ Phone
.sp .5
James J. Florio ⓣ 23 S. White Horse Pike, Somerdale 08083 ⓣ 609-627-8222
William J. Hughes ⓣ 2920 Atlantic Ave., Atlantic City 08401 ⓣ 609-345-4844
James J. Howard ⓣ 801 Bangs Ave., Asbury Park 07712 ⓣ 201-774-1600
Frank Thompson, Jr. ⓣ 10 Rutgers Pl., Trenton 08618 ⓣ 609-599-1619
Andrew Maguire ⓣ 115 W. Passaic St., Rochelle Park 07662 ⓣ 201-843-0240
Robert A. Roe ⓣ U.S.P.O., 194 Ward St., Paterson 07510 ⓣ 201-523-5152
Henry Helstoski ⓣ 666 Paterson Ave., East Rutherford 07073 ⓣ 201-939-9090
Peter W. Rodino, Jr. ⓣ Suite 1435A, 970 Broad St., Newark 07102 ⓣ 201-645-3213
Joseph G. Minish ⓣ 308 Main St., Orange 07050 ⓣ 201-645-6363
Helen S. Meyner ⓣ 32 Bridge St., Lambertville 08530 ⓣ 609-397-1830
Dominick V. Daniels ⓣ 895 Bergen Ave., Jersey City 07306 ⓣ 201-659-7700
Edward J. Patten ⓣ Natl. Bank Bldg., Perth Amboy 08861 ⓣ 201-826-4610
.sp .5
.T&
ci s s
lB l n.
(Republicans)
.sp .5v
Millicent Fenwick ⓣ 41 N. Bridge St., Somerville 08876 ⓣ 201-722-8200
Edwin B. Forsythe ⓣ 301 Mill St., Moorestown 08057 ⓣ 609-235-6622
Matthew J. Rinaldo ⓣ 1961 Morris Ave., Union 07083 ⓣ 201-687-4235
.TE
.ps 10
.vs 12p

Output:

                      New Jersey Representatives
                             (Democrats)

    Name                  Office address                             Phone

    James J. Florio       23 S. White Horse Pike, Somerdale 08083    609-627-8222
    William J. Hughes     2920 Atlantic Ave., Atlantic City 08401    609-345-4844
    James J. Howard       801 Bangs Ave., Asbury Park 07712          201-774-1600
    Frank Thompson, Jr.   10 Rutgers Pl., Trenton 08618              609-599-1619
    Andrew Maguire        115 W. Passaic St., Rochelle Park 07662    201-843-0240
    Robert A. Roe         U.S.P.O., 194 Ward St., Paterson 07510     201-523-5152
    Henry Helstoski       666 Paterson Ave., East Rutherford 07073   201-939-9090
    Peter W. Rodino, Jr.  Suite 1435A, 970 Broad St., Newark 07102   201-645-3213
    Joseph G. Minish      308 Main St., Orange 07050                 201-645-6363
    Helen S. Meyner       32 Bridge St., Lambertville 08530          609-397-1830
    Dominick V. Daniels   895 Bergen Ave., Jersey City 07306         201-659-7700
    Edward J. Patten      Natl. Bank Bldg., Perth Amboy 08861        201-826-4610

                            (Republicans)

    Millicent Fenwick     41 N. Bridge St., Somerville 08876         201-722-8200
    Edwin B. Forsythe     301 Mill St., Moorestown 08057             609-235-6622
    Matthew J. Rinaldo    1961 Morris Ave., Union 07083              201-687-4235

This is a paragraph of normal text placed here only to indicate where the left and right margins are. In this way the reader can judge the appearance of centered tables or expanded tables, and observe how such tables are formatted.

Input:

.TS
expand;
c s s s
c c c c
l l n n.
Bell Labs Locations
Name ⓣ Address ⓣ Area Code ⓣ Phone
Holmdel ⓣ Holmdel, N. J. 07733 ⓣ 201 ⓣ 949-3000
Murray Hill ⓣ Murray Hill, N. J. 07974 ⓣ 201 ⓣ 582-6377
Whippany ⓣ Whippany, N. J. 07981 ⓣ 201 ⓣ 386-3000
Indian Hill ⓣ Naperville, Illinois 60540 ⓣ 312 ⓣ 690-2000
.TE

Output:

                            Bell Labs Locations
    Name           Address                       Area Code    Phone
    Holmdel        Holmdel, N. J. 07733             201       949-3000
    Murray Hill    Murray Hill, N. J. 07974         201       582-6377
    Whippany       Whippany, N. J. 07981            201       386-3000
    Indian Hill    Naperville, Illinois 60540       312       690-2000

Input:

.TS
box;
cb s s s
c | c | c s
ltiw(1i) | ltw(2i) | lp8 | lw(1.6i)p8.
Some Interesting Places
Name ⓣ Description ⓣ Practical Information
T{
American Museum of Natural History
T} ⓣ T{
The collections fill 11.5 acres (Michelin) or 25 acres (MTA) of exhibition halls on four floors. There is a full-sized replica of a blue whale and the world's largest star sapphire (stolen in 1964).
T} ⓣ Hours ⓣ 10-5, ex. Sun 11-5, Wed. to 9
\^ ⓣ \^ ⓣ Location ⓣ T{
Central Park West & 79th St.
T}
\^ ⓣ \^ ⓣ Admission ⓣ Donation: $1.00 asked
\^ ⓣ \^ ⓣ Subway ⓣ AA to 81st St.
\^ ⓣ \^ ⓣ Telephone ⓣ 212-873-4225
Bronx Zoo ⓣ T{
About a mile long and .6 mile wide, this is the largest zoo in America. A lion eats 18 pounds of meat a day while a sea lion eats 15 pounds of fish.
T} ⓣ Hours ⓣ T{
10-4:30 winter, to 5:00 summer
T}
\^ ⓣ \^ ⓣ Location ⓣ T{
185th St. & Southern Blvd, the Bronx.
T}
\^ ⓣ \^ ⓣ Admission ⓣ $1.00, but Tu,We,Th free
\^ ⓣ \^ ⓣ Subway ⓣ 2, 5 to East Tremont Ave.
\^ ⓣ \^ ⓣ Telephone ⓣ 212-933-1759
Brooklyn Museum ⓣ T{
Five floors of galleries contain American and ancient art. There are American period rooms and architectural ornaments saved from wreckers, such as a classical figure from Pennsylvania Station.
T} ⓣ Hours ⓣ Wed-Sat, 10-5, Sun 12-5
\^ ⓣ \^ ⓣ Location ⓣ T{
Eastern Parkway & Washington Ave., Brooklyn.
T}
\^ ⓣ \^ ⓣ Admission ⓣ Free
\^ ⓣ \^ ⓣ Subway ⓣ 2,3 to Eastern Parkway.
\^ ⓣ \^ ⓣ Telephone ⓣ 212-638-5000
T{
New-York Historical Society
T} ⓣ T{
All the original paintings for Audubon's Birds of America are here, as are exhibits of American decorative arts, New York history, Hudson River school paintings, carriages, and glass paperweights.
T} ⓣ Hours ⓣ T{
Tues-Fri & Sun, 1-5; Sat 10-5
T}
\^ ⓣ \^ ⓣ Location ⓣ T{
Central Park West & 77th St.
T}
\^ ⓣ \^ ⓣ Admission ⓣ Free
\^ ⓣ \^ ⓣ Subway ⓣ AA to 81st St.
\^ ⓣ \^ ⓣ Telephone ⓣ 212-873-3400
.TE

Output:

                              Some Interesting Places
    Name              | Description                       | Practical Information
    American Museum   | The collections fill 11.5 acres   | Hours     | 10-5, ex. Sun 11-5, Wed. to 9
    of Natural        | (Michelin) or 25 acres (MTA) of   | Location  | Central Park West & 79th St.
    History           | exhibition halls on four floors.  | Admission | Donation: $1.00 asked
                      | There is a full-sized replica of  | Subway    | AA to 81st St.
                      | a blue whale and the world's      | Telephone | 212-873-4225
                      | largest star sapphire (stolen in  |
                      | 1964).                            |
    Bronx Zoo         | About a mile long and .6 mile     | Hours     | 10-4:30 winter, to 5:00 summer
                      | wide, this is the largest zoo in  | Location  | 185th St. & Southern Blvd, the
                      | America. A lion eats 18 pounds    |           | Bronx.
                      | of meat a day while a sea lion    | Admission | $1.00, but Tu,We,Th free
                      | eats 15 pounds of fish.           | Subway    | 2, 5 to East Tremont Ave.
                      |                                   | Telephone | 212-933-1759
    Brooklyn Museum   | Five floors of galleries contain  | Hours     | Wed-Sat, 10-5, Sun 12-5
                      | American and ancient art. There   | Location  | Eastern Parkway & Washington
                      | are American period rooms and     |           | Ave., Brooklyn.
                      | architectural ornaments saved     | Admission | Free
                      | from wreckers, such as a          | Subway    | 2,3 to Eastern Parkway.
                      | classical figure from             | Telephone | 212-638-5000
                      | Pennsylvania Station.             |
    New-York          | All the original paintings for    | Hours     | Tues-Fri & Sun, 1-5; Sat 10-5
    Historical        | Audubon's Birds of America are    | Location  | Central Park West & 77th St.
    Society           | here, as are exhibits of American | Admission | Free
                      | decorative arts, New York         | Subway    | AA to 81st St.
                      | history, Hudson River school      | Telephone | 212-873-3400
                      | paintings, carriages, and glass   |
                      | paperweights.                     |

Acknowledgments.

Many thanks are due to J. C. Blinn, who has done a large amount of testing and assisted with the design of the program. He has also written many of the more intelligible sentences in this document and helped edit all of it. All phototypesetting programs on UNIX are dependent on the work of the late J. F. Ossanna, whose assistance with this program in particular had been most helpful. This program is patterned on a table formatter originally written by J. F. Gimpel. The assistance of T. A. Dolotta, B. W. Kernighan, and J. N. Sturman is gratefully acknowledged.

References.

[1] J. F.
Ossanna, NROFF/TROFF User's Manual, Computing Science Technical Report No. 54, Bell Laboratories, 1976.
[2] K. Thompson and D. M. Ritchie, "The UNIX Time-Sharing System," Comm. ACM 17, pp. 365-75 (1974).
[3] B. W. Kernighan and L. L. Cherry, "A System for Typesetting Mathematics," Comm. ACM 18, pp. 151-57 (1975).
[4] M. E. Lesk, Typing Documents on UNIX, UNIX Programmer's Manual, Volume 2.
[5] M. E. Lesk and B. W. Kernighan, Computer Typesetting of Technical Journals on UNIX, Proc. AFIPS NCC, vol. 46, pp. 879-888 (1977).
[6] J. R. Mashey and D. W. Smith, "Documentation Tools and Techniques," Proc. 2nd Int. Conf. on Software Engineering, pp. 177-181 (October, 1976).

List of Tbl Command Characters and Words

    Command      Meaning
    a A          Alphabetic subcolumn
    allbox       Draw box around all items
    b B          Boldface item
    box          Draw box around table
    c C          Centered column
    center       Center table in page
    doublebox    Doubled box around table
    e E          Equal width columns
    expand       Make table full line width
    f F          Font change
    i I          Italic item
    l L          Left adjusted column
    n N          Numerical column
    nnn          Column separation
    p P          Point size change
    r R          Right adjusted column
    s S          Spanned item
    t T          Vertical spanning at top
    tab (x)      Change data separator character
    T{ T}        Text block
    v V          Vertical spacing change
    w W          Minimum width value
    .xx          Included troff command
    |            Vertical line
    ||           Double vertical line
    ^            Vertical span
    \^           Vertical span
    =            Double horizontal line
    _            Horizontal line
    \_           Short horizontal line
    \Rx          Repeat character


Refer - A Bibliography System

Bill Tuthill
Computing Services
University of California
Berkeley, CA 94720

Introduction

Taken together, the refer programs constitute a database system for use with variable-length information. To distinguish various types of bibliographic material, the system uses labels composed of upper case letters, preceded by a percent sign and followed by a space.
For example, one document might be given this entry:

    %A Joel Kies
    %T Document Formatting on Unix Using the -ms Macros
    %I Computing Services
    %C Berkeley
    %D 1980

Each line is called a field, and lines grouped together are called a record; records are separated from each other by a blank line. Bibliographic information follows the labels, containing data to be used by the refer system. The order of fields is not important, except that authors should be entered in the same order as they are listed on the document. Fields can be as long as necessary, and may even be continued on the following line(s).

The labels are meaningful to nroff/troff macros, and, with a few exceptions, the refer program itself does not pay attention to them. This implies that you can change the label codes, if you also change the macros used by nroff/troff. The macro package takes care of details like proper ordering, underlining the book title or journal name, and quoting the article's title. Here are the labels used by refer, with an indication of what they represent:

    %H   Header commentary, printed before reference
    %A   Author's name
    %Q   Corporate or foreign author (unreversed)
    %T   Title of article or book
    %S   Series title
    %J   Journal containing article
    %B   Book containing article
    %R   Report, paper, or thesis (for unpublished material)
    %V   Volume
    %N   Number within volume
    %E   Editor of book containing article
    %P   Page number(s)
    %I   Issuer (publisher)
    %C   City where published
    %D   Date of publication
    %O   Other commentary, printed at end of reference
    %K   Keywords used to locate reference
    %L   Label used by -k option of refer
    %X   Abstract (used by roffbib, not by refer)

Only relevant fields should be supplied. Except for %A, each field should be given only once; in the case of multiple authors, the senior author should come first. The %Q is for organizational authors, or authors with Japanese or Arabic names, in which cases the order of names should be preserved.
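The record layout described above (one labeled field per line, continuation lines allowed, a blank line between records) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name parse_records is hypothetical, it is not part of the refer programs, and it treats any line that does not begin with a percent sign as a continuation of the previous field.

```python
def parse_records(text):
    """Split a refer-style database into records.

    Each record is a list of (label, value) pairs; a list is used rather
    than a dict because %A may legitimately appear more than once.
    Assumption: a non-blank line not starting with '%' continues the
    previous field.
    """
    records = []
    current = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the record being read, if any.
            if current:
                records.append(current)
                current = []
        elif line.startswith("%") and len(line) >= 2:
            # Label is the letter after '%'; the value follows the space.
            current.append((line[1], line[2:].strip()))
        elif current:
            # Continuation line: append to the value of the last field.
            label, value = current[-1]
            current[-1] = (label, value + " " + line.strip())
    if current:
        records.append(current)
    return records


sample = (
    "%A Joel Kies\n"
    "%T Document Formatting on Unix\n"
    "Using the -ms Macros\n"
    "%I Computing Services\n"
    "%C Berkeley\n"
    "%D 1980\n"
    "\n"
    "%A Mike E. Lesk\n"
    "%D 1978\n"
)
records = parse_records(sample)
```

Keeping the fields as ordered pairs preserves the order of multiple %A lines, which matters because authors are to be entered in the order in which they appear on the document.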
Books should be labeled with the %T, not with the %B, which is reserved for books containing articles. The %J and %B fields should never appear together, although if they do, the %J will override the %B. If there is no author, just an editor, it is best to type the editor in the %A field, as in this example:

    %A Bertrand Bronson, ed.

The %E field is used for the editor of a book (%B) containing an article, which has its own author. For unpublished material such as theses, use the %R field; the title in the %T field will be quoted, but the contents of the %R field will not be underlined. Unlike other fields, %H, %O, and %X should contain their own punctuation. Here is a modest example:

    %A Mike E. Lesk
    %T Some Applications of Inverted Indexes on the Unix System
    %B Unix Programmer's Manual
    %I Bell Laboratories
    %C Murray Hill, NJ
    %D 1978
    %V 2a
    %K refer mkey inv hunt
    %X Difficult to read paper that dwells on indexing strategies,
    giving little practical advice about using \fBrefer\fP.

Note that the author's name is given in normal order, without inverting the surname; inversion is done automatically, except when %Q is used instead of %A. We use %X rather than %O for the commentary because we do not want the comment printed all the time. The %O and %H fields are printed by both refer and roffbib; the %X field is printed only by roffbib, as a detached annotation paragraph.

Data Entry with Addbib

The addbib program is for creating and extending bibliographic databases. You must give it the filename of your bibliography:

    % addbib database

Every time you enter addbib, it asks if you want instructions. To get them, type y; to skip them, type RETURN. Addbib prompts for various fields, reads from the keyboard, and writes records containing the refer codes to the database. After finishing a field entry, you should end it by typing RETURN.
If a field is too long to fit on a line, type a backslash (\) at the end of the line, and you will be able to continue on the following line. Note: the backslash works in this capacity only inside addbib.

A field will not be written to the database if nothing is entered into it. Typing a minus sign as the first character of any field will cause addbib to back up one field at a time. Backing up is the best way to add multiple authors, and it really helps if you forget to add something important. Fields not contained in the prompting skeleton may be entered by typing a backslash as the last character before RETURN. The following line will be sent verbatim to the database and addbib will resume with the next field. This is identical to the procedure for dealing with long fields, but with new fields, don't forget the % key-letter.

Finally, you will be asked for an abstract (or annotation), which will be preserved as the %X field. Type in as many lines as you need, and end with a control-D (hold down the CTRL button, then press the "d" key). This prompting for an abstract can be suppressed with the -a command line option.

After one bibliographic record has been completed, addbib will ask if you want to continue. If you do, type RETURN; to quit, type q or n (quit or no). It is also possible to use one of the system editors to correct mistakes made while entering data. After the "Continue?" prompt, type any of the following: edit, ex, vi, or ed. You will be placed inside the corresponding editor, and returned to addbib afterwards, from where you can either quit or add more data.

If the prompts normally supplied by addbib are not enough, are in the wrong order, or are too numerous, you can redefine the skeleton by constructing a promptfile. Create some file, to be named after the -p command line option. Place the prompts you want on the left side, followed by a single TAB (control-I), then the refer code that is to appear in the bibliographic database.
Addbib will send the left side to the screen, and the right side, along with data entered, to the database.

Printing the Bibliography

Sortbib is for sorting the bibliography by author (%A) and date (%D), or by data in other fields. It is quite useful for producing bibliographies and annotated bibliographies, which are seldom entered in strict alphabetical order. It takes as arguments the names of up to 16 bibliography files, and sends the sorted records to standard output (the terminal screen), which may be redirected through a pipe or into a file.

The -sKEYS flag to sortbib will sort by fields whose key-letters are in the KEYS string, rather than merely by author and date. Key-letters in KEYS may be followed by a '+' to indicate that all such fields are to be used. The default is to sort by senior author and date (printing the senior author last name first), but -sA+D will sort by all authors and then date, and -sATD will sort on senior author, then title, and then date.

Roffbib is for running off the (probably sorted) bibliography. It can handle annotated bibliographies; annotations are entered in the %X (abstract) field. Roffbib is a shell script that calls refer -B and nroff -mbib. It uses the macro definitions that reside in /usr/lib/tmac/tmac.bib, which you can redefine if you know nroff and troff. Note that refer will print the %H and %O commentaries, but will ignore abstracts in the %X field; roffbib will print both fields, unless annotations are suppressed with the -x option.

The following command sequence will lineprint the entire bibliography, organized alphabetically by author and date:

    % sortbib database | roffbib | lpr

This is a good way to proofread the bibliography, or to produce a stand-alone bibliography at the end of a paper. Incidentally, roffbib accepts all flags used with nroff.
For example:

    % sortbib database | roffbib -Tdtc -s1

will make accent marks work on a DTC daisy-wheel printer, and stop at the bottom of every page for changing paper. The -n and -o flags may also be quite useful, to start page numbering at a selected point, or to produce only specific pages.

Roffbib understands four command-line number registers, which are something like the two-letter number registers in -ms. The -rN1 argument will number references beginning at one (1); use another number to start somewhere besides one. The -rV2 flag will double-space the entire bibliography, while -rV1 will double-space the references, but single-space the annotation paragraphs. Finally, specifying -rL6i changes the line length from 6.5 inches to 6 inches, and saying -rO1i sets the page offset to one inch, instead of zero. (That's a capital O after -r, not a zero.)

Citing Papers with Refer

The refer program normally copies input to output, except when it encounters an item of the form:

    .[
    partial citation
    .]

The partial citation may be just an author's name and a date, or perhaps a title and a keyword, or maybe just a document number. Refer looks up the citation in the bibliographic database, and transforms it into a full, properly formatted reference. If the partial citation does not correctly identify a single work (either finding nothing, or more than one reference), a diagnostic message is given. If nothing is found, it will say "No such paper." If more than one reference is found, it will say "Too many hits." Other diagnostic messages can be quite cryptic; if you are in doubt, use checknr to verify that all your .['s have matching .]'s.

When everything goes well, the reference will be brought in from the database, numbered, and placed at the bottom of the page. This citation,[1] for example, was produced by:

    This citation,
    .[
    lesk inverted indexes
    .]
    for example, was produced by

The .[ and .]
markers, in essence, replace the .FS and .FE of the -ms macros, and also provide a numbering mechanism. Footnote numbers will be bracketed on the lineprinter, but superscripted on daisy-wheel terminals and in troff. In the reference itself, articles will be quoted, and books and journals will be underlined in nroff, and italicized in troff.

Sometimes you need to cite a specific page number along with more general bibliographic material. You may have, for instance, a single document that you refer to several times, each time giving a different page citation. This is how you could get "p. 10" in the reference:

    .[
    kies document formatting
    %P 10
    .]

The first line, a partial citation, will find the reference in your bibliography. The second line will insert the page number into the final citation. Ranges of pages may be specified as "%P 56-78".

When the time comes to run off a paper, you will need to have two files: the bibliographic database, and the paper to format. Use a command line something like one of these:

    % refer -p database paper | nroff -ms
    % refer -p database paper | tbl | nroff -ms
    % refer -p database paper | tbl | neqn | nroff -ms

If other preprocessors are used, refer should precede tbl, which must in turn precede eqn or neqn. The -p option specifies a "private" database, which most bibliographies are.

Refer's Command-line Options

Many people like to place references at the end of a chapter, rather than at the bottom of the page. The -e option will accumulate references until a macro sequence of the form

    .[
    $LIST$
    .]

is encountered (or until the end of file). Refer will then write out all references collected up to that point, collapsing identical references. Warning: there is a limit (currently 200) on the number of references that can be accumulated at one time.

It is also possible to sort references that appear at the end of text.
The -sKEYS flag will sort references by fields whose key-letters are in the KEYS string, and permute reference numbers in the text accordingly. It is unnecessary to use -e with it, since -s implies -e. Key-letters in KEYS may be followed by a '+' to indicate that all such fields are to be used. The default is to sort by senior author and date, but -sA+D will sort on all authors and then date, and -sA+T will sort by authors and then title.

Refer can also make citations in what is known as the Social or Natural Sciences format. Instead of numbering references, the -l (letter ell) flag makes labels from the senior author's last name and the year of publication. For example, a reference to the paper on Inverted Indexes cited above might appear as [Lesk1978a]. It is possible to control the number of characters in the last name, and the number of digits in the date. For instance, the command line argument -l6,2 might produce a reference such as [Kernig78c].

Some bibliography standards shun both footnote numbers and labels composed of author and date, requiring some keyword to identify the reference. The -k flag indicates that, instead of numbering references, key labels specified on the %L line should be used to mark references.

The -n flag means to not search the default reference file, located in /usr/dict/papers/Rv7man. Using this flag may make refer marginally faster. The -an flag will reverse the first n author names, printing Jones, J. A. instead of J. A. Jones. Often -a1 is enough; this will reverse the names of only the senior author. In some versions of refer there is also the -f flag to set the footnote number to some predetermined value; for example, -f23 would start numbering with footnote 23.

Making an Index

Once your database is large and relatively stable, it is a good idea to make an index to it, so that references can be found quickly and efficiently.
The indxbib program makes an inverted index to the bibliographic database (this program is called pubindex in the Bell Labs manual). An inverted index could be compared to the thumb cuts of a dictionary: instead of going all the way through your bibliography, programs can move to the exact location where a citation is found.

Indxbib itself takes a while to run, and you will need sufficient disk space to store the indexes. But once it has been run, access time will improve dramatically. Furthermore, large databases of several million characters can be indexed with no problem. The program is exceedingly simple to use:

    % indxbib database

Be aware that changing your database will require that you run indxbib over again. If you don't, you may fail to find a reference that really is in the database.

Once you have built an inverted index, you can use lookbib to find references in the database. Lookbib cannot be used until you have run indxbib. When editing a paper, lookbib is very useful to make sure that a citation can be found as specified. It takes one argument, the name of the bibliography, and then reads partial citations from the terminal, returning references that match, or nothing if none match. Its prompt is the greater-than sign.

    % lookbib database
    > lesk inverted indexes
    %A Mike E. Lesk
    %T Some Applications of Inverted Indexes on the Unix System
    %J Unix Programmer's Manual
    %I Bell Laboratories
    %C Murray Hill, NJ
    %D 1978
    %V 2a
    %X Difficult to read paper that dwells on indexing strategies,
    giving little practical advice about using \fBrefer\fP.
    >

If more than one reference comes back, you will have to give a more precise citation for refer. Experiment until you find something that works; remember that it is harmless to overspecify. To get out of the lookbib program, type a control-D alone on a line; lookbib then exits with an "EOT" message.

Lookbib can also be used to extract groups of related citations.
For example, to find all the papers by Brian Kernighan found in the system database, and send the output to a file, type:

    % lookbib /usr/dict/papers/Ind > kern.refs
    > kernighan
    > EOT
    % cat kern.refs

Your file, "kern.refs", will be full of references. A similar procedure can be used to pull out all papers of some date, all papers from a given journal, all papers containing a certain group of keywords, etc.

Refer Bugs and Some Solutions

The refer program will mess up if there are blanks at the end of lines, especially the %A author line. Addbib carefully removes trailing blanks, but they may creep in again during editing. Use an editor command, g/ *$/s///, to remove trailing blanks from your bibliography.

Having bibliographic fields passed through as string definitions implies that interpolated strings (such as accent marks) must have two backslashes, so they can pass through copy mode intact. For instance, the word "téléphone" would have to be represented:

    te\\*'le\\*'phone

in order to come out correctly. In the %X field, by contrast, you will have to use single backslashes instead. This is because the %X field is not passed through as a string, but as the body of a paragraph macro.

Another problem arises from authors with foreign names. When a name like "Valéry Giscard d'Estaing" is turned around by the -a option of refer, it will appear as "d'Estaing, Valéry Giscard," rather than as "Giscard d'Estaing, Valéry." To prevent this, enter names as follows:

    %A Vale\\*'ry Giscard\0d'Estaing
    %A Alexander Csoma\0de\0Ko\\*:ro\\*:s

(The second is the name of a famous Hungarian linguist.) The backslash-zero is an nroff/troff request meaning to insert a digit-width space. It will protect against faulty name reversal, and also against mis-sorting.

Footnote numbers are placed at the end of the line before the .[ macro. This line should be a line of text, not a macro.
As an example, if the line before the .[ is a .R macro, then the .R will eat the footnote number. (The .R is an -ms request meaning change to Roman font.) In cases where the font needs changing, it is necessary to do the following:

    \fIet al.\fR
    .[
    awk aho kernighan weinberger
    .]

Now the reference will be to Aho et al.[2] The \fI changes to italics, and the \fR changes back to Roman font. Both these requests are nroff/troff requests, not part of -ms. If and when a footnote number is added after this sequence, it will indeed appear in the output.

Internal Details of Refer

You have already read everything you need to know in order to use the refer bibliography system. The remaining sections are provided only for extra information, and in case you need to change the way refer works.

The output of refer is a stream of string definitions, one for each field in a reference. To create string names, percent signs are simply changed to an open bracket, and an [F string is added, containing the footnote number. The %X, %Y and %Z fields are ignored; however, the annobib program changes the %X to an .AP (annotation paragraph) macro. The citation used above yields this intermediate output:

    .ds [F 1
    .]-
    .ds [A Mike E. Lesk
    .ds [T Some Applications of Inverted Indexes on the Unix System
    .ds [J Unix Programmer's Manual
    .ds [I Bell Laboratories
    .ds [C Murray Hill, NJ
    .ds [D 1978
    .ds [V 2a
    .nr [T 0
    .nr [A 0
    .nr [O 0
    .][ 1 journal-article

These string definitions are sent to nroff, which can use the -ms macros defined in /usr/lib/mx/tmac.xref to take care of formatting things properly. The initializing macro .]- precedes the string definitions, and the labeled macro .][ follows. These are changed from the input .[ and .] so that running a file twice through refer is harmless.

The .][ macro, used to print the reference, is given a type-number argument, which is a numeric label indicating the type of reference involved.
Here is a list of the various kinds of references:

    Field     Value   Kind of Reference
    %J        1       Journal Article
    %B        3       Article in Book
    %R %G     4       Report, Government Report
    %I        2       Book
    %M        5       Bell Labs Memorandum (undefined)
    none      0       Other

The order listed above is indicative of the precedence of the various fields. In other words, a reference that has both the %J and %B fields will be classified as a journal article. If none of the fields listed is present, then the reference will be classified as "other."

The footnote number is flagged in the text with the following sequence, where number is the footnote number:

    \*([.number\*(.]

The \*([. and \*(.] stand for bracketing or superscripting. In nroff with low-resolution devices such as the lpr and a crt, footnote numbers will be bracketed. In troff, or on daisy-wheel printers, footnote numbers will be superscripted. Punctuation normally comes before the reference number; this can be changed by using the -P (postpunctuation) option of refer.

In some cases, it is necessary to override certain fields in a reference. For instance, each time a work is cited, you may want to specify different page numbers, and you may want to change certain fields. This citation will find the Lesk reference, but will add specific page numbers to the output, even though no page numbers appeared in the original reference.

    .[
    lesk inverted indexes
    %P 7-13
    %I Computing Services
    %O UNX 12.2.2.
    .]

The %I line will also override any previous publisher information, and the %O line will append some commentary. The refer program simply adds the new %P, %I, and %O strings to the output, and later string definitions cancel earlier ones.

It is also possible to insert an entire citation that does not appear in the bibliographic database.
This reference, for example, could be added as follows:

    .[
    %A Brian Kernighan
    %T A Troff Tutorial
    %I Bell Laboratories
    %D 1978
    .]

This will cause refer to interpret the fields exactly as given, without searching the bibliographic database. This practice is not recommended, however, because it's better to add new references to the database, so they can be used again later.

If you want to change the way footnote numbers are printed, signals can be given on the .[ and .] lines. For example, to say "See reference (2)," the citation should appear as:

    See reference
    .[(
    partial citation
    .]),

Note that blanks are significant on these signal lines. If a permanent change in the footnote format is desired, it's best to redefine the [. and .] strings.

Changing the Refer Macros

This section is provided for those who wish to rewrite or modify the refer macros. This is necessary in order to make output correspond to specific journal requirements, or departmental standards. First there is an explanation of how new macros can be substituted for the old ones. Then several alterations are given as examples. Finally, there is an annotated copy of the refer macros used by roffbib.

The refer macros for nroff/troff supplied by the -ms macro package reside in /usr/lib/mx/tmac.xref; they are reference macros, for producing footnotes or endnotes. The refer macros used by roffbib, on the other hand, reside in /usr/lib/tmac/tmac.bib; they are for producing a stand-alone bibliography.

To change the macros used by roffbib, you will need to get your own version of this shell script into the directory where you are working. These two commands will get you a copy of roffbib and the macros it uses:

    % cp /usr/lib/tmac/tmac.bib bibmac

You can proceed to change bibmac as much as you like.
Then when you use roffbib, you should specify your own version of the macros, which will be substituted for the normal ones:

    % roffbib -m bibmac filename

where filename is the name of your bibliography file. Make sure there's a space between -m and bibmac.

If you want to modify the refer macros for use with nroff and the -ms macros, you will need to get a copy of "tmac.xref":

    % cp /usr/lib/ms/s.ref refmac

These macros are much like "bibmac", except they have .FS and .FE requests, to be used in conjunction with the -ms macros, rather than independently defined .XP and .AP requests. Now you can put this line at the top of the paper to be formatted:

    .so refmac

Your new refer macros will override the definitions previously read in by the -ms package. This method works only if "refmac" is in the working directory.

Suppose you didn't like the way dates are printed, and wanted them to be parenthesized, with no comma before. There are five identical lines you will have to change. The first line below is the old way, while the second is the new way:

    .if !"\\*([D"" , \\*([D\c
    .if !"\\*([D"" \& (\\*([D)\c

In the first line, there is a comma and a space, but no parentheses. The "\c" at the end of each line indicates to nroff that it should continue, leaving no extra space in the output. The "\&" in the second line is the do-nothing character; when followed by a space, a space is sent to the output.

If you need to format a reference in the style favored by the Modern Language Association or Chicago University Press, in the form (city: publisher, date), then you will have to change the middle of the book macro [2 as follows:

    \& (\c
    .if !"\\*([C"" \\*([C:
    \\*([I\c
    .if !"\\*([D"" , \\*([D\c
    )\c

This would print (Berkeley: Computing Services, 1982) if all three strings were present.
The first line prints a space and a parenthesis; the second prints the city (and a colon) if present; the third always prints the publisher (books must have a publisher, or else they're classified as other); the fourth line prints a comma and the date if present; and the fifth line closes the parentheses. You would need to make similar changes to the other macros as well.

Acknowledgements

Mike Lesk of Bell Laboratories wrote the original refer software, including the indexing programs. Al Stangenberger of the Forestry Department wrote the first version of addbib, then called bibin. Greg Shenaut of the Linguistics Department wrote the original versions of sortbib and roffbib. All these contributions are greatly appreciated.

Some Applications of Inverted Indexes on the UNIX System

M. E. Lesk
Bell Laboratories
Murray Hill, New Jersey 07974

1. Introduction.

The UNIX† system has many utilities (e.g. grep, awk, lex, egrep, fgrep, ...) to search through files of text, but most of them are based on a linear scan through the entire file, using some deterministic automaton. This memorandum discusses a program which uses inverted indexes¹ and can thus be used on much larger data bases.

As with any indexing system, of course, there are some disadvantages; once an index is made, the files that have been indexed can not be changed without remaking the index. Thus applications are restricted to those making many searches of relatively stable data. Furthermore, these programs depend on hashing, and can only search for exact matches of whole keywords. It is not possible to look for arithmetic or logical expressions (e.g. "date greater than 1970") or for regular expression searching such as that in lex.²

Currently there are two uses of this software, the refer preprocessor to format references, and the lookall command to search through all text files on the UNIX system.

The remaining sections of this memorandum discuss the searching programs and their uses. Section 2 explains the operation of the searching algorithm and describes the data collected for use with the lookall command. The more important application, refer, has a user's description in section 3. Section 4 goes into more detail on reference files for the benefit of those who wish to add references to data bases or write new troff macros for use with refer. The options to make refer collect identical citations, or otherwise relocate and adjust references, are described in section 5. The UNIX manual sections for refer, lookall, and associated commands are attached as appendices.

† UNIX is a trademark of Bell Laboratories.
¹ D. Knuth, The Art of Computer Programming: Vol. 3, Sorting and Searching, Addison-Wesley, Reading, Mass., 1977. See section 6.5.
² M. E. Lesk, "Lex - A Lexical Analyzer Generator," Comp. Sci. Tech. Rep. No. 39, Bell Laboratories, Murray Hill, New Jersey, October 1975.

2. Searching.

The indexing and searching process is divided into two phases, each made of two parts. These are shown below.

A. Construct the index.
   (1) Find keys: turn the input files into a sequence of tags and keys, where each tag identifies a distinct item in the input and the keys for each such item are the strings under which it is to be indexed.
   (2) Hash and sort: prepare a set of inverted indexes from which, given a set of keys, the appropriate item tags can be found quickly.

B. Retrieve an item in response to a query.
   (3) Search: given some keys, look through the files prepared by the hashing and sorting facility and derive the appropriate tags.
   (4) Deliver: given the tags, find the original items. This completes the searching process.

The first phase, making the index, is presumably done relatively infrequently. It should, of course, be done whenever the data being indexed change. In contrast, the second phase, retrieving items, is presumably done often, and must be rapid.

An effort is made to separate code which depends on the data being handled from code which depends on the searching procedure. The search algorithm is involved only in programs (2) and (3), while knowledge of the actual data files is needed only by programs (1) and (4). Thus it is easy to adapt to different data files or different search algorithms.

To start with, it is necessary to have some way of selecting or generating keys from input files. For dealing with files that are basically English, we have a key-making program which automatically selects words and passes them to the hashing and sorting program (step 2). The format used has one line for each input item, arranged as follows:

    name:start,length (tab) key1 key2 key3 ...

where name is the file name, start is the starting byte number, and length is the number of bytes in the entry.

These lines are the only input used to make the index. The first field (the file name, byte position, and byte count) is the tag of the item and can be used to retrieve it quickly. Normally, an item is either a whole file or a section of a file delimited by blank lines. After the tab, the second field contains the keys. The keys, if selected by the automatic program, are any alphanumeric strings which are not among the 100 most frequent words in English and which are not entirely numeric (except for four-digit numbers beginning 19, which are accepted as dates). Keys are truncated to six characters and converted to lower case. Some selection is needed if the original items are very large. We normally just take the first n keys, with n less than 100 or so; this replaces any attempt at intelligent selection. One file in our system is a complete English dictionary; it would presumably be retrieved for all queries.
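The key-selection rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the actual key-making program: the function names are hypothetical, and the short COMMON_WORDS set is an assumed stand-in for the real list of the 100 most frequent English words.

```python
import re

# Assumed stand-in for the list of the most frequent English words.
COMMON_WORDS = {"the", "of", "and", "a", "to", "in", "is", "for", "on", "with"}


def make_keys(text, max_keys=100):
    """Return the index keys for one item, following the rules in the text."""
    keys = []
    for word in re.findall(r"[A-Za-z0-9]+", text):
        word = word.lower()                  # keys are converted to lower case
        if word in COMMON_WORDS:
            continue
        # Entirely numeric strings are rejected, except four-digit
        # numbers beginning 19, which are accepted as dates.
        if word.isdigit() and not (len(word) == 4 and word.startswith("19")):
            continue
        keys.append(word[:6])                # keys are truncated to six characters
        if len(keys) == max_keys:            # just take the first n keys
            break
    return keys


def key_line(name, start, text):
    """Format one output line: the tag, a tab, then the keys."""
    length = len(text.encode("utf-8"))       # length of the entry in bytes
    return f"{name}:{start},{length}\t{' '.join(make_keys(text))}"
```

For instance, key_line("refs", 0, "Sorting and Searching") yields the tag refs:0,21 followed by a tab and the keys "sortin search".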
To generate an inverted index to the list of record tags and keys, the keys are hashed and sorted to produce an index. What is wanted, ideally, is a series of lists showing the tags associated with each key. To condense this, what is actually produced is a list showing the tags associated with each hash code, and thus with some set of keys. To speed up access and further save space, a set of three or possibly four files is produced. These files are:

    File      Contents
    entry     Pointers to posting file for each hash code
    posting   Lists of tag pointers for each hash code
    tag       Tags for each item
    key       Keys for each item (optional)

The posting file comprises the real data: it contains a sequence of lists of items posted under each hash code. To speed up searching, the entry file is an array of pointers into the posting file, one per potential hash code. Furthermore, the items in the lists in the posting file are not referred to by their complete tag, but just by an address in the tag file, which gives the complete tags. The key file is optional and contains a copy of the keys used in the indexing.

The searching process starts with a query, containing several keys. The goal is to obtain all items which were indexed under these keys. The query keys are hashed, and the pointers in the entry file used to access the lists in the posting file. These lists are addresses in the tag file of documents posted under the hash codes derived from the query. The common items from all lists are determined; this must include the items indexed by every key, but may also contain some items which are false drops, since items referenced by the correct hash codes need not actually have contained the correct keys. Normally, if there are several keys in the query, there are not likely to be many false drops in the final combined list even though each hash code is somewhat ambiguous.
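As a rough illustration of the entry, posting, and tag files and of the search just described, here is a minimal in-memory Python sketch. It is not the real implementation: the hash function, the list-based "files", and the function names are all assumptions made for the example, and the final key comparison stands in for whatever check is used to discard false drops.

```python
HASH_SIZE = 997  # a prime table size, chosen here only for illustration


def hash_key(key):
    """Toy hash function (an assumption; the real one is not described)."""
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) % HASH_SIZE
    return h


def build_index(items):
    """items is a list of (tag, keys) pairs; returns (entry, posting, tags)."""
    tags = [tag for tag, _ in items]              # the tag file
    buckets = [[] for _ in range(HASH_SIZE)]
    for addr, (_, keys) in enumerate(items):
        for h in {hash_key(k) for k in keys}:
            buckets[h].append(addr)               # post the tag address, not the tag
    entry, posting = [], []
    for bucket in buckets:
        entry.append(len(posting))                # pointer into the posting file
        posting.extend(bucket)
    entry.append(len(posting))                    # sentinel: end of the last list
    return entry, posting, tags


def search(query_keys, index, items):
    """Intersect the posting lists for all query keys, then drop false drops."""
    entry, posting, tags = index
    candidates = None
    for key in query_keys:
        h = hash_key(key)
        found = set(posting[entry[h]:entry[h + 1]])
        candidates = found if candidates is None else candidates & found
    hits = []
    for addr in sorted(candidates or ()):
        # A candidate shares every hash code, but may not hold every key.
        if all(k in items[addr][1] for k in query_keys):
            hits.append(tags[addr])
    return hits
```

Posting tag addresses rather than full tags keeps the posting lists small, which is the space saving the text describes.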
The actual tags are then obtained from the tag file, and to guard against the possibility that an item has false-dropped on some hash code in the query, the original items are normally obtained from the delivery program (4) and the query keys checked against them by string comparison. Usually, therefore, the check for bad drops is made against the original file. However, if the key derivation procedure is complex, it may be preferable to check against the keys fed to program (2). In this case the optional key file which contains the keys associated with each item is generated, and the item tag is supplemented by a string | -start,length which indicates the starting byte number in the key file and the length of the string of keys for each item. This file is not usually necessary with the present key-selection program, since the keys always appear in the original document. There is also an option (-Cn) for coordination level searching. This retrieves items which match all but n of the query keys. The items are retrieved in the order of the number of keys that they match. Of course, n must be less than the number of query keys (nothing is retrieved unless it matches at least one key). As an example, consider one set of 4377 references, comprising 660,000 bytes. This included 51,000 keys, of which 5,900 were distinct keys. The hash table is kept full to save space (at the expense of time); 995 of 997 possible hash codes were used. The total set of index files (no key file) included 171,000 bytes, about 26% of the original file size. It took 8 minutes of processor time to hash, sort, and write the index. To search for a single query with the resulting index took 1.9 seconds of processor time, while to find the same paper with a sequential linear search using grep (reading all of the tags and keys) took 12.3 seconds of processor time. We have also used this software to index all of the English stored on our UNIX system. This is the index searched by the lookall command. 
On a typical day there were 29,000 files in our user file system, containing about 152,000,000 bytes. Of these 5,300 files, containing 32,000,000 bytes (about 21%) were English text. The total number of ‘words’ (determined mechanically) was 5,100,000. Of these 227,000 were selected as keys; 19,000 were distinct, hashing to 4,900 (of 5,000 possible) different hash codes. The resulting inverted file indexes used 845,000 bytes, or about 2.6% of the size of the original files. The particularly small indexes are caused by the fact that keys are taken from only the first 50 non-common words of some very long input files. Even this large lookall index can be searched quickly. For example, to find this document by looking for the keys “lesk inverted indexes” required 1.7 seconds of processor time and system time. By comparison, just to search the 800,000 byte dictionary (smaller than even the inverted indexes, let alone the 27,000,000 bytes of text files) with grep takes 29 seconds of processor time. The lookall program is thus useful when looking for a document which you believe is stored on-line, but do not know where. For example, many memos from our center are in the file system, but it is often difficult to guess where a particular memo might be (it might have several authors, each with many directories, and have been worked on by a secretary with yet more directories). Instructions for the use of the lookall command are given in the manual section, shown in the appendix to this memorandum. The only indexes maintained routinely are those of publication lists and all English files. To make other indexes, the programs for making keys, sorting them, searching the indexes, and delivering answers must be used. Since they are usually invoked as parts of higher-level commands, they are not in the default command directory, but are available to any user in the directory /usr/lib/refer. 
Three programs are of interest: mkey, which isolates keys from input files; inv, which makes an index from a set of keys; and hunt, which searches the index and delivers the items. Note that the two parts of the retrieval phase are combined into one program, to avoid the excessive system work and delay which would result from running these as separate processes.

These three commands have a large number of options to adapt to different kinds of input. The user not interested in the detailed description that now follows may skip to section 3, which describes the refer program, a packaged-up version of these tools specifically oriented towards formatting references.

Make Keys. The program mkey is the key-making program corresponding to step (1) in phase A. Normally, it reads its input from the file names given as arguments, and if there are no arguments it reads from the standard input. It assumes that blank lines in the input delimit separate items, for each of which a different line of keys should be generated. The lines of keys are written on the standard output. Keys are any alphanumeric string in the input not among the most frequent words in English and not entirely numeric (except that all-numeric strings are acceptable if they are between 1900 and 1999). In the output, keys are translated to lower case, and truncated to six characters in length; any associated punctuation is removed. The following flag arguments are recognized by mkey:

    -c name    Name of file of common words; default is /usr/lib/eign.
    -f name    Read a list of files from name and take each as an input argument.
    -i chars   Ignore all lines which begin with '%' followed by any character in chars.
    -kn        Use at most n keys per input item.
    -ln        Ignore items shorter than n letters long.
    -nm        Ignore as a key any word in the first m words of the list of common English words. The default is 100.
    -s         Remove the labels (file:start,length) from the output; just give the keys. Used when searching rather than indexing.
    -w         Each whole file is a separate item; blank lines in files are irrelevant.

The normal arguments for indexing references are the defaults, which are -c /usr/lib/eign, -n100, and -l3. For searching, the -s option is also needed. When the big lookall index of all English files is run, the options are -w, -k50, and -f (filelist). When running on textual input, the mkey program processes about 1000 English words per processor second. Unless the -k option is used (and the input files are long enough for it to take effect) the output of mkey is comparable in size to its input.

Hash and invert. The inv program computes the hash codes and writes the inverted files. It reads the output of mkey and writes the set of files described earlier in this section. It expects one argument, which is used as the base name for the three (or four) files to be written. Assuming an argument of Index (the default) the entry file is named Index.ia, the posting file Index.ib, the tag file Index.ic, and the key file (if present) Index.id. The inv program recognizes the following options:

    -a           Append the new keys to a previous set of inverted files, making new files if there is no old set using the same base name.
    -d           Write the optional key file. This is needed when you can not check for false drops by looking for the keys in the original inputs, i.e. when the key derivation procedure is complicated and the output keys are not words from the input files.
    -hn          The hash table size is n (default 997); n should be prime. Making n bigger saves search time and spends disk space.
    -i[u] name   Take input from file name, instead of the standard input; if u is present name is unlinked when the sort is started. Using this option permits the sort scratch space to overlap the disk space used for input keys.
    -n           Make a completely new set of inverted files, ignoring previous files.
    -p           Pipe into the sort program, rather than writing a temporary input file. This saves disk space and spends processor time.
    -v           Verbose mode; print a summary of the number of keys which finished indexing.

About half the time used in inv is in the contained sort. Assuming the sort is roughly linear, however, a guess at the total timing for inv is 250 keys per second. The space used is usually of more importance: the entry file uses four bytes per possible hash (note the -h option), and the tag file around 15-20 bytes per item indexed. Roughly, the posting file contains one item for each key instance and one item for each possible hash code; the items are two bytes long if the tag file is less than 65536 bytes long, and the items are four bytes wide if the tag file is greater than 65536 bytes long. Note that to minimize storage, the hash tables should be over-full; for most of the files indexed in this way, there is no other real choice, since the entry file must fit in memory.

Searching and Retrieving. The hunt program retrieves items from an index. It combines, as mentioned above, the two parts of phase (B): search and delivery. The reason why it is efficient to combine delivery and search is partly to avoid starting unnecessary processes, and partly because the delivery operation must be a part of the search operation in any case. Because of the hashing, the search part takes place in two stages: first items are retrieved which have the right hash codes associated with them, and then the actual items are inspected to determine false drops, i.e. to determine if anything with the right hash codes doesn't really have the right keys. Since the original item is retrieved to check on false drops, it is efficient to present it immediately, rather than only giving the tag as output and later retrieving the item again.
If there were a separate key file, this argument would not apply, but separate key files are not common.

Input to hunt is taken from the standard input, one query per line. Each query should be in mkey -s output format: all lower case, no punctuation. The hunt program takes one argument which specifies the base name of the index files to be searched. Only one set of index files can be searched at a time, although many text files may be indexed as a group, of course. If one of the text files has been changed since the index, that file is searched with fgrep; this may occasionally slow down the searching, and care should be taken to avoid having many out of date files. The following option arguments are recognized by hunt:

    -a           Give all output; ignore checking for false drops.
    -Cn          Coordination level n; retrieve items with not more than n terms of the input missing; default C0, implying that each search term must be in the output items.
    -F[ynd]      "-Fy" gives the text of all the items found; "-Fn" suppresses them. "-Fd" where d is an integer gives the text of the first d items. The default is -Fy.
    -g           Do not use fgrep to search files changed since the index was made; print an error comment instead.
    -i string    Take string as input, instead of reading the standard input.
    -l n         The maximum length of internal lists of candidate items is n; default 1000.
    -o string    Put text output ("-Fy") in string; of use only when invoked from another program.
    -p           Print hash code frequencies; mostly for use in optimizing hash table sizes.
    -T[ynd]      "-Ty" gives the tags of the items found; "-Tn" suppresses them. "-Td" where d is an integer gives the first d tags. The default is -Tn.
    -t string    Put tag output ("-Ty") in string; of use only when invoked from another program.

The timing of hunt is complex.
Normally the hash table is overfull, so that there will be many false drops on any single term; but a multi-term query will have few false drops on all terms. Thus if a query is underspecified (one search term) many potential items will be examined and discarded as false drops, wasting time. If the query is overspecified (a dozen search terms) many keys will be examined only to verify that the single item under consideration has that key posted. The variation of search time with number of keys is shown in the table below. Queries of varying length were constructed to retrieve a particular document from the file of references. In the sequence to the left, search terms were chosen so as to select the desired paper as quickly as possible. In the sequence on the right, terms were chosen inefficiently, so that the query did not uniquely select the desired document until four keys had been used. The same document was the target in each case, and the final set of eight keys are also identical; the differences at five, six and seven keys are produced by measurement error, not by the slightly different key lists.

                  Efficient Keys                                Inefficient Keys
    No.    Total drops   Retrieved   Search time    No.    Total drops   Retrieved   Search time
    keys   (incl. false) Documents   (seconds)      keys   (incl. false) Documents   (seconds)
    1      15            3           1.27           1      68            59          5.96
    2      1             1           0.11           2      29            29          2.72
    3      1             1           0.14           3      8             8           0.95
    4      1             1           0.17           4      1             1           0.18
    5      1             1           0.19           5      1             1           0.21
    6      1             1           0.23           6      1             1           0.22
    7      1             1           0.27           7      1             1           0.26
    8      1             1           0.29           8      1             1           0.29

As would be expected, the optimal search is achieved when the query just specifies the answer; however, overspecification is quite cheap. Roughly, the time required by hunt can be approximated as 30 milliseconds per search key plus 75 milliseconds per dropped document (whether it is a false drop or a real answer). In general, overspecification can be recommended; it protects the user against additions to the data base which turn previously uniquely-answered queries into ambiguous queries.
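The rule of thumb above (30 milliseconds per search key plus 75 milliseconds per dropped document) is easy to write down as a formula. The following Python sketch only restates that approximation; the function name is hypothetical, and the sample rows are copied from the table so that the estimate can be set beside the measured times.

```python
def hunt_time(n_keys, n_drops, per_key=0.030, per_drop=0.075):
    """Approximate hunt search time in seconds, per the rule of thumb."""
    return n_keys * per_key + n_drops * per_drop


# (number of keys, total drops, measured seconds) taken from the table.
measured = [
    (1, 15, 1.27),
    (8, 1, 0.29),
    (1, 68, 5.96),
    (3, 8, 0.95),
]

for keys, drops, seconds in measured:
    estimate = hunt_time(keys, drops)
    print(f"{keys} keys, {drops} drops: measured {seconds:.2f}, estimated {estimate:.2f}")
```

The estimate is only rough, as the text says. What it does show is that the drop count dominates the cost when a query is underspecified, and that each extra key adds only a few hundredths of a second.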
The careful reader will have noted an enormous discrepancy between these times and the earlier quoted time of around 1.9 seconds for a search. The times here are purely for the search and retrieval: they are measured by running many searches through a single invocation of the hunt program alone. The normal retrieval operation involves using the shell to set up a pipeline through mkey to hunt and starting both processes; this adds a fixed overhead of about 1.7 seconds of processor time to any single search. Furthermore, remember that all these times are processor times: on a typical morning on our PDP 11/70 system, with about one dozen people logged on, to obtain 1 second of processor time for the search program took between 2 and 12 seconds of real time, with a median of 3.9 seconds and a mean of 4.8 seconds. Thus, although the work involved in a single search may be only 200 milliseconds, after you add the 1.7 seconds of startup processor time and then assume a 4:1 elapsed/processor time ratio, it will be 8 seconds before any response is printed.

3. Selecting and Formatting References for TROFF

The major application of the retrieval software is refer, which is a troff preprocessor like eqn.³ It scans its input looking for items of the form

    .[
    imprecise citation
    .]

where an imprecise citation is merely a string of words found in the relevant bibliographic citation. This is translated into a properly formatted reference. If the imprecise citation does not correctly identify a single paper (either selecting no papers or too many) a message is given. The data base of citations searched may be tailored to each system, and individual users may specify their own citation files. On our system, the default data base is accumulated from the publication lists of the members of our organization, plus about half a dozen personal bibliographies that were collected. The present total is about 4300 citations, but this increases steadily.
Even now, the data base covers a large fraction of local citations.

For example, the reference for the eqn paper above was specified as

    preprocessor like
    .I eqn.
    .[
    kernighan cherry acm 1975
    .]
    It scans its input looking for items

This paper was itself printed using refer. The above input text was processed by refer as well as tbl and troff by the command

    refer memo-file | tbl | troff -ms

and the reference was automatically translated into a correct citation to the ACM paper on mathematical typesetting.

The procedure to use to place a reference in a paper using refer is as follows. First, use the lookbib command to check that the paper is in the data base and to find out what keys are necessary to retrieve it. This is done by typing lookbib and then typing some potential queries until a suitable query is found. For example, had one started to find the eqn paper shown above by presenting the query

    $ lookbib
    kernighan cherry
    (EOT)

lookbib would have found several items; experimentation would quickly have shown that the query given above is adequate. Overspecifying the query is of course harmless. A particularly careful reader may have noticed that "acm" does not appear in the printed citation; we have supplemented some of the data base items with common extra keywords, such as common abbreviations for journals or other sources, to aid in searching.

If the reference is in the data base, the query that retrieved it can be inserted in the text, between .[ and .] brackets. If it is not in the data base, it can be typed into a private file of references, using the format discussed in the next section, and then the -p option used to search this private file. Such a command might read (if the private references are called myfile)

    refer -p myfile document | tbl | eqn | troff -ms . . .

where tbl and/or eqn could be omitted if not needed. The use of the -ms macros⁴ or some other macro package, however, is essential. Refer only generates the data for the references; exact formatting is done by some macro package, and if none is supplied the references will not be printed.

By default, the references are numbered sequentially, and the -ms macros format references as footnotes at the bottom of the page. This memorandum is an example of that style. Other possibilities are discussed in section 5 below.

³ B. W. Kernighan and L. L. Cherry, "A System for Typesetting Mathematics," Comm. Assoc. Comp. Mach., vol. 18, pp. 151-157, Bell Laboratories, Murray Hill, New Jersey, March 1975.
⁴ M. E. Lesk, Typing Documents on UNIX and GCOS: The -ms Macros for Troff, 1977.

4. Reference Files.

A reference file is a set of bibliographic references usable with refer. It can be indexed using the software described in section 2 for fast searching. What refer does is to read the input document stream, looking for imprecise citation references. It then searches through reference files to find the full citations, and inserts them into the document. The format of the full citation is arranged to make it convenient for a macro package, such as the -ms macros, to format the reference for printing. Since the format of the final reference is determined by the desired style of output, which is determined by the macros used, refer avoids forcing any kind of reference appearance. All it does is define a set of string registers which contain the basic information about the reference, and provide a macro call which is expanded by the macro package to format the reference. It is the responsibility of the final macro package to see that the reference is actually printed; if no macros are used, and the output of refer fed untranslated to troff, nothing at all will be printed.

The strings defined by refer are taken directly from the files of references, which are in the following format. The references should be separated by blank lines. Each reference is a sequence of lines beginning with % and followed by a key-letter. The remainder of that line, and successive lines until the next line beginning with %, contain the information specified by the key-letter. In general, refer does not interpret the information, but merely presents it to the macro package for final formatting. A user with a separate macro package, for example, can add new key-letters or use the existing ones for other purposes without bothering refer.

The meaning of the key-letters given below, in particular, is that assigned by the -ms macros. Not all information, obviously, is used with each citation. For example, if a document is both an internal memorandum and a journal article, the macros ignore the memorandum version and cite only the journal article. Some kinds of information are not used at all in printing the reference; if a user does not like finding references by specifying title or author keywords, and prefers to add specific keywords to the citation, a field is available which is searched but not printed (K).

The key letters currently recognized by refer and -ms, with the kind of information implied, are:

    Key   Information specified                  Key   Information specified
    A     Author's name                          N     Issue number
    B     Title of book containing item          O     Other information
    C     City of publication                    P     Page(s) of article
    D     Date                                   R     Technical report reference
    E     Editor of book containing item         T     Title
    G     Government (NTIS) ordering number      V     Volume number
    I     Issuer (publisher)
    J     Journal name                           X
    K     Keys (for searching)                   or
    L     Label                                  Y     Information not used by refer
    M     Memorandum label                       or
                                                 Z

For example, a sample reference could be typed as:

    %T Bounds on the Complexity of the Maximal
    Common Subsequence Problem
    %Z ctr127
    %A A. V. Aho
    %A D. S. Hirschberg
    %A J. D. Ullman
    %J J. ACM
    %V 23
    %N 1
    %P 1-12
    %M abcd-78
    %D Jan. 1976

Order is irrelevant, except that authors are shown in the order given.
The output of refer is a stream of string definitions, one for each of the fields of each reference, as shown below.

    .]-
    .ds [A authors' names ...
    .ds [T title ...
    .ds [J journal ...
    ...
    .][ type-number

The special macro .]- precedes the string definitions and the special macro .][ follows. These are changed from the input .[ and .] so that running the same file through refer again is harmless. The .]- macro can be used by the macro package to initialize. The .][ macro, which should be used to print the reference, is given an argument type-number to indicate the kind of reference, as follows:

    Value   Kind of reference
    1       Journal article
    2       Book
    3       Article within book
    4       Technical report
    5       Bell Labs technical memorandum
    0       Other

The reference is flagged in the text with the sequence

    \*([.number\*(.]

where number is the footnote number. The strings [. and .] should be used by the macro package to format the reference flag in the text. These strings can be replaced for a particular footnote, as described in section 5. The footnote number (or other signal) is available to the reference macro .][ as the string register [F.

In some cases users wish to suspend the searching, and merely use the reference macro formatting. That is, the user doesn't want to provide a search key between .[ and .] brackets, but merely the reference lines for the appropriate document. Alternatively, the user can wish to add a few fields to those in the reference as in the standard file, or override some fields. Altering or replacing fields, or supplying whole references, is easily done by inserting lines beginning with %; any such line is taken as direct input to the reference processor rather than keys to be searched. Thus

    .[
    key1 key2 key3 ...
    %Q New format item
    %R Override report name
    .]

makes the indicated changes to the result of searching for the keys. All of the search keys must be given before the first % line.

If no search keys are provided, an entire citation can be provided in-line in the text. For example, if the eqn paper citation were to be inserted in this way, rather than by searching for it in the data base, the input would read

    preprocessor like
    .I eqn.
    .[
    %A B. W. Kernighan
    %A L. L. Cherry
    %T A System for Typesetting Mathematics
    %J Comm. ACM
    %V 18
    %N 3
    %P 151-157
    %D March 1975
    .]
    It scans its input looking for items

This would produce a citation of the same appearance as that resulting from the file search.

As shown, fields are normally turned into troff strings. Sometimes users would rather have them defined as macros, so that other troff commands can be placed into the data. When this is necessary, simply double the control character % in the data. Thus the input

    .[
    %V 23
    %%M
    Bell Laboratories,
    Murray Hill, N.J. 07974
    .]

is processed by refer into

    .ds [V 23
    .de [M
    Bell Laboratories,
    Murray Hill, N.J. 07974
    ..

The information after %%M is defined as a macro to be invoked by .[M while the information after %V is turned into a string to be invoked by \*([V. At present -ms expects all information as strings.

5. Collecting References and other Refer Options

Normally, the combination of refer and -ms formats output as troff footnotes which are consecutively numbered and placed at the bottom of the page. However, options exist to place the references at the end; to arrange references alphabetically by senior author; and to indicate references by strings in the text of the form [Name1975a] rather than by number. Whenever references are not placed at the bottom of a page identical references are coalesced.

For example, the -e option to refer specifies that references are to be collected; in this case they are output whenever the sequence

    .[
    $LIST$
    .]

is encountered.
Thus, to place references at the end of a paper, the user would run refer with the -e option and place the above $LIST$ commands after the last line of the text. Refer will then move all the references to that point. To aid in formatting the collected references, refer writes the references preceded by the line

    .]<

and followed by the line

    .]>

to invoke special macros before and after the references.

Another possible option to refer is the -s option to specify sorting of references. The default, of course, is to list references in the order presented. The -s option implies the -e option, and thus requires a

    .[
    $LIST$
    .]

entry to call out the reference list. The -s option may be followed by a string of letters, numbers, and '+' signs indicating how the references are to be sorted. The sort is done using the fields whose key-letters are in the string as sorting keys; the numbers indicate how many of the fields are to be considered, with '+' taken as a large number. Thus the default is -sAD, meaning "Sort on senior author, then date." To sort on all authors and then title, specify -sA+T. And to sort on two authors and then the journal, write -sA2J.

Other options to refer change the signal or label inserted in the text for each reference. Normally these are just sequential numbers, and their exact placement (within brackets, as superscripts, etc.) is determined by the macro package. The -l option replaces reference numbers by strings composed of the senior author's last name, the date, and a disambiguating letter. If a number follows the l, as in -l3, only that many letters of the last name are used in the label string. To abbreviate the date as well, the form -lm,n shortens the last name to the first m letters and the date to the last n digits. For example, the option -l3,2 would refer to the eqn paper (reference 3) by the signal Ker75a, since it is the first cited reference by Kernighan in 1975.
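The label rule can be illustrated with a short Python sketch. It is a guess at the behavior from the description above, not refer's own code: the function name is hypothetical, the year is assumed to be the first four-digit number in the date field, and sequence letters are assumed to run a, b, c, ... in citation order (the sketch does not handle more than 26 clashes).

```python
import re
from collections import defaultdict


def make_labels(citations, m=None, n=None):
    """Build labels from (senior author's last name, date) pairs.

    m limits the name to its first m letters and n limits the year to
    its last n digits, mirroring the -lm,n form described in the text.
    """
    seen = defaultdict(int)
    labels = []
    for last_name, date in citations:
        match = re.search(r"\d{4}", date)         # assumed: first 4-digit number
        year = match.group(0) if match else ""
        name = last_name if m is None else last_name[:m]
        if n is not None:
            year = year[-n:]
        stem = name + year
        letter = chr(ord("a") + seen[stem])       # disambiguating letter
        seen[stem] += 1
        labels.append(stem + letter)
    return labels
```

With m=3 and n=2, the pair ("Kernighan", "March 1975") gives Ker75a, matching the example in the text; a second 1975 Kernighan citation would get Ker75b under the assumptions above.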
A user wishing to specify particular labels for a private bibliography may use the -k option. Specifying -kx causes the field x to be used as a label. The default is L. If this field ends in -, that character is replaced by a sequence letter; otherwise the field is used exactly as given.

If none of the refer-produced signals are desired, the -b option entirely suppresses automatic text signals.

If the user wishes to override the -ms treatment of the reference signal (which is normally to enclose the number in brackets in nroff and make it a superscript in troff) this can be done easily. If the lines .[ or .] contain anything following these characters, the remainders of these lines are used to surround the reference signal, instead of the default. Thus, for example, to say “See reference (2).” and avoid “See reference.²” the input might appear

    See reference
    .[ (
    imprecise citation ...
    .]).

Note that blanks are significant in this construction. If a permanent change is desired in the style of reference signals, however, it is probably easier to redefine the strings [. and .] (which are used to bracket each signal) than to change each citation.

Although normally refer limits itself to retrieving the data for the reference, and leaves to a macro package the job of arranging that data as required by the local format, there are two special options for rearrangements that cannot be done by macro packages. The -c option puts fields into all upper case (CAPS-SMALL CAPS in troff output). The key-letters indicating what information is to be translated to upper case follow the c, so that -cAJ means that authors’ names and journals are to be in caps. The -a option writes the names of authors last name first, that is, A. D. Hall, Jr. is written as Hall, A. D. Jr. The citation form of the Journal of the ACM, for example, would require both -cA and -a options. This produces authors’ names in the style

    KERNIGHAN, B. W. AND CHERRY, L. L.
for the previous example. The -a option may be followed by a number to indicate how many author names should be reversed; -a1 (without any -c option) would produce Kernighan, B. W. and L. L. Cherry, for example.

Finally, there is also the previously-mentioned -p option to let the user specify a private file of references to be searched before the public files. Note that refer does not insist on a previously made index for these files. If a file is named which contains reference data but is not indexed, it will be searched (more slowly) by refer using fgrep. In this way it is easy for users to keep small files of new references, which can later be added to the public data bases.

Updating Publication Lists

M. E. Lesk

1. Introduction.

This note describes several commands to update the publication lists. The data base consisting of these lists is kept in a set of files in the directory /usr/dict/papers on the Version 7 UNIX† system. The reason for having special commands to update these files is that they are indexed, and the only reasonable way to find the items to be updated is to use the index. However, altering the files destroys the usefulness of the index, and makes further editing difficult. So the recommended procedure is to

(1) Prepare additions, deletions, and changes in separate files.

(2) Update the data base and reindex.

Whenever you make changes, etc. it is necessary to run the “add & index” step before logging off; otherwise the changes do not take effect. The next section shows the format of the files in the data base. After that, the procedures for preparing additions, preparing changes, preparing deletions, and updating the public data base are given.

2. Publication Format.

The format of a data base entry is given completely in “Some Applications of Inverted Indexes on UNIX” by M. E. Lesk, the first part of this report, and is summarized here via a few examples.
In each example, first the output format for an item is shown, and then the corresponding data base entry.

Journal article:

A. V. Aho, D. S. Hirschberg, and J. D. Ullman, “Bounds on the Complexity of the Maximal Common Subsequence Problem,” J. Assoc. Comp. Mach., vol. 23, no. 1, pp. 1-12 (Jan. 1976).

    %T Bounds on the Complexity of the Maximal
    Common Subsequence Problem
    %A A. V. Aho
    %A D. S. Hirschberg
    %A J. D. Ullman
    %J J. Assoc. Comp. Mach.
    %V 23
    %N 1
    %P 1-12
    %D Jan. 1976
    %M Memo abcd...

† UNIX is a trademark of Bell Laboratories.

Conference proceedings:

B. Prabhala and R. Sethi, “Efficient Computation of Expressions with Common Subexpressions,” Proc. 5th ACM Symp. on Principles of Programming Languages, pp. 222-230, Tucson, Ariz. (January 1978).

    %A B. Prabhala
    %A R. Sethi
    %T Efficient Computation of Expressions with Common Subexpressions
    %J Proc. 5th ACM Symp. on Principles of Programming Languages
    %C Tucson, Ariz.
    %D January 1978
    %P 222-230

Book:

B. W. Kernighan and P. J. Plauger, Software Tools, Addison-Wesley, Reading, Mass. (1976).

    %T Software Tools
    %A B. W. Kernighan
    %A P. J. Plauger
    %I Addison-Wesley
    %C Reading, Mass.
    %D 1976

Article within book:

J. W. de Bakker, “Semantics of Programming Languages,” pp. 173-227 in Advances in Information Systems Science, Vol. 2, ed. J. T. Tou, Plenum Press, New York, N. Y. (1969).

    %A J. W. de Bakker
    %T Semantics of programming languages
    %E J. T. Tou
    %B Advances in Information Systems Science, Vol. 2
    %I Plenum Press
    %C New York, N. Y.
    %D 1969
    %P 173-227

Technical Report:

F. E. Allen, “Bibliography on Program Optimization,” Report RC5767, IBM T. J. Watson Research Center, Yorktown Heights, N. Y. (1975).

    %A F. E. Allen
    %D 1975
    %T Bibliography on Program Optimization
    %R Report RC-5767
    %I IBM T. J. Watson Research Center
    %C Yorktown Heights, N. Y.

Other forms of publication can be entered similarly.
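For readers who want to process such entries with their own tools, the entry layout shown in the examples can be read mechanically. The following Python sketch is an assumption-laden illustration, not part of the publication-list commands: the function name parse_entries is hypothetical, entries are taken to be separated by blank lines, and a line that does not begin with % or . is treated as a continuation of the line before it. It keeps every value of a repeated field and leaves any choice among them to the caller.

```python
def parse_entries(text):
    """Split a refer-style data base into entries.

    Each entry becomes a dict mapping a key-letter (such as "A" or "T")
    to the list of values given for that letter, in order of appearance.
    """
    entries, current, last_key = [], {}, None
    for line in text.splitlines():
        if not line.strip():                  # a blank line ends the entry
            if current:
                entries.append(current)
            current, last_key = {}, None
        elif line.startswith("%") and len(line) > 1:
            last_key = line[1]                # the key-letter follows the %
            current.setdefault(last_key, []).append(line[2:].strip())
        elif line.startswith("."):            # formatter command: skipped here
            continue
        elif last_key is not None:            # continuation of the previous line
            current[last_key][-1] += " " + line.strip()
    if current:
        entries.append(current)
    return entries

sample = """%T Bounds on the Complexity of the Maximal
Common Subsequence Problem
%A A. V. Aho
%A D. S. Hirschberg
%A J. D. Ullman
%J J. Assoc. Comp. Mach.
%D Jan. 1976

%T Software Tools
%A B. W. Kernighan
%A P. J. Plauger
%I Addison-Wesley
%D 1976
"""

for entry in parse_entries(sample):
    print(entry["T"][0], "/", ", ".join(entry["A"]))
```

Keeping the authors as an ordered list reflects the one place where the order of lines in an entry matters.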
Note that conference proceedings are entered as if journals, with the conference name on a %J line. This is also sometimes appropriate for obscure publications such as series of lecture notes. When something is both a report and an article, or both a memorandum and an article, enter all necessary information for both; see the first article above, for example. Extra information (such as “In preparation” or “Japanese translation”) should be placed on a line beginning %O. The most common use of %O lines now is for “Also in ...” to give an additional reference to a secondary appearance of the same paper.

Some of the possible fields of a citation are:

    Letter  Meaning                  Letter  Meaning
    A       Author                   K       Extra keys
    B       Book including item      N       Issue number
    C       City of publication      O       Other
    D       Date                     P       Page numbers
    E       Editor of book           R       Report number
    I       Publisher (issuer)       T       Title of item
    J       Journal name             V       Volume number

Note that %B is used to indicate the title of a book containing the article being entered; when an item is an entire book, the title should be entered with a %T as usual.

Normally, the order of items does not matter. The only exception is that if there are multiple authors (%A lines) the order of authors should be that on the paper. If a line is too long, it may be continued on to the next line; any line not beginning with % or . (dot) is assumed to be a continuation of the previous line. Again, see the first article above for an example of a long title. Except for authors, do not repeat any items; if two %J lines are given, for example, the first is ignored. Multiple items on the same file should be separated by blank lines.

Note that in formatted printouts of the file, the exact appearance of the items is determined by a set of macros and the formatting programs. Do not try to adjust fonts, punctuation, etc. by editing the data base; it is wasted effort. In case someone has a real need for a differently-formatted output, a new set of macros can easily be generated to provide alternative appearances of the citations.

3. Updating and Re-indexing.

This section describes the commands that are used to manipulate and change the data base. It explains the procedures for (a) finding references in the data base, (b) adding new references, (c) changing existing references, and (d) deleting references. Remember that all changes, additions, and deletions are done by preparing separate files and then running an ‘update and reindex’ step.

Checking what’s there now. Often you will want to know what is currently in the data base. There is a special command lookbib to look for things and print them out. It searches for articles based on words in the title, or the author’s name, or the date. For example, you could find the first paper above with

    lookbib aho ullman maximal subsequence 1976

or

    lookbib aho ullman hirschberg

If you don’t give enough words, several items will be found; if you spell some wrong, nothing will be found. There are around 4300 papers in the public file; you should always use this command to check when you are not sure whether a certain paper is there or not.

Additions. To add new papers, just type in, on one or more files, the citations for the new papers. Remember to check first if the papers are already in the data base. For example, if a paper has a previous memo version, this should be treated as a change to an existing entry, rather than a new entry. If several new papers are being typed on the same file, be sure that there is a blank line between each two papers.

Changes. To change an item, it should be extracted onto a file. This is done with the command

    pub.chg key1 key2 key3 ...

where the items key1, key2, key3, etc. are a set of keys that will find the paper, as in the lookbib command.
That is, if

    lookbib johnson yacc cstr

will find an item (in this case, Computing Science Technical Report No. 32, “YACC: Yet Another Compiler-Compiler,” by S. C. Johnson) then

    pub.chg johnson yacc cstr

will permit you to edit the item. The pub.chg command extracts the item onto a file named “bibxxx” where “xxx” is a 3-digit number, e.g. “bib234”. The command will print the file name it has chosen. If the set of keys finds more than one paper (or no papers) an error message is printed and no file is written. Each reference to be changed must be extracted with a separate pub.chg command, and each will be placed on a separate file. You should then edit the “bibxxx” file as desired to change the item, using the UNIX editor. Do not delete or change the first line of the file, however, which begins %# and is a special code line to tell the update program which item is being altered. You may delete or change other lines, or add lines, as you wish. The changes are not actually made in the public data base until you run the update command pub.run (see below). Thus, if after extracting an item and modifying it, you decide that you’d rather leave things as they were, delete the “bibxxx” file, and your change request will disappear.

Deletions. To delete an entry from the data base, type the command

    pub.del key1 key2 key3 ...

where the items key1, key2, etc. are a set of keys that will find the paper, as with the lookbib command. That is, if

    lookbib Aho hirschberg ullman

will find a paper,

    pub.del aho hirschberg ullman

deletes it. Note that upper and lower case are equivalent in keys. The pub.del command will print the entry being deleted. It also gives the name of a “bibxxx” file on which the deletion command is stored. The actual deletion is not done until the changes, additions, etc. are processed, as with the pub.chg command.
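The requirement that a set of keys identify exactly one paper can be pictured with a small Python sketch. It is a deliberately simplified stand-in and makes no claim about how lookbib, pub.chg, or pub.del are implemented: the real commands consult an inverted index, whereas this version scans plain strings for whole words. The names words, matching, and find_unique are hypothetical.

```python
import re

def words(text):
    """Lower-cased words of a string; case is ignored, as it is in keys."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matching(entries, keys):
    """All entries (plain strings) that contain every one of the keys."""
    wanted = {k.lower() for k in keys}
    return [e for e in entries if wanted <= words(e)]

def find_unique(entries, keys):
    """Return the single matching entry; complain if there is none or several."""
    hits = matching(entries, keys)
    if len(hits) != 1:
        raise LookupError(f"{len(hits)} items match: {' '.join(keys)}")
    return hits[0]

entries = [
    "%A A. V. Aho\n%A D. S. Hirschberg\n%A J. D. Ullman\n%D Jan. 1976",
    "%T Software Tools\n%A B. W. Kernighan\n%A P. J. Plauger\n%D 1976",
]
print(len(matching(entries, ["1976"])))                        # too few keys: 2 items
print(find_unique(entries, ["Aho", "hirschberg", "ullman"])[:12])
```

Giving too few keys returns several items and a misspelled key returns none; in both cases find_unique refuses to pick an entry, which mirrors the error behavior described for pub.chg and pub.del.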
If, after seeing the item to be deleted, you change your mind about throwing it away, delete the “bibxxx” file and the delete request disappears. Again, if the list of keys does not uniquely identify one paper, an error message is given.

Remember that the default versions of the commands described here edit a public data base. Do not delete items unless you are sure deletion is proper; usually this means that there are duplicate entries for the same paper. Otherwise, view requests for deletion with skepticism; even if one person has no need for a particular item in the data base, someone else may want it there.

If an item is correct, but should not appear in the “List of Publications” as normally produced, add the line

    %K DNL

to the item. This preserves the item intact, but implies “Do Not List” to the commands that print publication lists. The DNL line is normally used for some technical reports, minor memoranda, or other low-grade publications.

Update and reindex. When you have completed a session of changes, you should type the command

    pub.run file1 file2 ...

where the names “file1”, ... are the new files of additions you have prepared. You need not list the “bibxxx” files representing changes and deletions; they are processed automatically. All of the new items are edited into the standard public data base, and then a new index is made. This process takes about 15 minutes; during this time, searches of the data base will be slower.

Normally, you should execute pub.run just before you log off after performing some edit requests. However, if you don’t, the various change request files remain in your directory until you finally do execute pub.run. When the changes are processed, the “bibxxx” files are deleted. It is not desirable to wait too long before processing changes, however, to avoid conflicts with someone else who wishes to change the same file.
If executing pub.run produces the message “File bibxxx too old” it means that someone else has been editing the same file between the time you prepared your changes, and the time you typed pub.run. You must delete such old change files and re-enter them.

Note that although pub.run discards the “bibxxx” files after processing them, your files of additions are left around even after pub.run is finished. If they were typed in only for purposes of updating the data base, you may delete them after they have been processed by pub.run.

Example. Suppose, for example, that you wish to

(1) Add to the data base the memos “The Dilogarithm Function of a Real Argument” by R. Morris, and “UNIX Software Distribution by Communication Link,” by M. E. Lesk and A. S. Cohen;

(2) Delete from the data base the item “Cheap Typesetters”, by M. E. Lesk, SIGLASH Newsletter, 1973; and

(3) Change “J. Assoc. Comp. Mach.” to “Jour. ACM” in the citation for Aho, Hirschberg, and Ullman shown above.

The procedure would be as follows. First, you would make a file containing the additions, here called “new.1”, in the normal way using the UNIX editor. In the script shown below, the computer prompts are in italics.

    $ ed new.1
    ?
    a
    %T The Dilogarithm Function of a Real Argument
    %A Robert Morris
    %M abcd
    %D 1978

    %T UNIX Software Distribution by Communication Link
    %A M. E. Lesk
    %A A. S. Cohen
    %M abced
    %D 1978
    .
    w new.1
    199
    q

Next you would specify the deletion:

    $ pub.del lesk cheap typesetters siglash

to which the computer responds:

    Will delete: (file bib176)
    %T Cheap Typesetters
    %A M. E. Lesk
    %J ACM SIGLASH Newsletter
    %V 6
    %N 4
    %P 14-16
    %D October 1973

And then you would extract the Aho, Hirschberg and Ullman paper. The dialogue involved is shown below.
First run pub.chg to extract the paper; it responds by printing the citation and informing you that it was placed on file bib123. That file is then edited.

    $ pub.chg aho hirschberg ullman
    Extracting as file bib123
    %T Bounds on the Complexity of the Maximal Common Subsequence Problem
    %A A. V. Aho
    %A D. S. Hirschberg
    %A J. D. Ullman
    %J J. Assoc. Comp. Mach.
    %V 23
    %N 1
    %P 1-12
    %M abcd
    %D Jan. 1976
    $ ed bib123
    312
    /Assoc/s/ J/ Jour/p
    %J Jour. Assoc. Comp. Mach.
    s/Assoc.*/ACM/p
    %J Jour. ACM
    1,$p
    %# /usr/dict/papers/p76 233 245 change
    %T Bounds on the Complexity of the Maximal Common Subsequence Problem
    %A A. V. Aho
    %A D. S. Hirschberg
    %A J. D. Ullman
    %J Jour. ACM
    %V 23
    %N 1
    %P 1-12
    %M abcd
    %D Jan. 1976
    w
    292
    q
    $

Finally, execute pub.run, making sure to remember that you have prepared a new file “new.1”:

    $ pub.run new.1

and about fifteen minutes later the new index would be complete and all the changes would be included.

4. Printing a Publication List

There are two commands for printing a publication list, depending on whether you want to print one person’s list, or the list of many people. To print a list for one person, use the pub.indiv command:

    pub.indiv M Lesk

This runs off the list for M. Lesk and puts it in file “output”. Note that no ‘.’ is given after the initial. In case of ambiguity two initials can be used. Similarly, to get the list for a group of people, say

    pub.org xxx

which prints all the publications of the members of organization xxx, taking the names for the list in the file /usr/dict/papers/centlist/xxx. This command should normally be run in the background; it takes perhaps 15 minutes. Two options are available with these commands:

    pub.indiv -p M Lesk

prints only the papers, leaving out unpublished notes, patents, etc. Also

    pub.indiv -t M Lesk | gcat

prints a typeset copy, instead of a computer printer copy.
In this case it has been directed to an alternate typesetter with the ‘gcat’ command. These options may be used together, and may be used with the pub.org command as well. For example, to print only the papers for all of organization zzz and typeset them, you could type

    pub.center -t -p zzz | gcat &

These publication lists are printed double column with a citation style taken from a set of publication list macros; the macros, of course, can be changed easily to adjust the format of the lists.

Writing Tools - The STYLE and DICTION Programs

L. L. Cherry
Bell Laboratories
Murray Hill, New Jersey 07974

W. Vesterman
Livingston College
Rutgers University

1. Introduction

Computers have become important in the document preparation process, with programs to check for spelling errors and to format documents. As the amount of text stored on line increases, it becomes feasible and attractive to study writing style and to attempt to help the writer in producing readable documents. The system of writing tools described here is a first step toward such help. The system includes programs and a data base to analyze writing style at the word and sentence level. We use the term “style” in this paper to describe the results of a writer’s particular choices among individual words and sentence forms. Although many judgements of style are subjective, particularly those of word choice, there are some objective measures that experts agree lead to good style. Three programs have been written to measure some of the objectively definable characteristics of writing style and to identify some commonly misused or unnecessary phrases. Although a document that conforms to the stylistic rules is not guaranteed to be coherent and readable, one that violates all of the rules is likely to be difficult or tedious to read.
The program STYLE calculates readability, sentence length variability, sentence type, word usage and sentence openers at a rate of about 400 words per second on a PDP11/70 running the UNIX† Operating System. It assumes that the sentences are well-formed, i.e. that each sentence has a verb and that the subject and verb agree in number. DICTION identifies phrases that are either bad usage or unnecessarily wordy. EXPLAIN acts as a thesaurus for the phrases found by DICTION. Sections 2, 3, and 4 describe the programs; Section 5 gives the results on a cross-section of technical documents; Section 6 discusses accuracy and problems; Section 7 gives implementation details.

† UNIX is a trademark of Bell Laboratories.

2. STYLE

The program STYLE reads a document and prints a summary of readability indices, sentence length and type, word usage, and sentence openers. It may also be used to locate all sentences in a document longer than a given length, of readability index higher than a given number, those containing a passive verb, or those beginning with an expletive. STYLE is based on the system for finding English word classes or parts of speech, PARTS [1]. PARTS is a set of programs that uses a small dictionary (about 350 words) and suffix rules to partially assign word classes to English text. It then uses experimentally derived rules of word order to assign word classes to all words in the text with an accuracy of about 95%. Because PARTS uses only a small dictionary and general rules, it works on text about any subject, from physics to psychology. Style measures have been built into the output phase of the programs that make up PARTS. Some of the measures are simple counters of the word classes found by PARTS; many are more complicated. For example, the verb count is the total number of verb phrases. This includes phrases like:

    has been going
    was only going
    to go

each of which counts as one verb.
Figure 1 shows the output of STYLE run on a paper by Kernighan and Mashey about the UNIX programming environment [2].

    programming environment
    readability grades:
            (Kincaid) 12.3  (auto) 12.8  (Coleman-Liau) 11.8  (Flesch) 13.5 (46.3)
    sentence info:
            no. sent 335 no. wds 7419
            av sent leng 22.1 av word leng 4.91
            no. questions 0 no. imperatives 0
            no. nonfunc wds 4362  58.8%  av leng 6.38
            short sent (<17) 35% (118) long sent (>32) 16% (55)
            longest sent 82 wds at sent 174; shortest sent 1 wds at sent 117
    sentence types:
            simple 34% (114) complex 32% (108)
            compound 12% (41) compound-complex 21% (72)
    word usage:
            verb types as % of total verbs
            tobe 45% (373) aux 16% (133) inf 14% (114)
            passives as % of non-inf verbs 20% (144)
            types as % of total
            prep 10.8% (804) conj 3.5% (262) adv 4.8% (354)
            noun 26.7% (1983) adj 18.7% (1388) pron 5.3% (393)
            nominalizations 2% (155)
    sentence beginnings:
            subject opener: noun (63) pron (43) pos (0) adj (58) art (62) tot 67%
            prep 12% (39) adv 9% (31)
            verb 0% (1) sub conj 6% (20) conj 1% (5)
            expletives 4% (13)

    Figure 1

As the example shows, STYLE output is in five parts. After a brief discussion of sentences, we will describe the parts in order.

2.1. What is a sentence?

Readers of documents have little trouble deciding where the sentences end. People don’t even have to stop and think about uses of the character “.” in constructions like 1.25, A. J. Jones, Ph.D., i.e., or etc. When a computer reads a document, finding the end of sentences is not as easy. First we must throw away the printer’s marks and formatting commands that litter the text in computer form. Then STYLE defines a sentence as a string of words ending in one of:

    . ! ? /.

The end marker “/.” may be used to indicate an imperative sentence. Imperative sentences that are not so marked are not identified as imperative. STYLE properly handles numbers with embedded decimal points and commas, strings of letters and numbers with embedded decimal points used for computer file names, and the common abbreviations listed in Appendix 1. Numbers that end sentences, like the preceding sentence, cause a sentence break if the next word begins with a capital letter. Initials only cause a sentence break if the next word begins with a capital and is found in the dictionary of function words used by PARTS. So the string

    J. D. JONES

does not cause a break, but the string

    ... system H. The ...

does. With these rules most sentences are broken at the proper place, although occasionally either two sentences are called one or a fragment is called a sentence. More on this later.

2.2. Readability Grades

The first section of STYLE output consists of four readability indices. As Klare points out in [3], readability indices may be used to estimate the reading skills needed by the reader to understand a document. The readability indices reported by STYLE are based on measures of sentence and word lengths. Although the indices may not measure whether the document is coherent and well organized, experience has shown that high indices seem to be indicators of stylistic difficulty. Documents with short sentences and short words have low scores; those with long sentences and many polysyllabic words have high scores. The 4 formulae reported are Kincaid Formula [4], Automated Readability Index [5], Coleman-Liau Formula [6] and a normalized version of Flesch Reading Ease Score [7]. The formulae differ because they were experimentally derived using different texts and subject groups. We will discuss each of the formulae briefly; for a more detailed discussion the reader should see [3].
The Kincaid Formula, given by:

    Reading Grade = 11.8*syl per wd + .39*wds per sent - 15.59

was based on Navy training manuals that ranged in difficulty from 5.5 to 16.3 in reading grade level. The score reported by this formula tends to be in the mid-range of the 4 scores. Because it is based on adult training manuals rather than school book text, this formula is probably the best one to apply to technical documents.

The Automated Readability Index (ARI), based on text from grades 0 to 7, was derived to be easy to automate. The formula is:

    Reading Grade = 4.71*let per wd + .5*wds per sent - 21.43

ARI tends to produce scores that are higher than Kincaid and Coleman-Liau but are usually slightly lower than Flesch.

The Coleman-Liau Formula, based on text ranging in difficulty from .4 to 16.3, is:

    Reading Grade = 5.89*let per wd - .3*sent per 100 wds - 15.8

Of the four formulae this one usually gives the lowest grade when applied to technical documents.

The last formula, the Flesch Reading Ease Score, is based on grade school text covering grades 3 to 12. The formula, given by:

    Reading Score = 206.835 - 84.6*syl per wd - 1.015*wds per sent

is usually reported in the range 0 (very difficult) to 100 (very easy). The score reported by STYLE is scaled to be comparable to the other formulas, except that the maximum grade level reported is set to 17. The Flesch score is usually the highest of the 4 scores on technical documents.

Coke [8] found that the Kincaid Formula is probably the best predictor for technical documents; both ARI and Flesch tend to overestimate the difficulty; Coleman-Liau tends to underestimate. On text in the range of grades 7 to 9 the four formulas tend to be about the same. On easy text the Coleman-Liau formula is probably preferred since it is reasonably accurate at the lower grades and it is safer to present text that is a little too easy than a little too hard.
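As a check on the arithmetic, the four formulas can be written out in Python and applied to the counts printed in Figure 1. This is a sketch of the formulas as given in the text, not STYLE's own code; the function names are hypothetical. Figure 1 reports no syllable count, so the value 1.632 syllables per word used below is an assumption, back-solved from the raw Flesch score of 46.3.

```python
def kincaid(syl_per_wd, wds_per_sent):
    return 11.8 * syl_per_wd + 0.39 * wds_per_sent - 15.59

def ari(let_per_wd, wds_per_sent):
    return 4.71 * let_per_wd + 0.5 * wds_per_sent - 21.43

def coleman_liau(let_per_wd, sent_per_100_wds):
    return 5.89 * let_per_wd - 0.3 * sent_per_100_wds - 15.8

def flesch_score(syl_per_wd, wds_per_sent):
    # Raw Reading Ease Score (0 to 100), before STYLE rescales it to a grade.
    return 206.835 - 84.6 * syl_per_wd - 1.015 * wds_per_sent

# Counts taken from Figure 1.
n_words, n_sentences, av_word_leng = 7419, 335, 4.91
wds_per_sent = n_words / n_sentences
sent_per_100_wds = 100 * n_sentences / n_words

print(round(ari(av_word_leng, wds_per_sent), 1))               # 12.8, as in Figure 1
print(round(coleman_liau(av_word_leng, sent_per_100_wds), 1))  # 11.8, as in Figure 1

SYL_PER_WD = 1.632  # assumed value, not reported in Figure 1
print(round(kincaid(SYL_PER_WD, wds_per_sent), 1))             # 12.3
print(round(flesch_score(SYL_PER_WD, wds_per_sent), 1))        # 46.3
```

That a single assumed syllable rate reproduces both the Kincaid grade and the raw Flesch score suggests the formulas are transcribed consistently. The rescaling of the Flesch score to a grade level (13.5 in Figure 1) is not specified in the text and is not attempted here.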
If a document has particularly difficult technical content, especially if it includes a lot of mathematics, it is probably best to make the text very easy to read, i.e. to lower the readability index by shortening the sentences and words. This will allow the reader to concentrate on the technical content and not the long sentences. The user should remember that these indices are estimators; they should not be taken as absolute numbers. STYLE called with “-r number” will print all sentences with an Automated Readability Index equal to or greater than “number”.

2.3. Sentence length and structure

The next two sections of STYLE output deal with sentence length and structure. Almost all books on writing style or effective writing emphasize the importance of variety in sentence length and structure for good writing. Ewing’s first rule in discussing style in the book Writing for Results [9] is:

“Vary the sentence structure and length of your sentences.”

Leggett, Mead and Charvat break this rule into 3 in Prentice-Hall Handbook for Writers [10] as follows:

“34a. Avoid the overuse of short simple sentences.”
“34b. Avoid the overuse of long compound sentences.”
“34c. Use various sentence structures to avoid monotony and increase effectiveness.”

Although experts agree that these rules are important, not all writers follow them. Sample technical documents have been found with almost no sentence length or type variability. One document had 90% of its sentences about the same length as the average; another was made up almost entirely of simple sentences (80%).

The output sections labeled “sentence info” and “sentence types” give both length and structure measures. STYLE reports on the number and average length of both sentences and words, and number of questions and imperative sentences (those ending in “/.”). The measures of non-function words are an attempt to look at the content words in the document.
In English non-function words are nouns, adjectives, adverbs, and non-auxiliary verbs; function words are prepositions, conjunctions, articles, and auxiliary verbs. Since most function words are short, they tend to lower the average word length. The average length of non-function words may be a more useful measure for comparing word choice of different writers than the total average word length. The percentages of short and long sentences measure sentence length variability. Short sentences are those at least 5 words less than the average; long sentences are those at least 10 words longer than the average. Last in the sentence information section is the length and location of the longest and shortest sentences. If the flag “-l number” is used, STYLE will print all sentences longer than “number”.

Because of the difficulties in dealing with the many uses of commas and conjunctions in English, sentence type definitions vary slightly from those of standard textbooks, but still measure the same constructional activity.

1. A simple sentence has one verb and no dependent clause.

2. A complex sentence has one independent clause and one dependent clause, each with one verb. Complex sentences are found by identifying sentences that contain either a subordinate conjunction or a clause beginning with words like “that” or “who”. The preceding sentence has such a clause.

3. A compound sentence has more than one verb and no dependent clause. Sentences joined by “;” are also counted as compound.

4. A compound-complex sentence has either several dependent clauses or one dependent clause and a compound verb in either the dependent or independent clause.

Even using these broader definitions, simple sentences dominate many of the technical documents that have been tested, but the example in Figure 1 shows variety in both sentence structure and sentence length.

2.4. Word Usage

The word usage measures are an attempt to identify some other constructional features of writing style. There are many different ways in English to say the same thing. The constructions differ from one another in the form of the words used. The following sentences all convey approximately the same meaning but differ in word usage:

    The cxio program is used to perform all communication between the systems.
    The cxio program performs all communications between the systems.
    The cxio program is used to communicate between the systems.
    The cxio program communicates between the systems.
    All communication between the systems is performed by the cxio program.

The distribution of the parts of speech and verb constructions helps identify overuse of particular constructions. Although the measures used by STYLE are crude, they do point out problem areas. For each category, STYLE reports a percentage and a raw count. In addition to looking at the percentage, the user may find it useful to compare the raw count with the number of sentences. If, for example, the number of infinitives is almost equal to the number of sentences, then many of the sentences in the document are constructed like the first and third in the preceding example. The user may want to transform some of these sentences into another form. Some of the implications of the word usage measures are discussed below.

Verbs are measured in several different ways to try to determine what types of verb constructions are most frequent in the document. Technical writing tends to contain many passive verb constructions and other usage of the verb “to be”. The category of verbs labeled “tobe” measures both passives and sentences of the form:

    subject tobe predicate

In counting verbs, whole verb phrases are counted as one verb. Verb phrases containing auxiliary verbs are counted in the category “aux”. The verb phrases counted here are those whose tense is not simple present or simple past.
It might eventually be useful to do more detailed measures of verb tense or mood. Infinitives are listed as “inf”. The percentages reported for these three categories are based on the total number of verb phrases found. These categories are not mutually exclusive; they cannot be added, since, for example, “to be going” counts as both “tobe” and “inf”. Use of these three types of verb constructions varies significantly among authors.

STYLE reports passive verbs as a percentage of the finite verbs in the document. Most style books warn against the overuse of passive verbs. Coleman [11] has shown that sentences with active verbs are easier to learn than those with passive verbs. Although the inverted object-subject order of the passive voice seems to emphasize the object, Coleman’s experiments showed that there is little difference in retention by word position. He also showed that the direct object of an active verb is retained better than the subject of a passive verb. These experiments support the advice of the style books suggesting that writers should try to use active verbs wherever possible. The flag “-p” causes STYLE to print all sentences containing passive verbs.

Pronouns add cohesiveness and connectivity to a document by providing back-reference. They are often a short-hand notation for something previously mentioned, and therefore connect the sentence containing the pronoun with the word to which the pronoun refers. Although there are other mechanisms for such connections, documents with no pronouns tend to be wordy and to have little connectivity.

Adverbs can provide transition between sentences and order in time and space. In performing these functions, adverbs, like pronouns, provide connectivity and cohesiveness.

Conjunctions provide parallelism in a document by connecting two or more equal units. These units may be whole sentences, verb phrases, nouns, adjectives, or prepositional phrases.
The compound and compound-complex sentences reported under sentence type are parallel structures. Other uses of parallel structures are indicated by the degree that the number of conjunctions reported under word usage exceeds the compound sentence measures.

Nouns and Adjectives. A ratio of nouns to adjectives near unity may indicate the over-use of modifiers. Some technical writers qualify every noun with one or more adjectives. Qualifiers in phrases like “simple linear single-link network model” often lend more obscurity than precision to a text.

Nominalizations are verbs that are changed to nouns by adding one of the suffixes “ment”, “ance”, “ence”, or “ion”. Examples are accomplishment, admittance, adherence, and abbreviation. When a writer transforms a nominalized sentence to a non-nominalized sentence, she/he increases the effectiveness of the sentence in several ways. The noun becomes an active verb and frequently one complicated clause becomes two shorter clauses. For example,

    Their inclusion of this provision is admission of the importance of the system.

    When they included this provision, they admitted the importance of the system.

Coleman found that the transformed sentences were easier to learn, even when the transformation produced sentences that were slightly longer, provided the transformation broke one clause into two. Writers who find their document contains many nominalizations may want to transform some of the sentences to use active verbs.

2.5. Sentence openers

Another agreed upon principle of style is variety in sentence openers. Because STYLE determines the type of sentence opener by looking at the part of speech of the first word in the sentence, the sentences counted under the heading “subject opener” may not all really begin with the subject. However, a large percentage of sentences in this category still indicates lack of variety in sentence openers.
Other sentence opener measures help the user determine if there are transitions between sentences and where the subordination occurs. Adverbs and conjunctions at the beginning of sentences are mechanisms for transition between sentences. A pronoun at the beginning shows a link to something previously mentioned and indicates connectivity.

The location of subordination can be determined by comparing the number of sentences that begin with a subordinator with the number of sentences with complex clauses. If few sentences start with subordinate conjunctions then the subordination is embedded or at the end of the complex sentences. For variety the writer may want to transform some sentences to have leading subordination.

The last category of openers, expletives, is commonly overworked in technical writing. Expletives are the words “it” and “there”, usually with the verb “to be”, in constructions where the subject follows the verb. For example,

    There are three streets used by the traffic.
    There are too many users on this system.

This construction tends to emphasize the object rather than the subject of the sentence. The flag “-e” will cause STYLE to print all sentences that begin with an expletive.

3. DICTION

The program DICTION prints all sentences in a document containing phrases that are either frequently misused or indicate wordiness. The program, an extension of Aho’s FGREP [12] string matching program, takes as input a file of phrases or patterns to be matched and a file of text to be searched. A data base of about 450 phrases has been compiled as a default pattern file for DICTION. Before attempting to locate phrases, the program maps upper case letters to lower case and substitutes blanks for punctuation. Sentence boundaries were deemed less critical in DICTION than in STYLE, so abbreviations and other uses of the character “.” are not treated specially.
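The preprocessing just described (map to lower case, replace punctuation with blanks, then look for each phrase) can be illustrated with a small Python sketch. This is not the FGREP-based implementation; the function names and the three sample patterns are assumptions chosen only for illustration, and plain substring search stands in for the real string matcher.

```python
import string

# Hypothetical stand-in for a few entries of a pattern file (illustrative only).
PATTERNS = ["a large number of", "utilize", "with the exception of"]

# Translation table: every ASCII punctuation character becomes a blank.
_PUNCT_TO_BLANK = str.maketrans(string.punctuation, " " * len(string.punctuation))


def normalize(sentence):
    """Lower-case the sentence and replace punctuation with blanks."""
    return sentence.lower().translate(_PUNCT_TO_BLANK)


def matched_phrases(sentence, patterns=PATTERNS):
    """Return the patterns that occur in the normalized sentence."""
    text = normalize(sentence)
    return [p for p in patterns if p in text]


def flag_sentences(sentences, patterns=PATTERNS):
    """Yield (sentence, matches) for every sentence with at least one match."""
    for s in sentences:
        hits = matched_phrases(s, patterns)
        if hits:
            yield s, hits
```

Because the sketch uses plain substring search, a pattern such as "utilize" is also found inside longer words such as "utilized"; it only shows the order of the steps, not the matching algorithm itself.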
DICTION brackets all pattern matches in a sentence with the characters “[” “]”. Although many of the phrases in the default data base are correct in some contexts, in others they indicate wordiness. Some examples of the phrases and suggested alternatives are:

    Phrase                     Alternative
    a large number of          many
    arrive at a decision       decide
    collect together           collect
    for this reason            so
    pertaining to              about
    through the use of         by or with
    utilize                    use
    with the exception of      except

Appendix 2 contains a complete list of the default file. Some of the entries are short forms of problem phrases. For example, the phrase “the fact” is found in all of the following and is sufficient to point out the wordiness to the user:

    Phrase                                  Alternative
    accounted for by the fact that          caused by
    an example of this is the fact that     thus
    based on the fact that                  because
    despite the fact that                   although
    due to the fact that                    because
    in light of the fact that               because
    in view of the fact that                since
    notwithstanding the fact that           although

Entries in Appendix 2 preceded by “~” are not matched. See Section 7 for details on the use of “~”.

The user may supply her/his own pattern file with the flag “-f patfile”. In this case the default file will be loaded first, followed by the user file. This mechanism allows users to suppress patterns contained in the default file or to include their own pet peeves that are not in the default file. The flag “-n” will exclude the default file altogether.

In constructing a pattern file, blanks should be used before and after each phrase to avoid matching substrings in words. For example, to find all occurrences of the word “the”, the pattern “ the ” should be used. The blanks cause only the word “the” to be matched and not the string “the” in words like there, other, and therefore. One side effect of surrounding the words with blanks is that when two phrases occur without intervening words, only the first will be matched.

4.
EXPLAIN

The last program, EXPLAIN, is an interactive thesaurus for phrases found by DICTION. The user types one of the phrases bracketed by DICTION and EXPLAIN responds with suggested substitutions for the phrase that will improve the diction of the document.

Table 1
Text Statistics on 20 Technical Documents

                    variable               minimum   maximum   median   standard deviation
    Readability     Kincaid                   9.5      16.9     13.3       2.2
                    automated                 9.0      17.4     13.3       2.5
                    Cole-Liau                10.0      16.0     12.7       1.8
                    Flesch                    8.9      17.0     14.4       2.2
    sentence info.  av sent length           15.5      30.3     21.6       4.0
                    av word length           4.61      5.63     5.08       .29
                    av nonfunction length    5.72      7.30     6.52       .45
                    short sent                23%       46%      33%       5.9
                    long sent                  7%       20%      14%       2.9
    sentence types  simple                    31%       71%      49%      11.4
                    complex                   19%       50%      33%       8.3
                    compound                   2%       14%       7%       3.3
                    compound-complex           2%       19%      10%       4.8
    verb types      tobe                      26%       64%    44.7%      10.3
                    auxiliary                 10%       40%      21%       8.7
                    infinitives                8%       24%    15.1%       4.8
                    passives                  12%       50%      29%       9.3
    word usage      prepositions            10.1%     15.0%    12.3%       1.6
                    conjunction              1.8%      4.8%     3.4%        .9
                    adverbs                  1.2%      5.0%     3.4%       1.0
                    nouns                   23.6%     31.6%    27.8%       1.7
                    adjectives              15.4%     27.1%    21.1%       3.4
                    pronouns                 1.2%      8.4%     2.5%       1.1
                    nominalizations            2%        5%     3.3%        .8
    sentence        prepositions               6%       19%      12%       3.4
    openers         adverbs                    0%       20%       9%       4.6
                    subject                   56%       85%      70%       8.0
                    verbs                      0%        4%       1%       1.0
                    subordinating conj         1%       12%       5%       2.7
                    conjunctions               0%        4%       0%       1.5
                    expletives                 0%        6%       2%       1.7

5. Results

5.1. STYLE

To get baseline statistics and check the program’s accuracy, we ran STYLE on 20 technical documents. There were a total of 3287 sentences in the sample. The shortest document was 67 sentences long; the longest 339 sentences. The documents covered a wide range of subject matter, including theoretical computing, physics, psychology, engineering, and affirmative action. Table 1 gives the range, median, and standard deviation of the various style measures. As you will note, most of the measurements have a fairly wide range of values across the sample documents.
As a comparison, Table 2 gives the median results for two different technical authors, a sample of instructional material, and a sample of the Federalist Papers. The two authors show similar styles, although author 2 uses somewhat shorter sentences and longer words than author 1. Author 1 uses all types of sentences, while author 2 prefers simple and complex sentences, using few compound or compound-complex sentences. The other major difference in the styles of these authors is the location of subordination. Author 1 seems to prefer embedded or trailing subordination, while author 2 begins many sentences with the subordinate clause. The documents tested for both authors 1 and 2 were technical documents, written for a technical audience. The instructional documents, which are written for craftspeople, vary surprisingly little from the two technical samples. The sentences and words are a little longer, and they contain many passive and auxiliary verbs, few adverbs, and almost no pronouns. The instructional documents contain many imperative sentences, so there are many sentences with verb openers. The sample of Federalist Papers contrasts with the other samples in almost every way.

Table 2
Text Statistics on Single Authors

                    variable               author 1   author 2    inst.     FED
    readability     Kincaid                  11.0       10.3      10.8     16.3
                    automated                11.0       10.3      11.9     17.8
                    Coleman-Liau              9.3       10.1      10.2     12.3
                    Flesch                   10.3       10.7      10.1     15.0
    sentence info   av sent length          22.64      19.61     22.78    31.85
                    av word length           4.47       4.66      4.65     4.95
                    av nonfunction length    5.64       5.92      6.04     6.87
                    short sent                35%        43%       35%      40%
                    long sent                 18%        15%       16%      21%
    sentence types  simple                    36%        43%       40%      31%
                    complex                   34%        41%       37%      34%
                    compound                  13%         7%        4%      10%
                    compound-complex          16%         8%       14%      25%
    verb type       tobe                      42%        43%       45%      37%
                    auxiliary                 17%        19%       32%      32%
                    infinitives               17%        15%       12%      21%
                    passives                  20%        19%       36%      20%
    word usage      prepositions            10.0%      10.8%     12.3%    15.9%
                    conjunctions             3.2%       2.4%      3.9%     3.4%
                    adverbs                 5.05%       4.6%      3.6%     3.7%
                    nouns                   27.7%      26.5%     29.1%    24.9%
                    adjectives              17.0%      19.0%     15.4%    12.4%
                    pronouns                 5.8%       4.3%      2.1%     6.5%
                    nominalizations            1%         2%        2%       3%
    sentence        prepositions              11%        14%        6%       5%
    openers         adverbs                    9%         9%        6%       4%
                    subject                   65%        59%       54%      66%
                    verb                       3%         2%       14%       2%
                    subordinating conj         8%        14%       11%       3%
                    conjunction                1%         0%        0%       3%
                    expletives                 3%         3%        0%       3%

5.2. DICTION

In the few weeks that DICTION has been available to users, about 35,000 sentences have been run with about 5,000 string matches. The authors using the program seem to make the suggested changes about 50-75% of the time. To date, almost 200 of the 450 strings in the default file have been matched. Although most of these phrases are valid and correct in some contexts, the 50-75% change rate seems to show that the phrases are used much more often than concise diction warrants.

6. Accuracy

6.1. Sentence Identification

The correctness of the STYLE output on the 20 document sample was checked in detail. STYLE misidentified 129 sentence fragments as sentences and incorrectly joined two or more sentences 75 times in the 3287 sentence sample. The problems were usually because of non-standard formatting commands, unknown abbreviations, or lists of non-sentences. An impossibly long sentence found as the longest sentence in the document usually is the result of a long list of non-sentences.

6.2.
Sentence Types

STYLE correctly identified sentence type on 86.5% of the sentences in the sample. The type distribution of the sentences was 52.5% simple, 29.9% complex, 8.5% compound and 9% compound-complex. The program reported 49.5% simple, 31.9% complex, 8% compound and 10.4% compound-complex. Looking at the errors on the individual documents, the number of simple sentences was under-reported by about 4% and the complex and compound-complex were over-reported by 3% and 2%, respectively. The following matrix shows the program’s output vs. the actual sentence type.

                                       Program Results
                              simple   complex   compound   comp-complex
    Actual     simple           1566       132         49             17
    Sentence   complex            47       892          6             65
    Type       compound           40         6        207             23
               comp-complex        0        52          5            249

The system’s inability to find imperative sentences seems to have little effect on most of the style statistics. A document with half of its sentences imperative was run, with and without the imperative end marker. The results were identical except for the expected errors of not finding verbs as sentence openers, not counting the imperative sentences, and a slight difference (1%) in the number of nouns and adjectives reported.

6.3. Word Usage

The accuracy of identifying word types reflects that of PARTS, which is about 95% correct. The largest source of confusion is between nouns and adjectives. The verb counts were checked on about 20 sentences from each document and found to be about 98% correct.

7. Technical Details

7.1. Finding Sentences

The formatting commands embedded in the text increase the difficulty of finding sentences. Not all text in a document is in sentence form; there are headings, tables, equations and lists, for example. Headings like “Finding Sentences” above should be discarded, not attached to the next sentence. However, since many of the documents are formatted to be phototypeset, and contain font changes, which usually operate on the most important words in the document, discarding all formatting commands is not correct.
To improve the programs’ ability to find sentence boundaries, the deformatting program, DEROFF [13], has been given some knowledge of the formatting packages used on the UNIX operating system. DEROFF will now do the following:

1. Suppress all formatting macros that are used for titles, headings, author’s name, etc.

2. Suppress the arguments to the macros for titles, headings, author’s name, etc.

3. Suppress displays, tables, footnotes and text that is centered or in no-fill mode.

4. Substitute a place holder for equations and check for hidden end markers. The place holder is necessary because many typists and authors use the equation setter to change fonts on important words. For this reason, header files containing the definition of the EQN delimiters must also be included as input to STYLE. End markers are often hidden when an equation ends a sentence and the period is typed inside the EQN delimiters.

5. Add a “.” after lists. If the flag -ml is also used, all lists are suppressed. This is a separate flag because of the variety of ways the list macros are used. Often, lists are sentences that should be included in the analysis. The user must determine how lists are used in the document to be analyzed.

Both STYLE and DICTION call DEROFF before they look at the text. The user should supply the -ml flag if the document contains many lists of non-sentences that should be skipped.

7.2. Details of DICTION

The program DICTION is based on the string matching program FGREP. FGREP takes as input a file of patterns to be matched and a file to be searched and outputs each line that contains any of the patterns with no indication of which pattern was matched. The following changes have been added to FGREP:

1. The basic unit that DICTION operates on is a sentence rather than a line. Each sentence that contains one of the patterns is output.

2. Upper case letters are mapped to lower case.

3. Punctuation is replaced by blanks.
4. All pattern matches in the sentence are found and surrounded with “[” “]”.

5. A method for suppressing a string match has been added. Any pattern that begins with “~” will not be matched. Because the matching algorithm finds the longest substring, the suppression of a match allows words in some correct contexts not to be matched while allowing the word in another context to be found. For example, the word “which” is often incorrectly used instead of “that” in restrictive clauses. However, “which” is usually correct when preceded by a preposition or “,”. The default pattern file suppresses the match of the common prepositions or a double blank followed by “which” and therefore matches only the suspect uses. The double blank accounts for the replaced comma.

8. Conclusions

A system of writing tools that measure some of the objective characteristics of writing style has been developed. The tools are sufficiently general that they may be applied to documents on any subject with equal accuracy. Although the measurements are only of the surface structure of the text, they do point out problem areas. In addition to helping writers produce better documents, these programs may be useful for studying the writing process and finding other formulae for measuring readability.

References

1. L. L. Cherry, “PARTS - A System for Assigning Word Classes to English Text,” submitted to Communications of the ACM.

2. B. W. Kernighan and J. R. Mashey, “The UNIX Programming Environment,” Software - Practice & Experience, 9, 1-15 (1979).

3. G. R. Klare, “Assessing Readability,” Reading Research Quarterly, 1974-1975, 10, 62-102.

4. E. A. Smith and P. Kincaid, “Derivation and validation of the automated readability index for use with technical materials,” Human Factors, 1970, 12, 457-464.

5. J. P. Kincaid, R. P. Fishburne, R. L. Rogers, and B. S.
Chissom, “Derivation of new readability formulas (Automated Readability Index, Fog count, and Flesch Reading Ease Formula) for Navy enlisted personnel,” Navy Training Command Research Branch Report 8-75, Feb., 1975.

6. M. Coleman and T. L. Liau, “A Computer Readability Formula Designed for Machine Scoring,” Journal of Applied Psychology, 1975, 60, 283-284.

7. R. Flesch, “A New Readability Yardstick,” Journal of Applied Psychology, 1948, 32, 221-233.

8. E. U. Coke, private communication.

9. D. W. Ewing, Writing for Results, John Wiley & Sons, Inc., New York, N. Y. (1974).

10. G. Leggett, C. D. Mead and W. Charvat, Prentice-Hall Handbook for Writers, Seventh Edition, Prentice-Hall Inc., Englewood Cliffs, N. J. (1978).

11. E. B. Coleman, “Learning of Prose Written in Four Grammatical Transformations,” Journal of Applied Psychology, 1965, vol. 49, no. 5, pp. 332-341.

12. A. V. Aho and M. J. Corasick, “Efficient String Matching: an aid to Bibliographic Search,” Communications of the ACM, 18, (6), 333-340, June 1975.

13. Bell Laboratories, “UNIX TIME-SHARING SYSTEM: UNIX PROGRAMMER’S MANUAL,” Seventh Edition, Vol. 1 (January 1979).

Appendix 1
STYLE Abbreviations

    a. d.    A. M.    a. m.    b. c.    Ch.      ch.      ckts.    dB.
    Dept.    dept.    Depts.   depts.   Dr.      Drs.     e. g.    eq.
    et al.   etc.     Fig.     Figs.    figs.    ft.      i. e.    in.
    Inc.     Jr.      jr.      mi.      Mr.      Mrs.     Ms.      No.
    no.      Nos.     nos.     P. M.    p. m.    Ph. D.   Ph. d.   Ref.
    ref.     Refs.    refs.    St.      Vs.      yr.
Appendix 2
Default DICTION Patterns

a great deal of
a large number of
a lot of
a majority of
a need for
a number of
a particular preference for
a preference for
a small number of
a tendency to
abovementioned
absolutely complete
absolutely essential
accomplished
accordingly
activate
actual
added increments
adequate enough
advent
afford an opportunity
aggregate
all of
all throughout
along the line
an indication of
analyzation
and etc
and or
another additional
any and all
arrive at a
as a matter of fact
as a method of
as good or better than
as of now
as per
as regards
as related to
as to
assistance
assistance to
assuming that
at a later date
at about
at above
at all times
at an early date
at below
at the present
at the time when
at this point in time
at this time
at which time
at your earliest convenience
authorization
awful
basic fundamentals
basically
be cognizant of
being as
being that
brief in duration
bring to a conclusion
but that
but what
by means of
by the use of
carry out experiments
center about
center around
center portion
check into
check on
check up on
circle around
close proximity
collaborate together
collect together
combine together
come to an end
commence
common accord
compensation
completely eliminated
comprise
concerning
conduct an investigation of
conjecture
connect up
consensus of opinion
consequent result
consolidate together
construct
contemplate
continue on
continue to remain
could of
count up
couple together
debate about
decide on
deleterious effect
demean
demonstrate
depreciate in value
deserving of
desirable benefits
desirous of
different than
discontinue
disutility
divide up
doubt but
due to
duly noted
during the time that
each and every
early beginnings
effectuate
emotional feelings
empty out
enclosed herein
enclosed herewith
end result
end up
endeavor
enter in
enter into
enthused
entirely complete
equally good as
essentially
eventuate
every now and then
exactly identical
experiencing difficulty
fabricate
face up to
facilitate
facts and figures
fast in action
fearful of
fearful that
few in number
file away
final completion
final ending
final outcome
final result
finalize
find it interesting to know
first and foremost
first beginnings
first initiated
firstly
follow after
following after
for the purpose of
for the reason that
for the simple reason that
for this reason
for your information
from the point of view of
full and complete
generally agreed
good and
got to
gratuitous
greatly minimize
head up
help but
helps in the production of
hopeful
if and when
if at all possible
impact
implement
important essentials
importantly
in a large measure
in a position to
in accordance
in advance of
in agreement with
in all cases
in back of
in behalf of
in behind
in between
in case
in close proximity
in conflict with
in conjunction with
in connection with
in fact
in large measure
in many cases
in most cases
in my opinion I think
in order to
in rare cases
in reference to
in regard to
in regards to
in relation with
in short supply
in size
in terms of
in the amount of
in the case of
in the course of
in the event
in the field of
in the form of
in the instance of
in the interim
in the last analysis
in the matter of
in the near future
in the neighborhood of
in the not too distant future
in the proximity of
in the range of
in the same way as described
in the shape of
in the vicinity of
in this case
in view of the
in violation of
inasmuch as
indicate
indicative of
initialize
initiate
injurious to
inquire
inside of
institute a
intents and purposes
intermingle
irregardless
is defined as
is used to control
is when
is where
it is incumbent
it stands to reason
it was noted that if
joint cooperation
joint partnership
just exactly
kind of
know about
last but not least
later on
leaving out of consideration
liable
link up
literally
little doubt that
lose out on
lots of
main essentials
make a
make adjustments to
make an
make application to
make contact with
make mention of
make out a list of
make the acquaintance of
make the adjustment
manner
maximum possible
meaningful
meet up with
melt down
melt up
methodology
might of
minimize as far as possible
minor importance
miss out on
modification
more preferable
most unique
must of
mutual cooperation
necessary requisite
necessitate
need for
nice
not be un
not in a position to
not of a high order of accuracy
not un
notwithstanding
of considerable magnitude
of that
of the opinion that
off of
on a few occasions
on account of
on behalf of
on the grounds that
on the occasion
on the part of
one of the
open up
operates to correct
outside of
over with
overall
past history
perceptive of
perform a measurement
perform the measurement
permits the reduction of
personalize
pertaining to
physical size
plan ahead
plan for the future
plan in advance
plan on
present a conclusion
present a report
presently
prior to
prioritize
proceed to
procure
productive of
prolong the duration
protrude out from
provided that
pursuant to
put to use in
range all the way from
reason is because
reason why
recur again
reduce down
refer back
reference to this
reflective of
regarding
regretful
reinitiate
relative to
repeat again
representative of
resultant effect
resume again
retreat back
return again
return back
revert back
seal off
seems apparent
send a communication
short space of time
should of
single unit
situation
so as to
sort of
spell out
still continue
still remain
subsequent
substantially in agreement
succeed in
suggestive of
superior than
surrounding circumstances
take appropriate
take cognizance of
take into consideration
termed as
terminate
termination
the author
the authors
the case that
the fact
the foregoing
the foreseeable future
the fullest possible extent
the majority of
the nature
the necessity of
the only difference being that
the order of
the point that
the truth is
there are not many
through the medium of
through the use of
throughout the entire
time interval
to summarize the above
total effect of all this
totality
transpire
true facts
try and
ultimate end
under a separate cover
under date of
under separate cover
under the necessity to
underlying purpose
undertake a study
uniformly consistent
unique
until such time as
up to this time
upshot
utilize
very
very complete
very unique
vital
which
with a view to
with reference to
with regard to
with the exception of
with the object of
with the result that
with this in mind, it is clear that
within the realm of possibility
without further delay
worth while
would of
ing behavior
wise
~ which
~ about which
~ after which
~ at which
~ between which
~ by which
~ for which
~ from which
~ in which
~ into which
~ of which
~ on which
~ on which
~ over which
~ through which
~ to which
~ under which
~ upon which
~ with which
~ without which
~clockwise
~likewise
~otherwise

PART 6: MISCELLANEOUS

This part contains articles you may find helpful on unsupported software.

Learn

The article on Learn, by Kernighan and Lesk, tells how you can create and use computer-aided-instruction (CAI) courses. Read “LEARN - Computer-Aided Instruction on UNIX” if you plan to develop CAI courses. This article is not for people new to ULTRIX-32 or those who want help in using a CAI course that has already been developed. The Learn utility is available on ULTRIX-32, but it is not supported.

Rogue

When you feel comfortable with the ULTRIX-32 system, you may want to play Rogue. “A Guide to the Dungeons of Doom” is the first step on an adventure that will test your courage and intuition. With the help of the guide, you may be able to return from the dungeons of doom.
Rogue and a variety of other games are available on the ULTRIX-32 system, but they are not supported.

Berkeley Fonts

The “Berkeley Font Catalogue” shows sample raster fonts developed at Berkeley. These fonts are available on the ULTRIX-32 system, but are not supported.

PDP-11 Assembler

The “UNIX Assembler Reference Manual” included in this part describes the assembly language for the UNIX system that runs on the PDP-11. The PDP-11 assembler is not available on the ULTRIX-32 system.

LEARN - Computer-Aided Instruction on UNIX
(Second Edition)

Brian W. Kernighan
Michael E. Lesk

Bell Laboratories
Murray Hill, New Jersey 07974

1. Introduction.

Learn is a driver for CAI scripts. It is intended to permit the easy composition of lessons and lesson fragments to teach people computer skills. Since it is teaching the same system on which it is implemented, it makes direct use of UNIX† facilities to create a controlled UNIX environment. The system includes two main parts: (1) a driver that interprets the lesson scripts; and (2) the lesson scripts themselves.
At present there are six scripts:

- basic file handling commands
- the UNIX text editor ed
- advanced file handling
- the eqn language for typing mathematics
- the “-ms” macro package for document formatting
- the C programming language

The purported advantages of CAI scripts for training in computer skills include the following:

(a) students are forced to perform the exercises that are in fact the basis of training in any case;

(b) students receive immediate feedback and confirmation of progress;

(c) students may progress at their own rate;

(d) no schedule requirements are imposed; students may study at any time convenient for them;

(e) the lessons may be improved individually and the improvements are immediately available to new users;

(f) since the student has access to a computer for the CAI script there is a place to do exercises;

(g) the use of high technology will improve student motivation and the interest of their management.

Opposed to this, of course, is the absence of anyone to whom the student may direct questions. If CAI is used without a “counselor” or other assistance, it should properly be compared to a textbook, lecture series, or taped course, rather than to a seminar. CAI has been used for many years in a variety of educational areas [1, 2, 3]. The use of a computer to teach itself, however, offers unique advantages. The skills developed to get through the script are exactly those needed to use the computer; there is no waste effort.

The scripts written so far are based on some familiar assumptions about education; these assumptions are outlined in the next section. The remaining sections describe the operation of the script driver and the particular scripts now available. The driver puts few restrictions on the script writer, but the current scripts are of a rather rigid and stereotyped form in accordance with the theory in the next section and practical limitations.

†UNIX is a Trademark of Bell Laboratories.

2.
Educational Assumptions and Design.

First, the way to teach people how to do something is to have them do it. Scripts should not contain long pieces of explanation; they should instead frequently ask the student to do some task. So teaching is always by example: the typical script fragment shows a small example of some technique and then asks the user to either repeat that example or produce a variation on it. All are intended to be easy enough that most students will get most questions right, reinforcing the desired behavior.

Most lessons fall into one of three types. The simplest presents a lesson and asks for a yes or no answer to a question. The student is given a chance to experiment before replying. The script checks for the correct reply. Problems of this form are sparingly used.

The second type asks for a word or number as an answer. For example a lesson on files might say

    How many files are there in the current directory? Type “answer N”, where N is the number of files.

The student is expected to respond (perhaps after experimenting) with

    answer 17

or whatever. Surprisingly often, however, the idea of a substitutable argument (i.e., replacing N by 17) is difficult for non-programmer students, so the first few such lessons need real care.

The third type of lesson is open-ended: a task is set for the student, appropriate parts of the input or output are monitored, and the student types ready when the task is done. Figure 1 shows a sample dialog that illustrates the last of these, using two lessons about the cat (concatenate, i.e., print) command taken from early in the script that teaches file handling. Most learn lessons are of this form.

After each correct response the computer congratulates the student and indicates the lesson number that has just been completed, permitting the student to restart the script after that lesson. If the answer is wrong, the student is offered a chance to repeat the lesson.
The "speed" rating of the student (explained in section 5) is given after the lesson number when the lesson is completed successfully; it is printed only for the aid of script authors checking out possible errors in the lessons.

It is assumed that there is no foolproof way to determine if the student truly "understands" what he or she is doing. Accordingly, the current learn scripts only measure performance, not comprehension. If the student can perform a given task, that is deemed to be "learning."⁴

The main point of using the computer is that what the student does is checked for correctness immediately. Unlike many CAI scripts, however, these scripts provide few facilities for dealing with wrong answers. In practice, if most of the answers are not right the script is a failure; the universal solution to student error is to provide a new, easier script. Anticipating possible wrong answers is an endless job, and it is really easier as well as better to provide a simpler script.

Along with this goes the assumption that anything can be taught to anybody if it can be broken into sufficiently small pieces. Anything not absorbed in a single chunk is just subdivided.

To avoid boring the faster students, however, an effort is made in the files and editor scripts to provide three tracks of different difficulty. The fastest sequence of lessons is aimed at roughly the bulk and speed of a typical tutorial manual and should be adequate for review and for well-prepared students. The next track is intended for most users and is roughly twice as long. Typically, for example, the fast track might present an idea and ask for a variation on the example shown; the normal track will first ask the student to repeat the example that was shown before attempting a variation. The third and slowest track, which is often three or four times the length of the fast track, is intended to be adequate for anyone. (The lessons of Figure 1 are from the third track.) The multiple tracks also mean that a student repeating a course is unlikely to hit the same series of lessons; this makes it profitable for a shaky user to back up and try again, and many students have done so.

Figure 1. Sample dialog from basic files script
(Student responses in italics; "$" is the prompt)

    A file can be printed on your terminal by using the "cat" command. Just say "cat file" where "file" is the file name. For example, there is a file named "food" in this directory. List it by saying "cat food"; then type "ready".
    $ cat food
    this is the file named food.
    $ ready

    Good. Lesson 3.3a (1)

    Of course, you can print any file with "cat". In particular, it is common to first use "ls" to find the name of a file and then "cat" to print it. Note the difference between "ls", which tells you the name of the file, and "cat", which tells you the contents. One file in the current directory is named for a President. Print the file, then type "ready".
    $ cat President
    cat: can't open President
    $ ready

    Sorry, that's not right. Do you want to try again? yes
    Try the problem again.
    $ ls
    .ocopy
    X1
    roosevelt
    $ cat roosevelt
    this file is named roosevelt and contains three lines of text.
    $ ready

    Good. Lesson 3.3b (0)

    The "cat" command can also print several files at once. In fact, it is named "cat" as an abbreviation for "concatenate"....

The tracks are not completely distinct, however. Depending on the number of correct answers the student has given for the last few lessons, the program may switch tracks. The driver is actually capable of following an arbitrary directed graph of lesson sequences, as discussed in section 5. Some more structured arrangement, however, is used in all current scripts to aid the script writer in organizing the material into lessons. It is sufficiently difficult to write lessons that the three-track theory is not followed very closely except in the files and editor scripts.
Accordingly, in some cases, the fast track is produced merely by skipping lessons from the slower track. In others, there is essentially only one track.

The main reason for using the learn program rather than simply writing the same material as a workbook is not the selection of tracks, but actual hands-on experience. Learning by doing is much more effective than pencil and paper exercises.

Learn also provides a mechanical check on performance. The first version in fact would not let the student proceed unless it received correct answers to the questions it set and it would not tell a student the right answer. This somewhat Draconian approach has been moderated in version 2. Lessons are sometimes badly worded or even just plain wrong; in such cases, the student has no recourse. But if a student is simply unable to complete one lesson, that should not prevent access to the rest. Accordingly, the current version of learn allows the student to skip a lesson that he cannot pass; a "no" answer to the "Do you want to try again?" question in Figure 1 will pass to the next lesson. It is still true that learn will not tell the student the right answer.

Of course, there are valid objections to the assumptions above. In particular, some students may object to not understanding what they are doing; and the procedure of smashing everything into small pieces may provoke the retort "you can't cross a ditch in two jumps." Since writing CAI scripts is considerably more tedious than ordinary manuals, however, it is safe to assume that there will always be alternatives to the scripts as a way of learning. In fact, for a reference manual of 3 or 4 pages it would not be surprising to have a tutorial manual of 20 pages and a (multi-track) script of 100 pages. Thus the reference manual will exist long before the scripts.

3. Scripts.

As mentioned above, the present scripts try at most to follow a three-track theory.
Thus little of the potential complexity of the possible directed graph is employed, since care must be taken in lesson construction to see that every necessary fact is presented in every possible path through the units. In addition, it is desirable that every unit have alternate successors to deal with student errors.

In most existing courses, the first few lessons are devoted to checking prerequisites. For example, before the student is allowed to proceed through the editor script the script verifies that the student understands files and is able to type. It is felt that the sooner lack of student preparation is detected, the easier it will be on the student. Anyone proceeding through the scripts should be getting mostly correct answers; otherwise, the system will be unsatisfactory both because the wrong habits are being learned and because the scripts make little effort to deal with wrong answers. Unprepared students should not be encouraged to continue with scripts.

There are some preliminary items which the student must know before any scripts can be tried. In particular, the student must know how to connect to a UNIX system, set the terminal properly, log in, and execute simple commands (e.g., learn itself). In addition, the character erase and line kill conventions (# and @) should be known. It is hard to see how this much could be taught by computer-aided instruction, since a student who does not know these basic skills will not be able to run the learning program. A brief description on paper is provided (see Appendix A), although assistance will be needed for the first few minutes. This assistance, however, need not be highly skilled.

The first script in the current set deals with files. It assumes the basic knowledge above and teaches the student about the ls, cat, mv, rm, cp and diff commands. It also deals with the abbreviation characters *, ?, and [ ] in file names.
It does not cover pipes or I/O redirection, nor does it present the many options on the ls command.

This script contains 31 lessons in the fast track; two are intended as prerequisite checks, seven are review exercises. There are a total of 75 lessons in all three tracks, and the instructional passages typed at the student to begin each lesson total 4,476 words. The average lesson thus begins with a 60-word message. In general, the fast track lessons have somewhat longer introductions, and the slow tracks somewhat shorter ones. The longest message is 144 words and the shortest 14.

The second script trains students in the use of the context editor ed, a sophisticated editor using regular expressions for searching.⁵ All editor features except encryption, mark names and ";" in addressing are covered. The fast track contains 2 prerequisite checks, 93 lessons, and a review lesson. It is supplemented by 146 additional lessons in other tracks.

A comparison of sizes may be of interest. The ed description in the reference manual is 2,572 words long. The ed tutorial⁶ is 6,138 words long. The fast track through the ed script is 7,407 words of explanatory messages, and the total ed script, 242 lessons, has 15,615 words. The average ed lesson is thus also about 60 words; the largest is 171 words and the smallest 10. The original ed script represents about three man-weeks of effort.

The advanced file handling script deals with ls options, I/O diversion, pipes, and supporting programs like pr, wc, tail, spell and grep. (The basic file handling script is a prerequisite.) It is not as refined as the first two scripts; this is reflected at least partly in the fact that it provides much less of a full three-track sequence than they do. On the other hand, since it is perceived as "advanced," it is hoped that the student will have somewhat more sophistication and be better able to cope with it at a reasonably high level of performance.
A fourth script covers the eqn language for typing mathematics. This script must be run on a terminal capable of printing mathematics, for instance the DASI 300 and similar Diablo-based terminals, or the nearly extinct Model 37 teletype. Again, this script is relatively short of tracks: of 76 lessons, only 17 are in the second track and 2 in the third track. Most of these provide additional practice for students who are having trouble in the first track.

The -ms script for formatting macros is a short one-track only script. The macro package it describes is no longer the standard, so this script will undoubtedly be superseded in the future. Furthermore, the linear style of a single learn script is somewhat inappropriate for the macros, since the macro package is composed of many independent features, and few users need all of them. It would be better to have a selection of short lesson sequences dealing with the features independently.

The script on C is in a state of transition. It was originally designed to follow a tutorial on C, but that document has since become obsolete. The current script has been partially converted to follow the order of presentation in The C Programming Language,⁷ but this job is not complete. The C script was never intended to teach C; rather it is supposed to be a series of exercises for which the computer provides checking and (upon success) a suggested solution.

This combination of scripts covers much of the material which any user will need to know to make effective use of the UNIX system. With enlargement of the advanced files course to include more on the command interpreter, there will be a relatively complete introduction to UNIX available via learn. Although we make no pretense that learn will replace other instructional materials, it should provide a useful supplement to existing tutorials and reference manuals.

4. Experience with Students.

Learn has been installed on many different UNIX systems.
Most of the usage is on the first two scripts, so these are more thoroughly debugged and polished. As a (random) sample of user experience, the learn program has been used at Bell Labs at Indian Hill for 10,500 lessons in a four month period. About 3600 of these are in the files script, 4100 in the editor, and 1400 in advanced files. The passing rate is about 80%, that is, about 4 lessons are passed for every one failed. There have been 86 distinct users of the files script, and 58 of the editor. On our system at Murray Hill, there have been nearly 4000 lessons over four weeks that include Christmas and New Year. Users have ranged in age from six up.

It is difficult to characterize typical sessions with the scripts; many instances exist of someone doing one or two lessons and then logging out, as do instances of someone pausing in a script for twenty minutes or more. In the earlier version of learn, the average session in the files course took 32 minutes and covered 23 lessons. The distribution is quite broad and skewed, however; the longest session was 130 minutes and there were five sessions shorter than five minutes. The average lesson took about 80 seconds. These numbers are roughly typical for non-programmers; a UNIX expert can do the scripts at approximately 30 seconds per lesson, most of which is the system printing.

At present, working through a section of the middle of the files script took about 1.4 seconds of processor time per lesson, and a system expert typing quickly took 15 seconds of real time per lesson. A novice would probably take at least a minute. Thus, as a rough approximation, a UNIX system could support ten students working simultaneously with some spare capacity.

5. The Script Interpreter.

The learn program itself merely interprets scripts. It provides facilities for the script writer to capture student responses and their effects, and simplifies the job of passing control to and recovering control from the student.
This section describes the operation and usage of the driver program, and indicates what is required to produce a new script. Readers only interested in the existing scripts may skip this section.

The file structure used by learn is shown in Figure 2. There is one parent directory (named lib) containing the script data. Within this directory are subdirectories, one for each subject in which a course is available, one for logging (named log), and one in which user subdirectories are created (named play). The subject directory contains master copies of all lessons, plus any supporting material for that subject. In a given subdirectory, each lesson is a single text file. Lessons are usually named systematically; the file that contains lesson n is called Ln.

Figure 2: Directory structure for learn

    lib
        play
            student1
                files for student1...
            student2
                files for student2...
        files
            L0.1a    lessons for files course
            L0.1b
        editor
        (other courses)
        log

When learn is executed, it makes a private directory for the user to work in, within the learn portion of the file system. A fresh copy of all the files used in each lesson (mostly data for the student to operate upon) is made each time a student starts a lesson, so the script writer may assume that everything is reinitialized each time a lesson is entered. The student directory is deleted after each session; any permanent records must be kept elsewhere.

The script writer must provide certain basic items in each lesson:
(1) the text of the lesson;
(2) the set-up commands to be executed before the user gets control;
(3) the data, if any, which the user is supposed to edit, transform, or otherwise process;
(4) the evaluating commands to be executed after the user has finished the lesson, to decide whether the answer is right; and
(5) a list of possible successor lessons.
Learn tries to minimize the work of bookkeeping and installation, so that most of the effort involved in script production is in planning lessons, writing tutorial paragraphs, and coding tests of student performance.

The basic sequence of events is as follows. First, learn creates the working directory. Then, for each lesson, learn reads the script for the lesson and processes it a line at a time. The lines in the script are:
(1) commands to the script interpreter to print something, to create a file, to test something, etc.;
(2) text to be printed or put in a file;
(3) other lines, which are sent to the shell to be executed.
One line in each lesson turns control over to the user; the user can run any UNIX commands. The user mode terminates when the user types yes, no, ready, or answer. At this point, the user's work is tested; if the lesson is passed, a new lesson is selected, and if not the old one is repeated.

Let us illustrate this with the script for the second lesson of Figure 1; this is shown in Figure 3.

Lines which begin with # are commands to the learn script interpreter. For example,

#print
causes printing of any text that follows, up to the next line that begins with a sharp.

#print file
prints the contents of file; it is the same as cat file but has less overhead. Both forms of #print have the added property that if a lesson is failed, the #print will not be executed the second time through; this avoids annoying the student by repeating the preamble to a lesson.

#create filename
creates a file of the specified name, and copies any subsequent text up to a # to the file. This is used for creating and initializing working files and reference data for the lessons.

#user
gives control to the student; each line he or she types is passed to the shell for execution. The #user mode is terminated when the student types one of yes, no, ready or answer. At that time, the driver resumes interpretation of the script.

#copyin
#uncopyin
Anything the student types between these commands is copied onto a file called .copy. This lets the script writer interrogate the student's responses upon regaining control.
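The line-at-a-time reading of a lesson described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the driver's actual code: the function name parse_lesson and the (name, arguments, body) grouping are hypothetical, and the sample lesson is abridged from Figure 3. The sketch relies on the one rule the text states, namely that a line beginning with # is a command to the interpreter and that the lines after it, up to the next # line, belong to it (as text to print, file contents, or lines for the shell, depending on the command).

```python
# Hypothetical sketch: group a learn-style lesson into directives.
# Assumption: every line starting with '#' opens a new directive, and the
# following non-'#' lines form that directive's body.

LESSON = """#print
One file in the current directory is named for a President.
Print the file, then type "ready".
#create roosevelt
this file is named roosevelt and contains three lines of text.
#copyout
#user
#uncopyout
tail -2 .ocopy >X1
#cmp X1 roosevelt
#log
#next
3.2b 2
"""


def parse_lesson(text):
    """Return a list of (name, args, body_lines) tuples, in script order."""
    items = []
    current = None
    for line in text.splitlines():
        if line.startswith("#"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            current = (name, parts[1:], [])
            items.append(current)
        elif current is not None:
            # Body line: printed text, file data, or a shell command,
            # depending on which directive it follows.
            current[2].append(line)
    return items


for name, args, body in parse_lesson(LESSON):
    print(name, args, len(body))
```

Grouping first and interpreting later keeps the sketch small: deciding whether a body is printed, written to a file, or handed to the shell is left to whatever code handles each directive.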
Figure 3: Sample Lesson

    #print
    Of course, you can print any file with "cat". In particular, it is common to first use "ls" to find the name of a file and then "cat" to print it. Note the difference between "ls", which tells you the name of the files, and "cat", which tells you the contents. One file in the current directory is named for a President. Print the file, then type "ready".
    #create roosevelt
    this file is named roosevelt and contains three lines of text.
    #copyout
    #user
    #uncopyout
    tail -2 .ocopy >X1
    #cmp X1 roosevelt
    #log
    #next
    3.2b 2

#copyout
#uncopyout
Between these commands, any material typed at the student by any program is copied to the file .ocopy. This lets the script writer interrogate the effect of what the student typed, which true believers in the performance theory of learning usually prefer to the student's actual input.

#pipe
#unpipe
Normally the student input and the script commands are fed to the UNIX command interpreter (the "shell") one line at a time. This won't do if, for example, a sequence of editor commands is provided, since the input to the editor must be handed to the editor, not to the shell. Accordingly, the material between #pipe and #unpipe commands is fed continuously through a pipe so that such sequences work. If copyout is also desired the copyout brackets must include the pipe brackets.

There are several commands for setting status after the student has attempted the lesson.

#cmp file1 file2
is an in-line implementation of cmp, which compares two files for identity.

#match stuff
The last line of the student's input is compared to stuff, and the success or fail status is set according to it. Extraneous things like the word answer are stripped before the comparison is made. There may be several #match lines; this provides a convenient mechanism for handling multiple "right" answers. Any text up to a # on subsequent lines after a successful #match is printed;
this is illustrated in Figure 4, another sample lesson.

#bad stuff
This is similar to #match, except that it corresponds to specific failure answers; this can be used to produce hints for particular wrong answers that have been anticipated by the script writer.

#succeed
#fail
print a message upon success or failure (as determined by some previous mechanism).

Figure 4: Another Sample Lesson

    #print
    What command will move the current line to the end of the file? Type "answer COMMAND", where COMMAND is the command.
    #copyin
    #user
    #uncopyin
    #match m$
    #match .m$
    "m$" is easier.
    #log
    #next
    63.1d 10

When the student types one of the "commands" yes, no, ready, or answer, the driver terminates the #user command, and evaluation of the student's work can begin. This can be done either by the built-in commands above, such as #match and #cmp, or by status returned by normal UNIX commands, typically grep and test. The last command should return status true (0) if the task was done successfully and false (non-zero) otherwise; this status return tells the driver whether or not the student has successfully passed the lesson.

Performance can be logged:

#log file
writes the date, lesson, user name and speed rating, and a success/failure indication on file. The command #log by itself writes the logging information in the logging directory within the learn hierarchy, and is the normal form.

#next
is followed by a few lines, each with a successor lesson name and an optional speed rating on it. A typical set might read

    25.1a 10
    25.2a 5
    25.3a 2

indicating that unit 25.1a is a suitable follow-on lesson for students with a speed rating of 10 units, 25.2a for students with speed near 5, and 25.3a for speed near 2. Speed ratings are maintained for each session with a student; the rating is increased by one each time the student gets a lesson right and decreased by four each time the student gets a lesson wrong.
Thus the driver tries to maintain a level such that the users get 80% right answers. The maximum rating is limited to 10 and the minimum to 0. The initial rating is zero unless the student specifies a different rating when starting a session.

If the student passes a lesson, a new lesson is selected and the process repeats. If the student fails, a false status is returned and the program reverts to the previous lesson and tries another alternative. If it cannot find another alternative, it skips forward a lesson. The student can terminate a session at any time by typing bye, which causes a graceful exit from learn. Hanging up is the usual novice's way out.

The lessons may form an arbitrary directed graph, although the present program imposes a limitation on cycles in that it will not present a lesson twice in the same session. If the student is unable to answer one of the exercises correctly, the driver searches for a previous lesson with a set of alternatives as successors (following the #next line). From the previous lesson with alternatives one route was taken earlier; the program simply tries a different one.

It is perfectly possible to write sophisticated scripts that evaluate the student's speed of response, or try to estimate the elegance of the answer, or provide detailed analysis of wrong answers. Lesson writing is so tedious already, however, that most of these abilities are likely to go unused.

The driver program depends heavily on features of the UNIX system that are not available on many other operating systems. These include the ease of manipulating files and directories, file redirection, the ability to use the command interpreter as just another program (even in a pipeline), command status testing and branching, the ability to catch signals like interrupts, and of course the pipeline mechanism itself. Although some parts of learn might be transferable to other systems, some generality will probably be lost.
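The speed-rating rule described earlier in this section (up one for a right answer, down four for a wrong one, held between 0 and 10) can be made concrete with a short Python sketch. It is an illustration only: the names update_speed and pick_successor are hypothetical, and the rule "choose the successor whose rating is nearest the student's speed" is an assumed reading of the text, which says only that a lesson suits students whose speed is "near" its rating. The sketch also shows why the rating settles at 80% right answers: four passes followed by one failure leave it unchanged, since 4 x (+1) + 1 x (-4) = 0.

```python
# Hypothetical sketch of the speed rating and of successor selection.

def update_speed(speed, passed):
    """Raise the rating by 1 on a pass, lower it by 4 on a failure,
    and keep the result within the stated limits of 0 and 10."""
    speed += 1 if passed else -4
    return max(0, min(10, speed))


def pick_successor(successors, speed):
    """Pick the lesson whose rating is closest to the current speed.
    Assumption: nearest rating wins; ties go to the first one listed."""
    return min(successors, key=lambda entry: abs(entry[1] - speed))[0]


# The "#next" example from the text: lesson name and speed rating.
successors = [("25.1a", 10), ("25.2a", 5), ("25.3a", 2)]

# Four lessons right and one wrong (an 80% pass rate) returns the
# rating to where it started.
speed = 5
for passed in [True, True, True, True, False]:
    speed = update_speed(speed, passed)

print(speed, pick_successor(successors, speed))
```

A student who fails more often than one lesson in five drifts toward rating 0 and so toward the slower track; one who fails less often drifts toward 10 and the fast track.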
A bit of history: The first version of learn had fewer built-in commands in the driver program, and made more use of the facilities of the UNIX system itself. For example, file comparison was done by creating a cmp process, rather than comparing the two files within learn. Lessons were not stored as text files, but as archives. There was no concept of the in-line document; even #print had to be followed by a file name. Thus the initialization for each lesson was to extract the archive into the working directory (typically 4-8 files), then #print the lesson text.

The combination of such things made learn rather slow and demanding of system resources. The new version is about 4 or 5 times faster, because fewer files and processes are created. Furthermore, it appears even faster to the user because in a typical lesson, the printing of the message comes first, and file setup with #create can be overlapped with printing, so that when the program finishes printing, it is really ready for the user to type at it.

It is also a great advantage to the script maintainer that lessons are now just ordinary text files, rather than archives. They can be edited without any difficulty, and UNIX text manipulation tools can be applied to them. The result has been that there is much less resistance to going in and fixing substandard lessons.

6. Conclusions

The following observations can be made about secretaries, typists, and other non-programmers who have used learn:
(a) A novice must have assistance with the mechanics of communicating with the computer to get through to the first lesson or two; once the first few lessons are passed people can proceed on their own.
(b) The terminology used in the first few lessons is obscure to those inexperienced with computers. It would help if there were a low level reference card for UNIX to supplement the existing programmer oriented bulky manual and bulky reference card.
(c) The concept of "substitutable argument" is hard to grasp, and requires help.
(d) They enjoy the system for the most part. Motivation matters a great deal, however.

It takes an hour or two for a novice to get through the script on file handling. The total time for a reasonably intelligent and motivated novice to proceed from ignorance to a reasonable ability to create new files and manipulate old ones seems to be a few days, with perhaps half of each day spent on the machine.

The normal way of proceeding has been to have students in the same room with someone who knows the UNIX system and the scripts. Thus the student is not brought to a halt by difficult questions. The burden on the counselor, however, is much lower than that on a teacher of a course. Ideally, the students should be encouraged to proceed with instruction immediately prior to their actual use of the computer. They should exercise the scripts on the same computer and the same kind of terminal that they will later use for their real work, and their first few jobs for the computer should be relatively easy ones. Also, both training and initial work should take place on days when the hardware and software are working reliably. Rarely is all of this possible, but the closer one comes the better the result. For example, if it is known that the hardware is shaky one day, it is better to attempt to reschedule training for another one. Students are very frustrated by machine downtime; when nothing is happening, it takes some sophistication and experience to distinguish an infinite loop, a slow but functioning program, a program waiting for the user, and a broken machine.*

One disadvantage of training with learn is that students come to depend completely on the CAI system, and do not try to read manuals or use other learning aids. This is unfortunate, not only because of the increased demands for completeness and accuracy of the scripts, but because the scripts do not cover all of the UNIX system.
New users should have manuals (appropriate for their level) and read them; the scripts ought to be altered to recommend suitable documents and urge students to read them.

There are several other difficulties which are clearly evident. From the student's viewpoint, the most serious is that lessons still crop up which simply can't be passed. Sometimes this is due to poor explanations, but just as often it is some error in the lesson itself: a botched setup, a missing file, an invalid test for correctness, or some system facility that doesn't work on the local system in the same way it did on the development system. It takes knowledge and a certain healthy arrogance on the part of the user to recognize that the fault is not his or hers, but the script writer's. Permitting the student to get on with the next lesson regardless does alleviate this somewhat, and the logging facilities make it easy to watch for lessons that no one can pass, but it is still a problem.

The biggest problem with the previous learn was speed (or lack thereof): it was often excruciatingly slow and a significant drain on the system. The current version so far does not seem to have that difficulty, although some scripts, notably eqn, are intrinsically slow. eqn, for example, must do a lot of work even to print its introductions, let alone check the student responses, but delay is perceptible in all scripts from time to time.

Another potential problem is that it is possible to break learn inadvertently, by pushing interrupt at the wrong time, or by removing critical files, or any number of similar slips. The defenses against such problems have steadily been improved, to the point where most students should not notice difficulties. Of course, it will always be possible to break learn maliciously, but this is not likely to be a problem.

One area is more fundamental: some commands are sufficiently global in their effect that learn currently does not allow them to be executed at all.
The most obvious is cd, which changes to another directory. The prospect of a student who is learning about directories inadvertently moving to some random directory and removing files has deterred us from even writing lessons on cd, but ultimately lessons on such topics probably should be added.

* We have even known an expert programmer to decide the computer was broken when he had simply left his terminal in local mode. Novices have great difficulties with such problems.

7. Acknowledgments

We are grateful to all those who have tried learn, for we have benefited greatly from their suggestions and criticisms. In particular, M. E. Bittrich, J. L. Blue, S. I. Feldman, P. A. Fox, and M. J. McAlpin have provided substantial feedback. Conversations with E. Z. Rothkopf also provided many of the ideas in the system. We are also indebted to Don Jackowski for serving as a guinea pig for the second version, and to Tom Plum for his efforts to improve the C script.

References

1. D. L. Bitzer and D. Skaperdas, "The Economics of a Large Scale Computer Based Education System: Plato IV," pp. 17-29 in Computer Assisted Instruction, Testing and Guidance, ed. Wayne Holtzman, Harper and Row, New York (1970).
2. D. C. Gray, J. P. Hulskamp, J. H. Kumm, S. Lichtenstein, and N. E. Nimmervoll, "COALA - A Minicomputer CAI System," IEEE Trans. Education E-20(1), pp. 73-77 (Feb. 1977).
3. P. Suppes, "On Using Computers to Individualize Instruction," pp. 11-24 in The Computer in American Education, ed. D. D. Bushnell and D. W. Allen, John Wiley, New York (1967).
4. B. F. Skinner, "Why We Need Teaching Machines," Harv. Educ. Review 31, pp. 377-398 (1961). Reprinted in Educational Technology, ed. J. P. DeCecco, Holt, Rinehart & Winston (New York, 1964).
5. K. Thompson and D. M. Ritchie, Unix Programmer's Manual, Bell Laboratories (1978). See section ed (I).
6. B. W. Kernighan, A tutorial introduction to the UNIX text editor, Bell Laboratories internal memorandum (1974).
7. B. W. Kernighan and D. M. Ritchie, The C Programming Language, Prentice-Hall, Englewood Cliffs, New Jersey (1978).

A Guide to the Dungeons of Doom

Michael C. Toy
Kenneth C. R. C. Arnold

Computer Systems Research Group
Department of Electrical Engineering and Computer Science
University of California
Berkeley, California 94720

1. Introduction

You have just finished your years as a student at the local fighter's guild. After much practice and sweat you have finally completed your training and are ready to embark upon a perilous adventure. As a test of your skills, the local guildmasters have sent you into the Dungeons of Doom. Your task is to return with the Amulet of Yendor. Your reward for the completion of this task will be a full membership in the local guild. In addition, you are allowed to keep all the loot you bring back from the dungeons.

In preparation for your journey, you are given an enchanted mace, a bow, and a quiver of arrows taken from a dragon's hoard in the far off Dark Mountains. You are also outfitted with elf-crafted armor and given enough food to reach the dungeons. You say goodbye to family and friends for what may be the last time and head up the road.

You set out on your way to the dungeons and after several days of uneventful travel, you see the ancient ruins that mark the entrance to the Dungeons of Doom. It is late at night, so you make camp at the entrance and spend the night sleeping under the open skies. In the morning you gather your weapons, put on your armor, eat what is almost your last food, and enter the dungeons.

2. What is going on here?

You have just begun a game of rogue. Your goal is to grab as much treasure as you can, find the Amulet of Yendor, and get out of the Dungeons of Doom alive. On the screen, a map of where you have been and what you have seen on the current dungeon level is kept. As you explore more of the level, it appears on the screen in front of you.
Rogue differs from most computer fantasy games in that it is screen oriented. Commands are all one or two keystrokes¹ and the results of your commands are displayed graphically on the screen rather than being explained in words.² Another major difference between rogue and other computer fantasy games is that once you have solved all the puzzles in a standard fantasy game, it has lost most of its excitement and it ceases to be fun. Rogue, on the other hand, generates a new dungeon every time you play it and even the author finds it an entertaining and exciting game.

¹ As opposed to pseudo English sentences.
² A minimum screen size of 24 lines by 80 columns is required. If the screen is larger, only the 24x80 section will be used for the map.

3. What do all those things on the screen mean?

In order to understand what is going on in rogue you have to first get some grasp of what rogue is doing with the screen. The rogue screen is intended to replace the “You can see ...” descriptions of standard fantasy games. Figure 1 is a sample of what a rogue screen might look like.

3.1. The bottom line

At the bottom line of the screen are a few pieces of cryptic information describing your current status. Here is an explanation of what these things mean:

Level   This number indicates how deep you have gone in the dungeon. It starts at one and goes up as you go deeper into the dungeon.

Gold    The number of gold pieces you have managed to find and keep with you so far.

Hp      Your current and maximum hit points. Hit points indicate how much damage you can take before you die. The more you get hit in a fight, the lower they get. You can regain hit points by resting. The number in parentheses is the maximum number your hit points can reach.

Str     Your current strength and maximum ever strength. This can be any integer from 3 to 31, inclusive. The higher the number, the stronger you are. The number in the parentheses is the maximum strength you have attained so far this game.

Ac      Your current armor class. This number indicates how effective your armor is in stopping blows from unfriendly creatures. The lower this number is, the more effective the armor.

Exp     These two numbers give your current experience level and experience points. As you do things, you gain experience points. At certain experience point totals, you gain an experience level. The more experienced you are, the better you are able to fight and to withstand magical attacks.

3.2. The top line

The top line of the screen is reserved for printing messages that describe things that are impossible to represent visually. If you see a “--More--” on the top line, this means that rogue wants to print another message on the screen, but it wants to make certain that you have read the one that is there first. To read the next message, just type a space.

Level: 1  Gold: 0  Hp: 12(12)  Str: 16(16)  Ac: 6  Exp: 1/0

Figure 1

3.3. The rest of the screen

The rest of the screen is the map of the level as you have explored it so far. Each symbol on the screen represents something. Here is a list of what the various symbols mean:

@     This symbol represents you, the adventurer.
-|    These symbols represent the walls of rooms.
+     A door to/from a room.
.     The floor of a room.
#     The floor of a passage between rooms.
*     A pile or pot of gold.
)     A weapon of some sort.
]     A piece of armor.
!     A flask containing a magic potion.
?     A piece of paper, usually a magic scroll.
=     A ring with magic properties.
/     A magical staff or wand.
^     A trap, watch out for these.
%     A staircase to other levels.
:     A piece of food.
A-Z   The uppercase letters represent the various inhabitants of the Dungeons of Doom. Watch out, they can be nasty and vicious.

4. Commands

Commands are given to rogue by typing one or two characters. Most commands can be preceded by a count to repeat them (e.g. typing “10s” will do ten searches).
Commands for which counts make no sense have the count ignored. To cancel a count or a prefix, type <ESCAPE>. The list of commands is rather long, but it can be read at any time during the game with the “?” command. Here it is for reference, with a short explanation of each command.

?   The help command. Asks for a character to give help on. If you type a “*”, it will list all the commands, otherwise it will explain what the character you typed does.

/   This is the “What is that on the screen?” command. A “/” followed by any character that you see on the level will tell you what that character is. For instance, typing “/@” will tell you that the “@” symbol represents you, the player.

h, H, ^H   Move left. You move one space to the left. If you use upper case “h”, you will continue to move left until you run into something. This works for all movement commands (e.g. “L” means run in direction “l”). If you use the “control” “h”, you will continue moving in the specified direction until you pass something interesting or run into a wall. You should experiment with this, since it is a very useful command, but very difficult to describe. This also works for all movement commands.

j   Move down.

k   Move up.

l   Move right.

y   Move diagonally up and left.

u   Move diagonally up and right.

b   Move diagonally down and left.

n   Move diagonally down and right.

t   Throw an object. This is a prefix command. When followed with a direction it throws an object in the specified direction (e.g. type “th” to throw something to the left).

f   Fight until someone dies. When followed with a direction this will force you to fight the creature in that direction until either you or it bites the big one.

m   Move onto something without picking it up. This will move you one space in the direction you specify and, if there is an object there you can pick up, it won’t do it.

z   Zap prefix. Point a staff or wand in a given direction and fire it.
Even non-directional staves must be pointed in some direction to be used.

^   Identify trap command. If a trap is on your map and you can’t remember what type it is, you can get rogue to remind you by getting next to it and typing “^” followed by the direction that would move you on top of it.

s   Search for traps and secret doors. Examine each space immediately adjacent to you for the existence of a trap or secret door. There is a large chance that even if there is something there, you won’t find it, so you might have to search a while before you find something.

>   Climb down a staircase to the next level. Not surprisingly, this can only be done if you are standing on a staircase.

<   Climb up a staircase to the level above. This can’t be done without the Amulet of Yendor in your possession.

.   Rest. This is the “do nothing” command. This is good for waiting and healing.

i   Inventory. List what you are carrying in your pack.

I   Selective inventory. Tells you what a single item in your pack is.

q   Quaff one of the potions you are carrying.

r   Read one of the scrolls in your pack.

e   Eat food from your pack.

w   Wield a weapon. Take a weapon out of your pack and carry it for use in combat, replacing the one you are currently using (if any).

W   Wear armor. You can only wear one suit of armor at a time. This takes extra time.

T   Take armor off. You can’t remove armor that is cursed. This takes extra time.

P   Put on a ring. You can wear only two rings at a time (one on each hand). If you aren’t wearing any rings, this command will ask you which hand you want to wear it on, otherwise, it will place it on the unused hand. The program assumes that you wield your sword in your right hand.

R   Remove a ring. If you are only wearing one ring, this command takes it off. If you are wearing two, it will ask you which one you wish to remove.

d   Drop an object. Take something out of your pack and leave it lying on the floor. Only one object can occupy each space. You cannot drop a cursed object at all if you are wielding or wearing it.

c   Call an object something. If you have a type of object in your pack which you wish to remember something about, you can use the call command to give a name to that type of object. This is usually used when you figure out what a potion, scroll, ring, or staff is after you pick it up, or when you want to remember which of those swords in your pack you were wielding.

D   Print out which things you’ve discovered something about. This command will ask you what type of thing you are interested in. If you type the character for a given type of object (e.g. “!” for potion) it will tell you which kinds of that type of object you’ve discovered (i.e., figured out what they are). This command works for potions, scrolls, rings, and staves and wands.

o   Examine and set options. This command is further explained in the section on options.

^R  Redraws the screen. Useful if spurious messages or transmission errors have messed up the display.

^P  Print last message. Useful when a message disappears before you can read it. This only repeats the last message that was not a mistyped command so that you don’t lose anything by accidentally typing the wrong character instead of ^P.

<ESCAPE>   Cancel a command, prefix, or count.

!   Escape to a shell for some commands.

Q   Quit. Leave the game.

S   Save the current game in a file. It will ask you whether you wish to use the default save file. Caveat: Rogue won’t let you start up a copy of a saved game, and it removes the save file as soon as you start up a restored game. This is to prevent people from saving a game just before a dangerous position and then restarting it if they die. To restore a saved game, give the file name as an argument to rogue. As in

    % rogue save_file

To restart from the default save file (see below), run

    % rogue -r

v   Prints the program version number.
)   Print the weapon you are currently wielding.

]   Print the armor you are currently wearing.

=   Print the rings you are currently wearing.

@   Reprint the status line on the message line.

5. Rooms

Rooms in the dungeons are either lit or dark. If you walk into a lit room, the entire room will be drawn on the screen as soon as you enter. If you walk into a dark room, it will only be displayed as you explore it. Upon leaving a room, all monsters inside the room are erased from the screen. In the darkness you can only see one space in all directions around you. A corridor is always dark.

6. Fighting

If you see a monster and you wish to fight it, just attempt to run into it. Many times a monster you find will mind its own business unless you attack it. It is often the case that discretion is the better part of valor.

7. Objects you can find

When you find something in the dungeon, it is common to want to pick the object up. This is accomplished in rogue by walking over the object (unless you use the “m” prefix, see above). If you are carrying too many things, the program will tell you and it won’t pick up the object, otherwise it will add it to your pack and tell you what you just picked up.

Many of the commands that operate on objects must prompt you to find out which object you want to use. If you change your mind and don’t want to do that command after all, just type an <ESCAPE> and the command will be aborted.

Some objects, like armor and weapons, are easily differentiated. Others, like scrolls and potions, are given labels which vary according to type. During a game, any two of the same kind of object with the same label are the same type. However, the labels will vary from game to game.

When you use one of these labeled objects, if its effect is obvious, rogue will remember what it is for you.
If its effect isn’t extremely obvious you will be asked what you want to scribble on it so you will recognize it later, or you can use the “call” command (see above).

7.1. Weapons

Some weapons, like arrows, come in bunches, but most come one at a time. In order to use a weapon, you must wield it. To fire an arrow out of a bow, you must first wield the bow, then throw the arrow. You can only wield one weapon at a time, but you can’t change weapons if the one you are currently wielding is cursed. The commands to use weapons are “w” (wield) and “t” (throw).

7.2. Armor

There are various sorts of armor lying around in the dungeon. Some of it is enchanted, some is cursed, and some is just normal. Different armor types have different armor classes. The lower the armor class, the more protection the armor affords against the blows of monsters. Here is a list of the various armor types and their normal armor class:

Type                          Class
None                          10
Leather armor                 8
Studded leather / Ring mail   7
Scale mail                    6
Chain mail                    5
Banded mail / Splint mail     4
Plate mail                    3

If a piece of armor is enchanted, its armor class will be lower than normal. If a suit of armor is cursed, its armor class will be higher, and you will not be able to remove it. However, not all armor with a class that is higher than normal is cursed. The commands to use armor are “W” (wear) and “T” (take off).

7.3. Scrolls

Scrolls come with titles in an unknown tongue³. After you read a scroll, it disappears from your pack. The command to use a scroll is “r” (read).

³ Actually, it’s a dialect spoken only by the twenty-seven members of a tribe in Outer Mongolia, but you’re not supposed to know that.

7.4. Potions

Potions are labeled by the color of the liquid inside the flask. They disappear after being quaffed. The command to use a potion is “q” (quaff).

7.5. Staves and Wands

Staves and wands do the same kinds of things.
Staves are identified by a type of wood; wands by a type of metal or bone. They are generally things you want to do to something over a long distance, so you must point them at what you wish to affect to use them. Some staves are not affected by the direction they are pointed, though. Staves come with multiple magic charges, the number being random, and when they are used up, the staff is just a piece of wood or metal. The command to use a wand or staff is “z” (zap).

7.6. Rings

Rings are very useful items, since they are relatively permanent magic, unlike the usually fleeting effects of potions, scrolls, and staves. Of course, the bad rings are also more powerful. Most rings also cause you to use up food more rapidly, the rate varying with the type of ring. Rings are differentiated by their stone settings. The commands to use rings are “P” (put on) and “R” (remove).

7.7. Food

Food is necessary to keep you going. If you go too long without eating you will faint, and eventually die of starvation. The command to use food is “e” (eat).

8. Options

Due to variations in personal tastes and conceptions of the way rogue should do things, there is a set of options you can set that cause rogue to behave in various different ways.

8.1. Setting the options

There are two ways to set the options. The first is with the “o” command of rogue; the second is with the “ROGUEOPTS” environment variable⁴.

⁴ On Version 6 systems, there is no equivalent of the ROGUEOPTS feature.

8.1.1. Using the ‘o’ command

When you type “o” in rogue, it clears the screen and displays the current settings for all the options. It then places the cursor by the value of the first option and waits for you to type. You can type a <RETURN> which means to go to the next option, a “-” which means to go to the previous option, an <ESCAPE> which means to return to the game, or you can give the option a value. For boolean options this merely involves typing “t” for true or “f” for false. For string options, type the new value followed by a <RETURN>.

8.1.2. Using the ROGUEOPTS variable

The ROGUEOPTS variable is a string containing a comma separated list of initial values for the various options. Boolean variables can be turned on by listing their name or turned off by putting a “no” in front of the name. Thus to set up an environment variable so that jump is on, terse is off, and the name is set to “Blue Meanie”, use the command⁵

    % setenv ROGUEOPTS "jump,noterse,name=Blue Meanie"

⁵ For those of you who use the bourne shell, the commands would be

    $ ROGUEOPTS="jump,noterse,name=Blue Meanie"
    $ export ROGUEOPTS

8.2. Option list

Here is a list of the options and an explanation of what each one is for. The default value for each is enclosed in square brackets. For character string options, input over fifty characters will be ignored.

terse [noterse]
Useful for those who are tired of the sometimes lengthy messages of rogue. This is a useful option for playing on slow terminals, so this option defaults to terse if you are on a slow (1200 baud or under) terminal.

jump [nojump]
If this option is set, running moves will not be displayed until you reach the end of the move. This saves considerable cpu and display time. This option defaults to jump if you are using a slow terminal.

flush [noflush]
All typeahead is thrown away after each round of battle. This is useful for those who type far ahead and then watch in dismay as a Bat kills them.

seefloor [seefloor]
Display the floor around you on the screen as you move through dark rooms. Due to the number of characters generated, this option defaults to noseefloor if you are using a slow terminal.

passgo [nopassgo]
Follow turnings in passageways. If you run in a passage and you run into stone or a wall, rogue will see if it can turn to the right or left. If it can only turn one way, it will turn that way. If it can turn either way or neither, it will stop.
This is followed strictly, which can sometimes lead to slightly confusing occurrences (which is why it defaults to nopassgo).

tombstone [tombstone]
Print out the tombstone at the end if you get killed. This is nice but slow, so you can turn it off if you like.

inven [overwrite]
Inventory type. This can have one of three values: overwrite, slow, or clear. With overwrite the top lines of the map are overwritten with the list when inventory is requested or when “Which item do you wish to ...?” questions are answered with a “*”. However, if the list is longer than a screenful, the screen is cleared. With slow, lists are displayed one item at a time on the top of the screen, and with clear, the screen is cleared, the list is displayed, and then the dungeon level is re-displayed. Due to speed considerations, clear is the default for terminals without clear-to-end-of-line capabilities.

name [account name]
This is the name of your character. It is used if you get on the top ten scorer’s list.

fruit [slime-mold]
This should hold the name of a fruit that you enjoy eating. It is basically a whimsy that rogue uses in a couple of places.

file [~/rogue.save]
The default file name for saving the game. If your phone is hung up by accident, rogue will automatically save the game in this file. The file name may start with the special character “~” which expands to be your home directory.

9. Scoring

Rogue usually maintains a list of the top scoring people or scores on your machine. Depending on how it is set up, it can post either the top scores or the top players. In the latter case, each account on the machine can post only one non-winning score on this list. If you score higher than someone else on this list, or better your previous score on the list, you will be inserted in the proper place under your current name. How many scores are kept can also be set up by whoever installs it on your machine.
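The posting rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not rogue's actual score-file code: the names Entry and post_score, the won flag used to mark a winning game, and the parameters limit and one_per_account are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One line of the score list (hypothetical layout)."""
    account: str        # account the score is posted under
    score: int
    won: bool = False   # assumed marker for a winning game


def post_score(board, new, limit=10, one_per_account=True):
    """Return a new score list with `new` posted if it qualifies.

    one_per_account=True models the "top players" setup, in which an
    account may hold only one non-winning score; False models the
    "top scores" setup. `limit` is how many scores the installer keeps.
    """
    entries = list(board)
    if one_per_account and not new.won:
        mine = [e for e in entries
                if e.account == new.account and not e.won]
        if mine:
            if max(e.score for e in mine) >= new.score:
                return entries              # previous score stands
            # The new score betters the old one, so the old one goes.
            entries = [e for e in entries if e not in mine]
    entries.append(new)
    entries.sort(key=lambda e: e.score, reverse=True)
    return entries[:limit]
```

For example, posting Entry("alice", 300) to a list that already holds a non-winning 500 for alice leaves the list unchanged, while posting Entry("alice", 900) replaces her old entry and moves her up the list.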
If you quit the game, you get out with all of your gold intact. If, however, you get killed in the Dungeons of Doom, your body is forwarded to your next-of-kin, along with 90% of your gold; ten percent of your gold is kept by the Dungeons’ wizard as a fee⁶. This should make you consider whether you want to take one last hit at that monster and possibly live, or quit and thus stop with whatever you have. If you quit, you do get all your gold, but if you swing and live, you might find more.

If you just want to see what the current top players/games list is, you can type

    % rogue -s

⁶ The Dungeon’s wizard is named Wally the Wonder Badger. Invocations should be accompanied by a sizable donative.

10. Acknowledgements

Rogue was originally conceived of by Glenn Wichman and Michael Toy. Ken Arnold and Michael Toy then smoothed out the user interface, and added jillions of new features. We would like to thank Bob Arnold, Michelle Busch, Andy Hatcher, Kipp Hickman, Mark Horton, Daniel Jensen, Bill Joy, Joe Kalash, Steve Maurer, Marty McNary, Jan Miller, and Scott Nelson for their ideas and assistance; and also the teeming multitudes who graciously ignored work, school, and social life to play rogue and send us bugs, complaints, suggestions, and just plain flames. And also Mom.

Berkeley Font Catalogue

Introduction

This catalog gives samples of the various fonts available at Berkeley using vtroff on our Versatec and Varian. We have them working 4 pages across in a 36 inch Versatec, and rotated 90 degrees on a Benson-Varian 11 inch plotter. The same software should be adaptable to an 11 inch Versatec, and in fact is running at several other sites; however, not having one here, it isn't part of this distribution. Such a driver is available from Tom Ferrin at UCSF.

To use these fonts:

(1) Hershey. This is the default font. The Hershey font is currently the only complete font, with all 16 point sizes and all the special characters troff knows about.
To get it, use vtroff directly. To illustrate this with the -ms macro package:

    vtroff -ms paper.nr

(2) Fonts with roman, italic, and bold, such as nonie. You can load all three fonts with, for example:

    vtroff -F nonie -ms paper.nr

To get just one of these fonts, use (3) below, appending .r, .i, or .b to the font name to specify which font you want mounted, e.g., to get italics in delegate,

    vtroff -2 delegate.i -ms paper.nr

(3) To get a font without a complete set, choose which font (1, 2, or 3) you want replaced by the chosen font. For example, to use bocklin as though it were bold, since font 3 is bold, use:

    vtroff -3 bocklin -ms paper.nr

To switch between fonts in troff, use .ft 3 to switch to font 3, for example, or use \f3word\f1 to switch within a line. For more information see the Nroff/Troff User's Manual.

Special note: troff thinks it is talking to a CAT phototypesetter. Thus, it does all sorts of strange things, such as enforcing restrictions like 7.54 inches maximum width, 4 fonts, a certain 16 point sizes, proportional spacing by point size, etc. In particular, the following glyphs will always be taken from the special font, no matter what font you are using at the time:

    @ # " ' ` < > \ { } ~ ^ and _

This may explain what are otherwise surprising results in some of the subsequent pages.

In addition, the following Greek letters have been decreed by troff as looking so much like their Roman counterparts that the Roman version (font 1) is always printed, no matter what font is mounted on font 1 at the time:

    A, B, E, Z, H, I, K, M, N, O, P, T, X.

(See table II in the back of the Nroff/Troff User's Manual for details about what glyphs are in each font and how to generate the special glyphs.)

Font Layout Positions
[Table: for each octal character code from 000 to 177, the glyph or troff escape sequence at that position in a normal font and in the special font.]

APL FONT, 10 POINT ONLY

[Specimen: the APL character set.]

Baskerville font, roman, bold, italic, 12 point only (Called “basker” on line.)
ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789

[Specimen: punctuation and special characters.]

If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest prodigality; since, as he elsewhere tells us, lost time is never found again; and what we call time enough, always proves little enough: Let us then up and be doing, and doing to the purpose; so by diligence shall we do more with less perplexity.

[The same specimen is repeated in bold and in italic.]

Bocklin font, 14 and 28 point only.

14 point

[Specimen: alphabets, digits, punctuation, and the Poor Richard paragraph.]

28 point (No punctuation except period.)
[Specimen: alphabets, digits, and the Poor Richard paragraph set without punctuation.]

Bodoni font, roman, bold, italic, 10 point only.

[Specimen in roman, bold, and italic: alphabets, digits, punctuation, and the Poor Richard paragraph.]

Chess, 18 point only

Note: Our attempt at compatibility with Stanford was only 99% successful.
If you use a blank space to indicate an empty white square it will come out narrow due to the stupidity of troff. Either include the line

    .cs ch 38

to put yourself in constant spacing mode or else use zero instead of space. You should also set the vertical spacing to 18 points.

[Specimen: a chess position set in the chess font, captioned “White mates in three moves.”]

Clarendon, 14 and 18 point roman only. From SAIL (Paul Martin & Andy Moorer)

[Specimen at 14 and 18 point: alphabets, digits, punctuation, and the Poor Richard paragraph.]

Computer Modern fonts, roman, italic, and bold. (by Don Knuth) 6, 7, 8, 9, 10, 11, 12 point. (Available as cmm)

Note that the cm fonts are intended for TEX and don't fare so well with troff.
The spacing is not proportional by point size, and hence only one point size can be tuned to be nicely spaced. We have tuned the 10 point size, but the 8 point looks somewhat cramped. Some of the punctuation is missing in some of the fonts. Knuth also uses a nonstandard notion of ASCII, and hence some glyphs are available only with special symbols such as \(12. Others cannot be accessed at all. Knuth's fonts are somewhat larger than normal, since he intends the output to be reduced before printing. Since troff has a limitation of 7.54 inches width on output, this is not practical. Hence, the original fonts have been relabelled with the point size they are closest to without reduction. Some fonts (6 point bold, 7 point roman, 8 point italic and bold, 9 point bold, and 11 point italic) which would have otherwise been missing were generated by shrinking the next larger point size of the same style. (This goes against the idea of metafont, but we use the tools we have.)

10 Point Roman

[Specimen: alphabets, digits, punctuation, accents and special symbols, and the Poor Richard paragraph.]

10 Point Italic

[Specimen: alphabets, digits, punctuation, accents and special symbols, and the Poor Richard paragraph.]
10 Point Bold ABCDE FGHIJ KLMNO PQRST UVWXYZ sbede fghij kimno pqrst uvwxys 01234 58789!”#&%&'()3*-?—3”'5;““_\fi@‘;-{-’/?.),<H',E,-,~,§,T,Q,ng 9Ty T Ay O, A, 8, N1,y "y gy If time be of all things the most preclous, wasting time must be, ss Poor Richard 5aY8, the greatest prodigality; since, a3 he elsewhere tells us, lost time is never found againg and what we call time enough, always proves little enoughs Let us then up and be doing, and doing to the purpose; so by diligence shall we do more with less perplexity. 8 Putot Roman, Bold, and Balde Y Podot Reoman, Bold, and falia 8 Point Roman,Beld, and Jtalie 8 Point Roman, Bold, and Italse, 10 Point Roman,Bold,and Italte 11 Point Roman,Bold,and Italse. 12 Point Roman,Bold,and Italic. Berkeley Font Catalogue 6-35 Countdown (22 point, upper case letters only.) From SAIL (Paul Martin) BECOE F3HU ALTIND PORST LULLNLS GOUATIOWLN HAS 0 NTEGERS TOD GOLAT QULLM WUITH BUT 1T GOMIPENSATES BY BEING L1V AND RLEGEELE Cyrillic, 12 point only X123 alge ¢rzu xauuo aper ysha $ rume Ge od AN TIUEIC TXe MOCT NPeHOYC ACTHHI THMe MycT Oe ac ocop uxapsg cafic TXe rpeaTec? OPORMraiuTih CHHe aC Xe efcexepe TeMIC yC JOCT THME WC Hesep OYHNA araum aHX XaT € al THME eHOYrX anafic opoBeC AMTTAE eHOYTX T yC TXeR yu aHX Ge ROWHr aRX XOHMHr TO TXe nypnoce co 6f guaurese cxann ¢ J0 Mope HTX Ject nepniernTh W-X X-U Y-+ Z-»3 a—a b-6 d-»z e~e f>¢ g->r h-x i»u k-x |»a m-« n—a o0 p-a Ir-p S=¢ L-7 U~y V-8 y-ft Z-8 Delegate, roman, italic, and bold, 12 point only ABCDE FGHIJ KLMNO PQRST UVWXYZ abcde fghij klmno pqrst uvwxyz 01234 56789 1" #5538 Osa-a[]{]~~_N\N1@t;+/7.>,X< £ time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest prodigality; since, as he elsewhere tells us, lost time is never found again; and what we call time enough, always proves little enough: Let us then up and be doing, and doing to the purpose; so by diligence shall we do more with less perplexity. 
[Two further Delegate specimens, each showing the character set and the Poor Richard sample paragraph]

Fix fixed width font, 6, 8, 10, 12, 14 point

6 point
[Specimen: character set and the Poor Richard sample paragraph]

8 point
[Specimen: character set and the Poor Richard sample paragraph]

10 point
[Specimen: character set and the Poor Richard sample paragraph]

12 point
[Specimen: character set and the Poor Richard sample paragraph]

14 point
[Specimen: character set and the Poor Richard sample paragraph]

Gacham, roman, bold, italic, 10 point only

The gacham font is almost indistinguishable from the fix font. In fact, it has been pointed out that our gacham roman and bold fonts really are fix. Sigh. They are included anyway for convenience.

[Three specimens, each showing the character set and the Poor Richard sample paragraph]

Greek, 10 point only

This font provides an alternative to the Greek characters on the standard special font.
[Specimen: the Latin character set, the Greek glyphs it selects, and the Poor Richard sample paragraph set in Greek letters]

The h19 font includes a subset of the h19's graphic character set, plus a few logical extensions to allow forms and diagrams to be drawn. The characters are the same as the h19's graphic interpretation set.

[Specimen: table of the letters and the graphic characters they select]

The characters are designed to overlap. Example of usage for diagrams:

[Diagram specimen: boxes drawn with the font, labelling the parts of a microcomputer design module (16-bit CPU, RAM, monitor ROM, terminal, 16-bit timers, parallel ports)]

Hebrew, 16, 24, and 36 point only

16 point
[Specimen: character set and sample text]

24 point
[Specimen: sample text]

36 point (rather ragged)
[Specimen: sample text]

10 point Hershey

[Three specimens (roman, italic, bold), each showing the character set, a table of troff special characters such as \(em, \(bu, \(sq, \(ru, \(14, \(12, \(34, \(fi, \(fl, \(ff, \(Fi, \(Fl, \(de, \(dg, \(fm, \(ct, \(rg, and \(co with the glyphs they produce, and the sentence "When you flex your fingers in a coffin, it can baffle a giraffe."]

From specialfont:

[Specimen: the ASCII characters of the special font and a table of its special characters (\(pl, \(mi, \(eq, \(**, \(sc, \(aa, \(ga, \(ul, \(sl, the Greek letters \(*a through \(*W, and the mathematical and bracket-building characters) with the glyphs they produce]

If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest prodigality; since, as he elsewhere tells us, lost time is never found again; and what we call time enough, always proves little enough: Let us then up and be doing, and doing to the purpose; so by diligence shall we do more with less perplexity.

This is an example of a sample in various fonts.

Hershey font. This is the default font for vtroff.
Roman, Italic and Bold in 6, 7, 8, 9, 10, 11, 12, 14, 16, 18, 20, 22, 24, 28, and 36 point. The following examples are 10 point.

[Three specimens (roman, italic, bold), each showing the Poor Richard sample paragraph]

[Size samples: the line "point Roman, Bold, and Italic." set at increasing point sizes; the largest sizes run off the page]

Meteor, roman, bold, italic, 8, 10, 12 point, no 12 point italic.
[Three specimens, each showing the character set and the Poor Richard sample paragraph]

Microgramma font, 10 point only

[Specimen: character set and the Poor Richard sample paragraph]

Mona font, 24 point only

[Specimen: character set and the sentence "Philadelphia is the most pecksniffian of American cities, and thus probably leads the world. - H. L. Mencken"]

Nonie, roman, bold, italic, 8, 10, 12 point

8 point
[Three specimens, each showing the character set and the Poor Richard sample paragraph]

10 point
[Three specimens, each showing the character set and the Poor Richard sample paragraph]

12 point
[Three specimens, each showing the character set and the Poor Richard sample paragraph]

Old English, 8, 14, and 18 point only.
(This font is called "oldenglish" on line.)

[Specimen, 8 point: character set and the Poor Richard sample paragraph]

14 point
[Specimen: character set and the Poor Richard sample paragraph]
18 point
[Specimen: character set and the following text]

If time be of all things the most precious, wasting time must be, as Poor Richard says, the greatest prodigality since, as he elsewhere tells us, lost time is never found again and what we call time enough, always proves little enough and I think I'm wasting time typing all this stuff

PIP FONT, 16 POINT ONLY, NO LOWER CASE

[Specimen: upper-case character set and the following text]

IT COULD PROBABLY BE SHOWN BY FACTS AND FIGURES THAT THERE IS NO DISTINCTLY NATIVE AMERICAN CRIMINAL CLASS EXCEPT CONGRESS. - MARK TWAIN

Playbill font, 18 point only

[Specimen: character set and the Poor Richard sample paragraph]

Script, 18 point only.

This font appears to be almost identical to the "Coronet" font from SAIL, except that the period and one other glyph of Coronet are missing a row, and Coronet is supposed to be 16 point. (They are both really the same size.)

[Specimen: character set and the Poor Richard sample paragraph]

SHADOW, 16 POINT ONLY, NO LOWER CASE

[Specimen: upper-case character set and a sample sentence]

SIGN, 22 POINT ONLY

[Specimen: upper-case character set, digits, the glyphs selected by lowercase l and lowercase r, and the following text]

THIS FONT WAS INVENTED BY A DRAFTSMAN WHO HAD LOST HIS FRENCH CURVE. SO IT GOES.

Stare hershey font.

This font is identical to the hershey font except that the point sizes are one point smaller, and the width tables are those used for the real typesetter. Hence, this font is useful when previewing documents that are to be sent to a typesetter to make sure the spacing, paging, and so on are right. There are Roman, Italic and Bold in 8, 9, 10, 11, 12, 14, and 16 point. The following examples are 10 point.

[Three specimens (roman, italic, bold), each showing the character set and the Poor Richard sample paragraph]

[Size samples: the line "point Roman, Bold, and Italic." set at 8, 9, 10, 11, 12, 14, and 16 point]

Times fonts, roman, italic, and bold 10 point only.

These fonts showed up in a directory labelled "timesroman" along with three other fonts which turned out to be nonie, meteor, and news gothic. They are probably not really times fonts, but seem to be pretty close. Notice the top of the "2" for a clear difference from a real Times Roman font.

It is our desire to have a real, digitized version of the times fonts from the phototypesetter. We eventually plan to do this. At that point, the times font will probably replace the hershey font as the default. Such a Times font is already available from Johns Hopkins University for a fee, but we couldn't redistribute it, so we plan to digitize them ourselves.

10 Point
[Three specimens (roman, italic, bold), each showing the character set and the special characters]

UNIX Assembler Reference Manual

Dennis M.
Ritchie
Bell Laboratories
Murray Hill, New Jersey 07974

0. Introduction

This document describes the usage and input syntax of the UNIX PDP-11 assembler as. The details of the PDP-11 are not described.

The input syntax of the UNIX assembler is generally similar to that of the DEC assembler PAL-11R, although its internal workings and output format are unrelated. It may be useful to read the publication DEC-11-ASDB-D, which describes PAL-11R, although naturally one must use care in assuming that its rules apply to as.

As is a rather ordinary assembler without macro capabilities. It produces an output file that contains relocation information and a complete symbol table; thus the output is acceptable to the UNIX link-editor ld, which may be used to combine the outputs of several assembler runs and to obtain object programs from libraries. The output format has been designed so that if a program contains no unresolved references to external symbols, it is executable without further processing.

1. Usage

as is used as follows:

as [ -u ] [ -o output ] file ...

If the optional "-u" argument is given, all undefined symbols in the current assembly will be made undefined-external. See the .globl directive below.

The other arguments name files which are concatenated and assembled. Thus programs may be written in several pieces and assembled together.

The output of the assembler is by default placed on the file a.out in the current directory; the "-o" flag causes the output to be placed on the named file. If there were no unresolved external references, and no errors detected, the output file is marked executable; otherwise, if it is produced at all, it is made non-executable.

2. Lexical conventions

Assembler tokens include identifiers (alternatively, "symbols" or "names"), temporary symbols, constants, and operators.
2.1 Identifiers

An identifier consists of a sequence of alphanumeric characters (including period ".", underscore "_", and tilde "~" as alphanumeric) of which the first may not be numeric. Only the first eight characters are significant. When a name begins with a tilde, the tilde is discarded and that occurrence of the identifier generates a unique entry in the symbol table which can match no other occurrence of the identifier. This feature is used by the C compiler to place names of local variables in the output symbol table without having to worry about making them unique.

2.2 Temporary symbols

A temporary symbol consists of a digit followed by "f" or "b". Temporary symbols are discussed fully in §5.1.

2.3 Constants

An octal constant consists of a sequence of digits; "8" and "9" are taken to have octal value 10 and 11. The constant is truncated to 16 bits and interpreted in two's complement notation.

A decimal constant consists of a sequence of digits terminated by a decimal point ".". The magnitude of the constant should be representable in 15 bits; i.e., be less than 32,768.

A single-character constant consists of a single quote "'" followed by an ASCII character not a new-line. Certain dual-character escape sequences are acceptable in place of the ASCII character to represent new-line and other non-graphics (see String statements, §5.5). The constant's value has the code for the given character in the least significant byte of the word and is null-padded on the left.

A double-character constant consists of a double quote (") followed by a pair of ASCII characters not including new-line. Certain dual-character escape sequences are acceptable in place of either of the ASCII characters to represent new-line and other non-graphics (see String statements, §5.5).
The constant's value has the code for the first given character in the least significant byte and that for the second character in the most significant byte.

2.4 Operators

There are several single- and double-character operators; see §6.

2.5 Blanks

Blank and tab characters may be interspersed freely between tokens, but may not be used within tokens (except character constants). A blank or tab is required to separate adjacent identifiers or constants not otherwise separated.

2.6 Comments

The character "/" introduces a comment, which extends through the end of the line on which it appears. Comments are ignored by the assembler.

3. Segments

Assembled code and data fall into three segments: the text segment, the data segment, and the bss segment. The text segment is the one in which the assembler begins, and it is the one into which instructions are typically placed. The UNIX system will, if desired, enforce the purity of the text segment of programs by trapping write operations into it. Object programs produced by the assembler must be processed by the link-editor ld (using its "-n" flag) if the text segment is to be write-protected. A single copy of the text segment is shared among all processes executing such a program.

The data segment is available for placing data or instructions which will be modified during execution. Anything which may go in the text segment may be put into the data segment. In programs with write-protected, sharable text segments, the data segment contains the initialized but variable parts of a program. If the text segment is not pure, the data segment begins immediately after the text segment; if the text segment is pure, the data segment begins at the lowest 8K byte boundary after the text segment.

The bss segment may not contain any explicitly initialized code or data. The length of the
The length of the UNIX Assembler Reference Manual 6-55 bss segment (like that of text or data) is determined by the high-water mark of the location counter within it. The bss segment is actually an extension of the data segment and begins immediately after it. At the start of execution of a program, the bss segment is set to 0. Typically the bss segment is set up by statements exemplified by lab: . = ,+10 The advantage in using the bss segment for storage that starts off empty is that the initialization information need not be stored in the output file. See also Location counter and Assignment statements below. 4. The location counter One special symbol, ‘“.”, is the location counter. Its value at any time is the offset within the appropriate segment of the start of the statement in which it appears. The location counter may be assigned to, with the restriction that the current segment may not change; furthermore, the value of *“.”’ may not decrease. If the effect of the assignment is to increase the value of ‘“.”’, the required number of null bytes are generated (but see Segments above). 9 5. Statements A source program is composed of a sequence of sratements. Statements are separated either by new-lines or by semicolons. There are five kinds of statements: null statements, expression statements, assignment statements, string statements, and keyword statements. Any kind of statement may be preceded by one or more labels. 5.1 Labels There are two kinds of label: name labels and numeric labels. A name label consists of a name followed by a colon (:). The effect of a name label is to assign the current value and type of the location counter **.’’ to the name. An error is indicated in pass | if the name is already defined; an error is indicated in pass 2 if the *“.”’ value assigned changes the definition of the label. A numeric label consists of a digit 0 to 9 followed by a colon (:). 
Such a label serves to define temporary symbols of the form "nb" and "nf", where n is the digit of the label. As in the case of name labels, a numeric label assigns the current value and type of "." to the temporary symbol. However, several numeric labels with the same digit may be used within the same assembly. References of the form "nf" refer to the first numeric label "n:" forward from the reference; "nb" symbols refer to the first "n:" label backward from the reference. This sort of temporary label was introduced by Knuth [The Art of Computer Programming, Vol I: Fundamental Algorithms]. Such labels tend to conserve both the symbol table space of the assembler and the inventive powers of the programmer.

5.2 Null statements

A null statement is an empty statement (which may, however, have labels). A null statement is ignored by the assembler. Common examples of null statements are empty lines or lines containing only a label.

5.3 Expression statements

An expression statement consists of an arithmetic expression not beginning with a keyword. The assembler computes its (16-bit) value and places it in the output stream, together with the appropriate relocation bits.

5.4 Assignment statements

An assignment statement consists of an identifier, an equals sign (=), and an expression. The value and type of the expression are assigned to the identifier. It is not required that the type or value be the same in pass 2 as in pass 1, nor is it an error to redefine any symbol by assignment.

Any external attribute of the expression is lost across an assignment. This means that it is not possible to declare a global symbol by assigning to it, and that it is impossible to define a symbol to be offset from a non-locally defined global symbol.

As mentioned, it is permissible to assign to the location counter ".".
It is required, however, that the type of the expression assigned be of the same type as ".", and it is forbidden to decrease the value of ".". In practice, the most common assignment to "." has the form ". = .+n" for some number n; this has the effect of generating n null bytes.

5.5 String statements

A string statement generates a sequence of bytes containing ASCII characters. A string statement consists of a left string quote "<" followed by a sequence of ASCII characters not including newline, followed by a right string quote ">". Any of the ASCII characters may be replaced by a two-character escape sequence to represent certain non-graphic characters, as follows:

    \n    NL     (012)
    \s    SP     (040)
    \t    HT     (011)
    \e    EOT    (004)
    \0    NUL    (000)
    \r    CR     (015)
    \a    ACK    (006)
    \p    PFX    (033)
    \\    \
    \>    >

The last two are included so that the escape character and the right string quote may be represented. The same escape sequences may also be used within single- and double-character constants (see §2.3 above).

5.6 Keyword statements

Keyword statements are numerically the most common type, since most machine instructions are of this sort. A keyword statement begins with one of the many predefined keywords of the assembler; the syntax of the remainder depends on the keyword. All the keywords are listed below with the syntax they require.

6. Expressions

An expression is a sequence of symbols representing a value. Its constituents are identifiers, constants, temporary symbols, operators, and brackets. Each expression has a type.

All operators in expressions are fundamentally binary in nature; if an operand is missing on the left, a 0 of absolute type is assumed. Arithmetic is two’s complement and has 16 bits of precision. All operators have equal precedence, and expressions are evaluated strictly left to right except for the effect of brackets.
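The evaluation rule just stated (equal precedence, strict left-to-right order, 16-bit two's complement, and an implied 0 for a missing left operand) can be sketched in Python. This is only an illustration under stated assumptions: the function name `eval_ltr` and the flat token-list input are hypothetical, only three operators are modelled, and brackets are left out.

```python
def eval_ltr(tokens):
    """Evaluate [operand, op, operand, ...] strictly left to right, modulo 2**16.

    Hypothetical helper for illustration; it is not part of the assembler.
    """
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    tokens = list(tokens)
    # A missing left operand is taken to be an absolute 0, so "-1" means 0 - 1.
    if tokens and tokens[0] in ops:
        tokens.insert(0, 0)
    value = tokens[0] & 0xFFFF
    for i in range(1, len(tokens), 2):
        op, operand = tokens[i], tokens[i + 1]
        # Every intermediate result is truncated to 16 bits.
        value = ops[op](value, operand) & 0xFFFF
    return value


print(eval_ltr([2, "+", 3, "*", 4]))   # 20, because "+" is applied before "*"
print(oct(eval_ltr(["-", 1])))         # 0o177777, i.e. -1 in 16-bit two's complement
```

Note that a conventional precedence-aware evaluator would give 14 for the first expression; under the assembler's rule, square brackets would be needed to get that result.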
6.1 Expression operators

The operators are:

    (blank)   when there is no operator between operands, the effect is exactly the same as if a "+" had appeared.
    +         addition
    -         subtraction
    *         multiplication
    \/        division (note that plain "/" starts a comment)
    &         bitwise and
    |         bitwise or
    \>        logical right shift
    \<        logical left shift
    %         modulo
    !         a!b is a or (not b); i.e., the or of the first operand and the one’s complement of the second; most common use is as a unary.
    ^         result has the value of first operand and the type of the second; most often used to define new machine instructions with syntax identical to existing instructions.

Expressions may be grouped by use of square brackets "[ ]". (Round parentheses are reserved for address modes.)

6.2 Types

The assembler deals with a number of types of expressions. Most types are attached to keywords and used to select the routine which treats that keyword. The types likely to be met explicitly are:

undefined
Upon first encounter, each symbol is undefined. It may become undefined if it is assigned an undefined expression. It is an error to attempt to assemble an undefined expression in pass 2; in pass 1, it is not (except that certain keywords require operands which are not undefined).

undefined external
A symbol which is declared .globl but not defined in the current assembly is an undefined external. If such a symbol is declared, the link editor ld must be used to load the assembler’s output with another routine that defines the undefined reference.

absolute
An absolute symbol is defined ultimately from a constant. Its value is unaffected by any possible future applications of the link-editor to the output file.

text
The value of a text symbol is measured with respect to the beginning of the text segment of the program. If the assembler output is link-edited, its text symbols may change in value since the program need not be the first in the link editor’s output.
Most text symbols are defined by appearing as labels. At the start of an assembly, the value of "." is text 0.

data
The value of a data symbol is measured with respect to the origin of the data segment of a program. Like text symbols, the value of a data symbol may change during a subsequent link-editor run since previously loaded programs may have data segments. After the first .data statement, the value of "." is data 0.

bss
The value of a bss symbol is measured from the beginning of the bss segment of a program. Like text and data symbols, the value of a bss symbol may change during a subsequent link-editor run, since previously loaded programs may have bss segments. After the first .bss statement, the value of "." is bss 0.

external absolute, text, data, or bss
Symbols declared .globl but defined within an assembly as absolute, text, data, or bss symbols may be used exactly as if they were not declared .globl; however, their value and type are available to the link editor so that the program may be loaded with others that reference these symbols.

register
The symbols

    r0 ... r5
    fr0 ... fr5
    sp
    pc

are predefined as register symbols. Either they or symbols defined from them must be used to refer to the six general-purpose, six floating-point, and the 2 special-purpose machine registers. The behavior of the floating register names is identical to that of the corresponding general register names; the former are provided as a mnemonic aid.

other types
Each keyword known to the assembler has a type which is used to select the routine which processes the associated keyword statement. The behavior of such symbols when not used as keywords is the same as if they were absolute.

6.3 Type propagation in expressions

When operands are combined by expression operators, the result has a type which depends on the types of the operands and on the operator.
The rules involved are complex to state but were intended to be sensible and predictable. For purposes of expression evaluation the important types are

    undefined
    absolute
    text
    data
    bss
    undefined external
    other

The combination rules are then: If one of the operands is undefined, the result is undefined. If both operands are absolute, the result is absolute. If an absolute is combined with one of the "other types" mentioned above, or with a register expression, the result has the register or other type. As a consequence, one can refer to r3 as "r0+3". If two operands of "other type" are combined, the result has the numerically larger type. An "other type" combined with an explicitly discussed type other than absolute acts like an absolute.

Further rules applying to particular operators are:

    +         If one operand is text-, data-, or bss-segment relocatable, or is an undefined external, the result has the postulated type and the other operand must be absolute.
    -         If the first operand is a relocatable text-, data-, or bss-segment symbol, the second operand may be absolute (in which case the result has the type of the first operand); or the second operand may have the same type as the first (in which case the result is absolute). If the first operand is external undefined, the second must be absolute. All other combinations are illegal.
    ^         This operator follows no other rule than that the result has the value of the first operand and the type of the second.
    others    It is illegal to apply these operators to any but absolute symbols.

7. Pseudo-operations

The keywords listed below introduce statements that generate data in unusual forms or influence the later operations of the assembler. The metanotation

    [ stuff ] ...

means that 0 or more instances of the given stuff may appear. Also, boldface tokens are literals, italic words are substitutable.

7.1 .byte expression [ , expression ] ...
The expressions in the comma-separated list are truncated to 8 bits and assembled in successive bytes. The expressions must be absolute. This statement and the string statement above are the only ones that assemble data one byte at a time.

7.2 .even

If the location counter "." is odd, it is advanced by one so the next statement will be assembled at a word boundary.

7.3 .if expression

The expression must be absolute and defined in pass 1. If its value is nonzero, the .if is ignored; if zero, the statements between the .if and the matching .endif (below) are ignored. .if may be nested. The effect of .if cannot extend beyond the end of the input file in which it appears. (The statements are not totally ignored, in the following sense: .ifs and .endifs are scanned for, and moreover all names are entered in the symbol table. Thus names occurring only inside an .if will show up as undefined if the symbol table is listed.)

7.4 .endif

This statement marks the end of a conditionally-assembled section of code. See .if above.

7.5 .globl name [ , name ] ...

This statement makes the names external. If they are otherwise defined (by assignment or appearance as a label) they act within the assembly exactly as if the .globl statement were not given; however, the link editor ld may be used to combine this routine with other routines that refer to these symbols.

Conversely, if the given symbols are not defined within the current assembly, the link editor can combine the output of this assembly with that of others which define the symbols. As discussed in §1, it is possible to force the assembler to make all otherwise undefined symbols external.

7.6 .text
7.7 .data
7.8 .bss

These three pseudo-operations cause the assembler to begin assembling into the text, data, or bss segment respectively. Assembly starts in the text segment. It is forbidden to assemble any code or data into the bss segment, but symbols may be defined and "." moved about by assignment.
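The segment rules above can be modelled with a small Python sketch. It is a toy under stated assumptions, not the assembler's implementation: the class `Segments` and its method names are hypothetical, and it assumes each segment keeps its own location counter that is resumed when the segment is selected again.

```python
class Segments:
    """Toy model of the three segments and their location counters."""

    def __init__(self):
        self.counters = {"text": 0, "data": 0, "bss": 0}
        self.current = "text"              # assembly starts in the text segment

    def switch(self, name):
        """Model .text, .data or .bss."""
        if name not in self.counters:
            raise ValueError("unknown segment: " + name)
        self.current = name

    def emit(self, nbytes):
        """Model assembling nbytes of code or data at "."."""
        if self.current == "bss":
            raise ValueError("cannot assemble code or data into bss")
        self.counters[self.current] += nbytes

    def advance(self, n):
        """Model the assignment ". = .+n"; "." may never decrease."""
        if n < 0:
            raise ValueError('"." may not decrease')
        self.counters[self.current] += n


s = Segments()
s.emit(4)              # two instruction words in text
s.switch("bss")
s.advance(10)          # reserve ten bytes; nothing is stored in the output file
s.switch("text")
s.emit(2)
print(s.counters)      # {'text': 6, 'data': 0, 'bss': 10}
```

Calling `emit` while the bss segment is current raises an error, mirroring the rule that only symbol definitions and assignments to "." are allowed there.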
7.9 .comm name , expression

Provided the name is not defined elsewhere, this statement is equivalent to

    .globl name
    name = expression ^ name

That is, the type of name is "undefined external", and its value is expression. In fact the name behaves in the current assembly just like an undefined external. However, the link-editor ld has been special-cased so that all external symbols which are not otherwise defined, and which have a non-zero value, are defined to lie in the bss segment, and enough space is left after the symbol to hold expression bytes. All symbols which become defined in this way are located before all the explicitly defined bss-segment locations.

8. Machine instructions

Because of the rather complicated instruction and addressing structure of the PDP-11, the syntax of machine instruction statements is varied. Although the following sections give the syntax in detail, the machine handbooks should be consulted on the semantics.

8.1 Sources and Destinations

The syntax of general source and destination addresses is the same. Each must have one of the following forms, where reg is a register symbol, and expr is any sort of expression:

    syntax        words   mode
    reg           0       00+reg
    (reg)+        0       20+reg
    -(reg)        0       40+reg
    expr(reg)     1       60+reg
    (reg)         0       10+reg
    *reg          0       10+reg
    *(reg)+       0       30+reg
    *-(reg)       0       50+reg
    *(reg)        1       70+reg
    *expr(reg)    1       70+reg
    expr          1       67
    $expr         1       27
    *expr         1       77
    *$expr        1       37

The words column gives the number of address words generated; the mode column gives the octal address-mode number. The syntax of the address forms is identical to that in DEC assemblers, except that "*" has been substituted for "@" and "$" for "#"; the UNIX typing conventions make "@" and "#" rather inconvenient.
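The mode column of the table above can be read as an octal base plus the register number. The Python sketch below encodes several of the register forms that way. The names `REGS`, `FORMS` and `mode_field` are hypothetical, and the register numbering (r0 to r5 as 0 to 5, sp as 6, pc as 7) is the usual PDP-11 assignment, assumed here rather than stated in this section.

```python
# Usual PDP-11 register numbers (an assumption; not given in this section).
REGS = {"r0": 0, "r1": 1, "r2": 2, "r3": 3, "r4": 4, "r5": 5, "sp": 6, "pc": 7}

# form -> (address words generated, octal mode base), taken from the table above
FORMS = {
    "reg":        (0, 0o00),
    "(reg)+":     (0, 0o20),
    "-(reg)":     (0, 0o40),
    "expr(reg)":  (1, 0o60),
    "(reg)":      (0, 0o10),
    "*(reg)+":    (0, 0o30),
    "*-(reg)":    (0, 0o50),
    "*expr(reg)": (1, 0o70),
}


def mode_field(form, reg):
    """Return (address words, 6-bit mode field) for an address form and register."""
    words, base = FORMS[form]
    return words, base + REGS[reg]


words, mode = mode_field("-(reg)", "sp")
print(words, oct(mode))                        # 0 0o46
print(oct(mode_field("(reg)+", "pc")[1]))      # 0o27, the mode listed for $expr
print(oct(mode_field("expr(reg)", "pc")[1]))   # 0o67, the mode listed for a bare expr
```

The last two lines suggest why the forms without an explicit register carry fixed mode numbers: they correspond to the pc-based variants of the register forms.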
Notice that mode "*reg" is identical to "(reg)"; that "*(reg)" generates an index word (namely, 0); and that addresses consisting of an unadorned expression are assembled as pc-relative references independent of the type of the expression. To force a non-relative reference, the form "*$expr" can be used, but notice that further indirection is impossible.

8.3 Simple machine instructions

The following instructions are defined as absolute symbols:

    clc
    clv
    clz
    cln
    sec
    sev
    sez
    sen

They therefore require no special syntax. The PDP-11 hardware allows more than one of the "clear" class, or alternatively more than one of the "set" class to be or-ed together; this may be expressed as follows:

    clc | clv

8.4 Branch

The following instructions take an expression as operand. The expression must lie in the same segment as the reference, cannot be undefined-external, and its value cannot differ from the current location of "." by more than 254 bytes:

    br      blos
    bne     bvc
    beq     bvs
    bge     bhis
    blt     bec  (= bcc)
    bgt     bcc
    ble     blo
    bpl     bcs
    bmi     bes  (= bcs)
    bhi

bes ("branch on error set") and bec ("branch on error clear") are intended to test the error bit returned by system calls (which is the c-bit).

8.5 Extended branch instructions

The following symbols are followed by an expression representing an address in the same segment as ".". If the target address is close enough, a branch-type instruction is generated; if the address is too far away, a jmp will be used.

    jbr     jlos
    jne     jvc
    jeq     jvs
    jge     jhis
    jlt     jec
    jgt     jcc
    jle     jlo
    jpl     jcs
    jmi     jes
    jhi

jbr turns into a plain jmp if its target is too remote; the others (whose names are constructed by replacing the "b" in the branch instruction’s name by "j") turn into the converse branch over a jmp to the target address.

8.6 Single operand instructions

The following symbols are names of single-operand machine instructions. The form of address expected is discussed in §8.1 above.

    clr     sbcb
    clrb    ror
    com     rorb
    comb    rol
    inc     rolb
    incb    asr
    dec     asrb
    decb    asl
    neg     aslb
    negb    jmp
    adc     swab
    adcb    tst
    sbc     tstb

8.7 Double operand instructions

The following instructions take a general source and destination (§8.1), separated by a comma, as operands.

    mov
    movb
    cmp
    cmpb
    bit
    bitb
    bic
    bicb
    bis
    bisb
    add
    sub

8.8 Miscellaneous instructions

The following instructions have more specialized syntax. Here reg is a register name, src and dst a general source or destination (§8.1), and expr is an expression:

    jsr     reg,dst
    rts     reg
    sys     expr
    ash     src,reg     (or, als)
    ashc    src,reg     (or, alsc)
    mul     src,reg     (or, mpy)
    div     src,reg     (or, dvd)
    xor     reg,dst
    sxt     dst
    mark    expr
    sob     reg,expr

sys is another name for the trap instruction. It is used to code system calls. Its operand is required to be expressible in 6 bits. The expression in mark must be expressible in six bits, and the expression in sob must be in the same segment as ".", must not be external-undefined, must be less than ".", and must be within 510 bytes of ".".

8.9 Floating-point unit instructions

The following floating-point operations are defined, with syntax as indicated:

    cfcc
    setf
    setd
    seti
    setl
    clrf    fdst
    negf    fdst
    absf    fdst
    tstf    fsrc
    movf    fsrc,freg   (= ldf)
    movf    freg,fdst   (= stf)
    movif   src,freg    (= ldcif)
    movfi   freg,dst    (= stcfi)
    movof   fsrc,freg   (= ldcdf)
    movfo   freg,fdst   (= stcfd)
    movie   src,freg    (= ldexp)
    movei   freg,dst    (= stexp)
    addf    fsrc,freg
    subf    fsrc,freg
    mulf    fsrc,freg
    divf    fsrc,freg
    cmpf    fsrc,freg
    modf    fsrc,freg
    ldfps   src
    stfps   dst
    stst    dst

fsrc, fdst, and freg mean floating-point source, destination, and register respectively. Their syntax is identical to that for their non-floating counterparts, but note that only floating registers 0-3 can be a freg.

The names of several of the operations have been changed to bring out an analogy with certain fixed-point instructions.
The only strange case is movf, which turns into either stf or ldf depending respectively on whether its first operand is or is not a register. Warning: ldf sets the floating condition codes, stf does not.

9. Other symbols

9.1 ..

The symbol ".." is the relocation counter. Just before each assembled word is placed in the output stream, the current value of this symbol is added to the word if the word refers to a text, data or bss segment location. If the output word is a pc-relative address word that refers to an absolute location, the value of ".." is subtracted.

Thus the value of ".." can be taken to mean the starting memory location of the program. The initial value of ".." is 0.

The value of ".." may be changed by assignment. Such a course of action is sometimes necessary, but the consequences should be carefully thought out. It is particularly ticklish to change ".." midway in an assembly or to do so in a program which will be treated by the loader, which has its own notions of "..".

9.2 System calls

System call names are not predefined. They may be found in the file /usr/include/sys.s

10. Diagnostics

When an input file cannot be read, its name followed by a question mark is typed and assembly ceases. When syntactic or semantic errors occur, a single-character diagnostic is typed out together with the line number and the file name in which it occurred. Errors in pass 1 cause cancellation of pass 2. The possible errors are:

    )    parentheses error
    ]    parentheses error
    >    string not terminated properly
    *    indirection (*) used illegally
    .    illegal assignment to "."
    A    error in address
    B    branch address is odd or too remote
    E    error in expression
    F    error in local ("f" or "b") type symbol
    G    garbage (unknown) character
    I    end of file inside an .if
    M    multiply defined symbol as label
    O    word quantity assembled at odd address
    P    phase error: "." different in pass 1 and 2
    R    relocation error
    U    undefined symbol
    X    syntax error

UNIX MASTER INDEX

The UNIX Master Index is a cumulative index; it brings together the indexes of all the UNIX volumes. The Master Index appears at the end of each volume. Each entry is followed by one or more shortened volume titles, indicating the volumes in which the topic is discussed and the pages containing the information. The volumes and their shortened titles are shown in the following table:

    Volume            Shortened Title
    General use       GEN
    Programming       PGM
    System manager    SYS

If a topic is discussed in two or more volumes, the shortened volume names are presented in alphabetical order. For example, an entry in the Master Index might appear in the following way:

    ed line editor
        description, GEN 4-8 to 4-9, SYS 4-6
    ed__.hup file
        saving text, GEN 2-6

This entry indicates that a description of the ed line editor can be found on pages 4-8 through 4-9 of the GEN volume and page 4-6 of the SYS volume. The ed__.hup file is discussed on page 3-43 of the GEN volume.

ACRONYMS AND MNEMONICS
The acronym (or mnemonic) is the preferred entry. The acronym is cross-referred from the complete form.

DEFINITIONS
Defined terms and glossary terms are indexed.

HOMONYMS
Things of the same name but different meaning are followed by a descriptive word or by an abbreviation in parentheses.

KEYS FOR EXAMPLES, FIGURES, TABLES, AND FOOTNOTES
Page references for example, figure, and table index entries are keyed.
Exam- ple: Example 4-13E Figure 4-13F Table 4-13T Footnote 4-13n ii-Index NONALPHABETIC CHARACTERS Entries containing leading nonalphabetic characters (symbols, numbers, and punctuation) are placed at the beginning of the index. Nonalphabetic characters within index entries are sorted before alphabetic characters. Nonalphabetic characters that serve as terms are indexed in a spelled-out form whenever possible. © ) INDEX command (DC) @ descripton, GEN 2-58 command (ed) escaping to use UNIX command, ® —d GEN 3-51E command (ex) @ i description, GEN 3-95 command (Mail) marking commands for the shell, © —ts GEN 2-28 escape (Mail) description, GEN 2-25 $ character (ed) printing last line, GEN 3-28 % command (DC) descripton, GEN 2-57 % prompt defined, GEN 3-5 & command (ex) description, GEN 3-96 + command (DC) descripton, GEN 2-57 - command (DC) descripton, GEN 2-57 - command (Mail) printing previous message, GEN 2-28 .. file defined, GEN 4-63 /etc/passwd file defined, GEN 4-66 /ete/rec command file starting network servers, SYS 5-49 /sys directory contents, SYS 5-36T /sys/sys directory file prefixes, SYS 5-36T /usr/spool/mail directory system mailbox and, GEN 2-17 0 command defined, GEN 5-88 0 command (troff) right-justifying digits, GEN 5-87 0 macro (me) specifying section titles for contents, GEN 5-41 1822 interface See imp network interface driver lc command (me) defined, GEN 5-43 returning one-column format, GEN 5-35 1C command (ms) returning one-column format, GEN 5-6 2¢ command (me) defined, GEN 5-43 specifying two-column format, GEN 5-35 2C command (ms) specifying two-column format, GEN 5-6 Index-1 3Com Ethernet controller > symbol meaning, GEN 2-10 See ec network interface driver ? escape (Mail) 4.2BSD file system file set, SYS 5-32T description, GEN 2-26 4.2BSD Interprocess Communication Primer See also Interprocess [...] 
pattern-matching and, GEN 2-8 \* command (troff) entering comments in macros, communication GEN 5-89 4.2BSD Interprocess Communication Primer, SYS 3-5 to 3-28 __exit function 4.2BSD Line Printer Spooler description, PGM 1-8 Manual, PGM 4-99 to 4-105 See also Line printer spooling A system (4.2BSD) 4.2BSD system 4.1BSD files and, SYS 5-32 to 4.1BSD language processors and, SYS 5-34 SYS 5-88 adding users, SYS 5-43 SYS 5-3 to configuring for networking support, SYS 5-47 to 5-51 SYS 5-48 disk space and, SYS 5-18 distribution format, SYS 5-18 hardware supported, SYS 5-17 installing on VAX/VMS, SYS 5-17 to 5-71 setting up, SYS 5-35 to 5-46 source directory organization, SYS SYS 5-43 upgrading, SYS 5-32 to 5-34 4.2BSD System Manual, PGM 4-15 description, GEN 2-63 < symbol meaning, GEN 2-10 = command (sed) defined, GEN 3-114 Index-2 defined, GEN 3-80 a option (hunt) defined, GEN 5-148 a option (inv) a option (troff) defined, GEN 5-50 a.out file as assembler and, GEN 6-53 5-89T system manual, PGM 4-15 to 4-52 : command (DC) a command (vi) defined, GEN 5-147 making boot cassette, SYS 5-35 description, GEN 2-25 A command (vi) defined, GEN 3-78 creating boot floppy, SYS 5-35 : escape (Mail) See also 1 command (sed) defined, GEN 3-108 configuring multiple networks, description, GEN 2-63 defined, GEN 5-46 a command (sed) 5-15 to 4-52 description, GEN 3-88 A command (me) 1-21 : command (DC) entering, GEN 3-6E a command (ex) bug fixes and changes, SYS 1-3 to tailoring to your site, using, GEN 3-25 to 3-26 a command (edit) adding device drivers, changes to the kernel, a command (ed) defined, GEN 3-34 5-34 defined, GEN 4-63 aardvark game 4.2BSD and, SYS 1-17 ab command (ex) See also una command (ex) description, GEN 3-87 AB command (me) defined, GEN 5-46 AB command (ms) entering abstract in text, GEN 5-5 ab command (nroff/troff) message output, GEN 5-81 abbreviate command (ex) See ab command (ex) Address (sed) abort command (Ipc) description, PGM 4-103 See also Relative pathname See 
arp driver addstr routine defined, GEN 4-63 defined, PGM 4-81 description, GEN 4-33 Advisory lock Abstract entering with -ms, GEN 5-5 compared to hard lock, SYS 1-33 AE command (ms) ac command (me) defined, GEN 5-46 description, GEN 3-107 to 3-108 Address Resolution Protocol Absolute pathname | ACC LH/DH IMP interface See acc network driver acc network driver 4.2BSD improvement, SYS 1-15 TL command and, GEN 5-6 af command (nroff/troff) defined, GEN 5-66 Aho, A.V., & others awk programming language, PGM Accent creating with troff, GEN 5-88E entering with -ms, GEN 5-9 3-5 to 3-12 Al command (ms) formatting author’s institution name, GEN 5-5 new in -ms, GEN 5-19 access system call Alias 4.2BSD improvement, SYS 1-10 defined, GEN 2-21, 2-38, 4-63 ACM (Association for Computing removing from shell, GEN 4-52 Machinery) formatting papers for, GEN 5-46 acommute routine operators and, PGM 2-67 to 2-68 Action statement (awk) description, PGM 3-7 to 3-9 Active system defined, SYS 5-123 Acute accent See Metacharacters ad command (nroff/troff) defined, GEN 5-61 j register and, GEN 5-81 ad driver 4.2BSD improvement, SYS 1-15 ad.c device driver 4.2BSD improvement, SYS 5-12 ADB debugging program 4.2BSD improvement, SYS 1-5 C and, GEN 2-15 description, PGM 3-51 to 3-77 addbib utility See also refer description, SYS 1-5 addch routine defined, PGM 4-80 Addition DC and, GEN 2-60 Additive operator description, GEN 2-53 Address (edit) defined for buffer line, GEN 3-18 specifying, GEN 2-21 alias command (C shell) See also unalias command (C shell) displaying aliases, GEN 4-50E alias command (Mail) See also alternates command (Mail) See also metoo option defining an alias, GEN 2-21 description, GEN 2-29 restriction, GEN 2-21 alias facility | shell command files and, GEN 4-43 startup and, GEN 4-44 uses for, GEN 4-43 to 4-44 aliens game distribution and, SYS 1-17 Allman, E. 
-Me Reference Manual, GEN 5-39 to 5-48 introduction to SCCS, PGM 3-23 to 3-37 sendmail, SYS 3-59 to 3-71 Sendmail Installation and Operation Guide, SYS 2-27 to 2-60 writing papers with nroff using -me, GEN 5-21 to 5-38 Allocator description, GEN 2-59 to 2-60 design rationale, GEN 2-63 Index-3 ALT key See ESCAPE key alternates command (Mail) Argument (C shell) (Cont.) expanding, GEN 4-60 to 4-61 Argument (nroff) description, GEN 2-29 defined, GEN 5-21 am command (nroff/troff) argv variable (C shell) defined, GEN 4-63 defined, GEN 5-64 AM macro diacritical marks and, GEN 5-19 Ampersand character (C shell) background jobs and, GEN 4-45 script files and, GEN 4-53 Arithmetic expression (troff) entering, GEN 5-92 Arithmetic language routing output, GEN 4-44 See BC language Ampersand character (ed) Arnold, K.C.R.C. meaning, GEN 3-42 printing, GEN 3-42 s command and, GEN 3-33 to 3-34 turning off, GEN 3-34 uses, GEN 3-42 Ampersand character (edit) repeating s command, GEN 3-20 Ampersand character (shell) multitasking and, GEN 1-29 ANAME operator (C compiler) defined, PGM 2-65 ANSI Standard X3.9 1978 exceptions to, PGM 2-88 extensions, PGM 2-82 to 2-83 append command (ed) See a command (ed) append command (edit) See a command (edit) append command (ex) See a command (ex) Append mode See Input mode append option (Mail) defined, GEN 2-34 Appendix specifying page numbers, GEN 5-46 apply program description, SYS 1-5 ar 4.2BSD improvement, SYS 1-5 ar command (me) defined, GEN 5-44 Arabic number setting page number, GEN 5-44 arff program 4.2BSD improvement, SYS 1-18 args command (ex) description, GEN 3-88 Argument (C shell) defined, GEN 4-63 Index-4 Screen package, PGM 4-75 to 4-98 Arnold, K.C.R.C., & Toy, M.C. 
guide to the dungeons of doom, GEN 6-17 to 6-25 arp driver 4.2BSD improvement, SYS 1-15 ARPA File Transfer Protocol ftp program and, SYS 1-6 ARPA Telnet protocol See telnet program ARPANET sending mail to, GEN 2-26 UUCP network and, GEN 2-26 Array (awk) description, PGM 3-9 Array element defined, GEN 2-51 Array identifier description, GEN 2-50 as assembler command line format, GEN 6-53E defined, GEN 6-53 errors, GEN 6-64 reference manual, GEN 6-53 to 6-64, PGM 4-53 to 4-65 segment types, GEN 6-54 as command (nroff/troff) defined, GEN 5-64 ask option (Mail) defined, GEN 2-34 prompting for subject header, GEN 2-20 setting, GEN 2-20 askcc option (Mail) defined, GEN 2-34 asm.sed file 4.2BSD improvement, SYS 5-13 Assembler replacing, SYS 5-118 Assignment operator description, GEN 2-53 autowrite option (ex) Assignment statement (as) description, GEN 3-98 defined, GEN 6-56 Assignment statement (BC) awk programming language command line format, PGM 3-5 value and, GEN 2-48 compared with grep, PGM 3-5 Association for Computing defined, GEN 2-13, PGM 3-5 Machinery description, PGM 3-5 to 3-12 See ACM design, PGM 3-9 to 3-10 Asterisk character execution time compared, PGM dot character and, GEN 3-40 3-12T ed and, GEN 3-33 printing multiple files, GEN 2-8 fields, PGM 3-5 shell and, GEN 4-33 implementation, PGM 3-10 turning off, GEN 2-8 printing output, PGM 3-6 uses, GEN 3-40 to 3-41 program structure, PGM 3-5 zero and, GEN 3-41 records, PGM 3-5 uses, PGM 3-10 Asymmetric protocol variables, PGM 3-8 defined, SYS 3-17 At sign See also CTRL-H B See also u command (edit) deleting a line, GEN 3-8E B command (me) entering in text, GEN 2-4 erasing characters on input line, printing, GEN 3-39 GEN 5-33 AU command (ms) b command (me) formatting author’s name in text, entering, GEN 5-26 Author institution specifying bold font, GEN 5-36 formatting in text, GEN 5-5 specifying fill mode, GEN 5-26 Author name B command (ms) formatting in text, GEN 5-5 specifying boldface, GEN 5-8 Auto array b 
command (sed) specifying, GEN 2-54 auto statement (BC) forming, GEN 2-55 defined, GEN 3-114 b command (troff) creating large brackets, GEN autoconf.c file 4.2BSD improvement, SYS 5-13 SYS 5-73 to 5-105 “hardware devices and, SYS 5-75 requirements for VAX/VMS, SYS 5-95 description, GEN 3-97 autoindent option (vi) enabling, GEN 3-67 lisp and, GEN 3-68 using, GEN 3-73 autoprint option (ex) description, GEN 3-98 5-88E B command (vi) Autoconfiguration autoindent option (ex) See also rh command (me) defined, GEN 5-42, 5-44 GEN 5-5 building systems with config, defined, GEN 5-46 specifying bibliographic section, GEN 2-4 defined, GEN 3-78 b command (vi) defined, GEN 3-80 B flag (tar) reading block records, SYS 1-9 writing block records, SYS 1-9 b option (troff) defined, GEN 5-50 B__CALL flag 4.2BSD improvement, SYS 5-6 ba command (me) defined, GEN 5-45 backgammon game autoprint option (Mail) See also teachgammon program defined, GEN 2-34 4.2BSD improvement, SYS 1-17 Index-5 Background command (C shell) defined, GEN 4-63 Background job description, GEN 4-45 to 4-48 reading input from terminal, GEN 4-47K suspending, GEN 4-46 Backslash character erasing, GEN 2-4 Backslash character (ed) context search and, GEN 3-43 restriction, GEN 3-33 searching for, GEN 3-39E special characters and, GEN 3-39 Backslash character (troff) translating for typesetter, GEN 5-86 Backus Functional Programming Language See FP programming language Bad block forwarding support, SYS 1-18 bad144 program 4.2BSD improvement, SYS 1-18 Baden, S. 
Berkeley FP User Manual, PGM 2-359 to 2-391 badsect program See also fsck program 4.2BSD improvement, SYS 1-18 Base (BC) See also ibase; obase description, GEN 2-44 to 2-45 be command (me) defined, GEN 5-43 starting a column, GEN 5-35 BC language C language and, GEN 2-43 defined, GEN 2-43 description, GEN 2-43 to 2-55 displaying library of math functions, GEN 2-49 output bases and, GEN 2-45 restriction, GEN 2-43 simple computations and, GEN 2-43 to 2-44 subscript restriction, GEN 2-46 BC program exiting, GEN 2-49 bemp library routine 4.2BSD improvement, SYS 1-14 bcopy library routine 4.2BSD improvement, SYS 1-14 bd command (troff) defined, GEN 5-59 BDATA operator (C compiler) defined, PGM 2-64 beautify option (ex) description, GEN 3-98 BEGIN/END pattern description, PGM 3-6 Bell character printing, GEN 3-37 Benson-Varian printer output filters and, PGM 4-102 Berkeley font catalogue, GEN 6-27 to 6-51 Berkeley FP User’s Manual, PGM 2-359 to 2-391 See also FP programming language Berkeley network See Berknet Berkeley Pascal programming language user’s manual, PGM 2-159 to 2-209 Berkeley Pascal User Manual See also Pascal programming language Berkeley Pascal User Manual, PGM 2-159 to 2-209 Berkeley system See UNIX Operating System Berkeley VAX/UNIX Assembler Reference Manual, PGM 4-53 to 4-65 See also as assembler Berknet sending mail to, GEN 2-27 bg command (C shell) continuing background jobs, GEN 4-46E defined, GEN 4-64 running suspended job in background, GEN 4-47 bi command (me) defined, GEN 5-44 Bibliographic citations formatting, GEN 2-13, 5-18, 5-33 specifying, GEN 5-34F Bibliographic databases See roffbib program, SYS 1-8 Bibliography See Bibliographic citations bin directory defined, GEN 4-64 Index-6 Bourne shell (Cont.) 
Binary date Mail program and, GEN 2-37 Binary operator (C compiler) description, PGM 2-66 Binary option (Mail) command substitution and, GEN 4-18 to 4-20 command syntax, GEN 4-3 defined, GEN 4-3 See Option (Mail) description, GEN 4-3 to 4-27 bind system call error handling, GEN 4-21 assigning socket name, SYS 3-7E error signals, GEN 4-21F binding names to sockets, SYS fault handling, GEN 4-21 group set and, SYS 1-8 1-10 specifying association, SYS 3-25 Bit mask prompt, GEN 4-6 creating, SYS 3-11 redirecting input, GEN 4-4 bl command (me) redirecting output, GEN 4-3 defined, GEN 5-44 Bourne, S.R. Blau, R., & Joyce, J. introducing the UNIX shell, GEN Edit tutorial, GEN 3-3 to 3-23 Block device 4-3 to 4-27 Bourne, S.R., & Maranzano, J.F. description, SYS 5-20 Block map ADB debugging program, PGM 3-51 to 3-77 layout of blocks and fragments, SYS 1-27F Box (nroff/troff) creating smallest, GEN 5-68 Block of text box routine footnotes and, GEN 5-36 indenting from left and right, GEN 5-86E defined, PGM 4-81 Boxing description, GEN 5-69 index entries and, GEN 5-36 keeping together in text, GEN entering, GEN 5-8 to 5-9 bp command (me) See also pa command (me) 5-26 Block size specifying blank column, GEN selecting, SYS 5-41 5-35 Boldface specifying page break, GEN 5-23 entering, GEN 5-8 bp command (nroff/troff) See also ns command (nroff/troff) Bootstrap monitor loading, SYS 5-65 to 5-68 Bootstrap procedure booting from tape, defined, GEN 5-59 br command (me) SYS 5-22 description, SYS 5-22 to 5-31 details, SYS 5-59 to 5-64 messages about console bootstrap cassette, invoking, GEN 4-24 SYS 5-71 messages about the distributed console media, SYS 5-69 messages about the distributed system, SYS 5-70 Bootstrap program 4.2BSD improvement, SYS 5-15 loading, SYS 5-25 Bourne shell background command, GEN 4-3E changing prompt, GEN 4-6 command execution, GEN 4-23 to 4-24 command grammar, GEN 4-26 starting a line, GEN 5-24 br command (nroff/troff) defined, GEN 5-60 Braces argument expansion 
and, GEN 4-60E Braces (EQN) typesetting in proper size, GEN 5-100E Brackets (Bourne shell) matching any single character, GEN 4-34 Brackets (DC) placing character string on stack, GEN 2-58 Brackets (ed) appearing in character class, GEN 3-41 Index-7 Brackets (ed) (Cont.) Buffer (Cont.) ed and, GEN 3-25 deleting line numbers, GEN 3-41, writing part of, GEN 3-22 3-41KE Brackets (EQN) typesetting in proper size, GEN 5-100E Buffer (nroff/troff) flushing output buffer, GEN 5-73 Buffer (vi) description, GEN 3-54 Brackets (Mail) system commands and, GEN 3-68 beginning a line with, GEN 2-26 types of, GEN 3-62 Brackets (nroff/troff) creating, GEN 5-88KE BUFSIZ defined, PGM 1-21 creating large, GEN 5-68 BRANCH operator (C compiler) bugfiler program 4.2BSD improvement, SYS 1-19 defined, PGM 2-65 Break Built-in (M4) See Command (M4) defined, GEN 5-22 space and, GEN 5-23 specifying, GEN 5-24 break command (C shell) See also breaksw command (C shell) csh script and, GEN 4-58 defined, GEN 4-64 break statement (awk) defined, PGM 3-9 break statement (BC) forming, GEN 2-54 built-in command (C shell) defined, GEN 4-64 bx command (me) boxing words, GEN 5-37 defined, GEN 5-44 byte statement (as) defined, GEN 6-59 bzero library routine 4.2BSD improvement, SYS 1-14 C breaksw command (C shell) defined, GEN 4-64 exiting from switch statement, GEN 4-58 Broadcast message sending, SYS 3-27E Broadcast packet See also Broadcast message datagram sockets and, SYS 3-27 Broken bar shell and, GEN 2-27 BSS operator (C compiler) defined, PGM 2-64 bss segment (as) See also Assignment statement (as) See also Location counter (as) description, GEN 6-564 bss statement defined, GEN 6-59 bstring library 4.2BSD improvement, SYS 1-14 btlgammon game See backgammon game buf.h file 4.2BSD improvement, SYS 5-6 Buffer defined, GEN 3-4 Index-8 C argument (nroff) specifying, GEN 5-27 c command (DC) description, GEN 2-58 c command (ed) defined, GEN 3-34 using, GEN 3-31 to 3-32 c command (edit) description, GEN 3-18 c 
command (ex) description, GEN 3-88 C command (me) defined, GEN 5-46 c command (me) centering blocks of text, GEN 5-27 defined, GEN 5-43, 5-46 specifying a chapter without number, GEN 5-33 specifying chapters, GEN 5-33 c command (sed) defined, GEN 3-109 C command (vi) defined, GEN 3-78 C compiler description, PGM 2-63 to 2-77 as programming tool, GEN 2-15 C compiler (Cont.) replacing, SYS 5-118 C shell (Cont.) introduction, GEN 4-29 to 4-74 logging in, GEN 4-39 c escape (Mail) metacharacters and, GEN 4-32 description, GEN 2-25 overwriting files and, GEN 4-41 C flag (lint) purpose of, GEN 4-29 creating libraries from C source using from the terminal, GEN code, SYS 1-7 c flag (mkey) specifying file of common words, 4-30 to 4-38 C shell variables description, GEN 4-40 to 4-41 GEN 5-147 set command and, GEN 4-40E C library reinstalling, SYS 5-56E c macro (me) defined, GEN 5-46 c2 command (nroff/troff) defined, GEN 5-67 CAI script, GEN 6-9E to 6-11E c number register (nroff/troff) description, GEN 6-6 to 6-7 prerequisites, GEN 6-6 defined, GEN 5-81 prerequisites for the writer, GEN c operator (vi) defined, GEN 3-80 6-8 types of, GEN 6-7 C option (hunt) defined, GEN 5-148 Campbell, R. 
line printer spooling system C option (tar) (4.2BSD), PGM 4-99 to 4-105 forcing chdir operations in an operation, SYS 1-9 CANBSIZ parameter description, SYS 5-121 c option (uucp) defined, SYS 5-132 C preprocessor canfield game See also cfscores program 4.2BSD improvement, SYS 1-17 if statements and, SYS 1-5 line numbers and, SYS 1-5 Carbon copy See CC: list C program debugging, PGM 3-53 to 3-58 Caret See Circumflex character (ed) C programming language See also M4 macro processor case branch CAI script for, GEN 6-7 description, GEN 4-8 to 4-9 command line format, PGM 1-3 form of, GEN 4-8E computers supporting, GEN 2-15 case command (C shell) defined, GEN 4-64 programming in, GEN 2-14 to 2-15 cat command (C shell) collecting files, PGM 1-5E reference manual, PGM 2-5 to combining files, GEN 3-48, 3-48E 2-35 supporting programs, GEN 2-15 C Programming Language Reference Manual, The, PGM 2-5 to 2-35 See also C programming language C shell 4.2BSD improvement, SYS 1-5 built-in commands, GEN 4-50 to 4-52 compared to other command interpreters, GEN 4-30 defined, GEN 4-29 details for terminal users, GEN 4-39 to 4-52 history list and, GEN 4-41 interrupts and, GEN 4-36 defined, GEN 4-64 listing system users, GEN 4-35E printing files, GEN 2-7 printing merged files, GEN 2-11 printing pipeline information, GEN 2-11 terminating, GEN 4-36 cat program See cat command (C shell) CBRANCH operator (C compiler) defined, PGM 2-66 cc dbx and, SYS 1-5 cc command (nroff/troff) defined, GEN 5-67 Index-9 CC: list See also askcc option adding people to, GEN 2-25 cctab table defined, PGM 2-68 cd command (C shell) See also pushd command (C shell) changing working directory, GEN 2-10 chase game obsolete, SYS 1-17 chdir command (C shell) See cd command (C shell) Cherry, L., & Morris, R. BC and, GEN 2-43 to 2-55 DC and, GEN 2-57 to 2-64 Cherry, L.L., & Kernighan, B.W. 
typesetting mathematics, GEN defined, GEN 4-64 description, GEN 2-29 5-97 to 5-104 Typesetting Mathematics - User’s Guide, GEN 5-105 to 5-114 working directory and, GEN 4-48 ce command (me) entering, GEN 5-24 Cherry, L.L., & Vesterman, W. style and diction programs, GEN ce command (nroff/troff) defined, GEN 5-61 Cedilla See Metacharacters Centering blocks of text, GEN 5-27, 5-61 specifying, GEN 5-24 ch command (nroff/troff) defined, GEN 5-65 Change bars (nroff/troff) specifying, GEN 5-72 change command (ed) See c command (ed) change command (edit) See c command (edit) change command (ex) See c command (ex) change directory command See cd command (C shell) Changequote command (M4) description, PGM 2-395E Chapter formatting, GEN 5-33 inserting in table of contents automatically, GEN 5-46 specifying page numbers, GEN 5-46 specifying without number, GEN 5-33 Chapter-oriented document formatting, GEN 5-34F Character class circumflex within, GEN 3-42 defined, GEN 3-41 forming, GEN 3-33E lowercase letters and, GEN 3-41 number ranges and, GEN 3-41 special characters and, GEN 3-41 specifying exceptions, GEN 3-42 uppercase letters and, GEN 3-41 Index-10 5-163 to 5-177 chfn 4.2BSD improvement, SYS 1-5 chgrp 4.2BSD improvement, SYS 1-5 ching game 4.2BSD improvement, SYS 1-17 chmod command (Bourne shell) making a file executable, GEN 4-7E marking executable files, GEN 2-12 chsh command (C shell) defined, GEN 4-64 CHSHR file incoming mail and, GEN 2-17 chshrec file putting into effect before next login, GEN 4-51 Circle See Metacharacters Circumflex (edit) searching and, GEN 3-20 Circumflex character (ed) at beginning of line and, GEN 3-40 meaning, GEN 3-33 uses, GEN 3-40 Circumflex character (me) See Metacharacters clear routine defined, PGM 4-81 clearok routine defined, PGM 4-81 Client process See also Server process description, SYS 3-19 Clist segment setting number, SYS 5-122 close function description, PGM 1-11 clrtoeol routine defined, PGM 4-81 cmp program defined, GEN 4-64 
co command (edit) description, GEN 3-15 co command (ex) description, GEN 3-88 Code generation (C compiler) description, PGM 2-68 to 2-76 matching table entries against trees, PGM 2-69 Column specifying, GEN 5-43 specifying headers for continuing pages, GEN 5-42 Command (DC) for human use (Cont.) reference list, GEN 2-57 to 2-59 how they work, GEN 2-57 Command (ed) See also specific commands description, GEN 3-25 reference list, GEN 3-34 Command (ex) See also specific commands addressing primitives, GEN 3-87 combining addressing primitives, GEN 3-87 exceeding thresholds, GEN 3-86 reference list, GEN 3-87 to 3-96 structure of, GEN 3-86 syntax, GEN 3-87E Command (M4) specifying headers for continuing See also specific commands pages with a macro, GEN reference list, PGM 2-398 5-75E Command (Mail) specifying in text file, GEN 5-6 See also specific commands starting, GEN 5-35 reference list, GEN 2-28 to 2-33, text formatting commands for double columns, GEN 5-15E, 5-35 Comma character (ed) compared with semicolon, GEN 3-45 COMMA operator (C compiler) defined, PGM 2-66 Command (Bourne shell) See also specific commands 2-39T Command (make) defined, PGM 3-16 Command (nroff) description, GEN 5-22 to 5-25 Command (nroff/troff) See also specific commands reference list, GEN 5-51 Command (vi) - See also specific commands grammar, GEN 4-26 case and, GEN 3-59 grouping, GEN 4-14 ex 3.5 changes and, GEN 3-103 Command (C shell) See also Program See also specific commands defined, GEN 4-64 reference list, GEN 4-63 to 4-74 regenerating, SYS 5-118 repeating, GEN 4-41 to 4-43, 4-51E substituting output for, GEN 4-61E suspending temporarily, GEN 4-36 terminating, GEN 4-35 to 4-38 typing, GEN 2-4 within quotation marks, GEN 4-60 Command (DC) See also specific commands for file manipulation, GEN 3-71T preceding counts and, GEN 3-70 Command file description, GEN 1-29 Commmand line running two programs with one, GEN 2-11 Command line flag (Mail) See Flag (Mail) Command mode (ex) defined, 
GEN 3-85 Command name defined, GEN 4-64 Command procedure See Shell procedure Command substitution See also Modifier (C shell) defined, GEN 4-65 for human use Index-11 Command-list defined, GEN 4-8 grouping commands, GEN 4-14 Comment (awk) defined, PGM 3-9 Comment (BC) convention, GEN 2-49, 2-50 Comment (ex) description, GEN 3-86 Comment (nroff/troff) specifying, GEN 5-67 Communication domain defined, SYS 3-6 Component Configuration file (Cont.) specifying multiple bootable images, SYS 5-80 syntax, SYS 5-79 to 5-83 VAX-11/780 sample, SYS 5-84 to 5-87 connect system call datagram sockets and, SYS 3-10 errors, SYS 3-8 establishing connection between sockets, SYS 1-10 initiating connection, SYS 3-8E Connect time accounting summarizing, SYS 5-56 defined, GEN 4-65 Compound statement (BC) forming, GEN 2-54 Connection Computer-aided instruction Constant (BC) accepting, SYS 3-9E receiving, SYS 3-8 to 3-9 See CAI scripts defined, GEN 2-50 comsat program Context search (ed) 4.2BSD improvement, SYS 1-19 CON operator (C compiler) defined, PGM 2-66 Conditional See if/fendif commands conf.c file 4.2BSD improvement, SYS 5-14 installing device driver and, SYS 5-119 conf.h file 4.2BSD improvement, SYS 5-6 config program 4.2BSD improvement, SYS 1-19 adding nonstandard system facilities, SYS 5-96 defined, SYS 5-73 description, SYS 5-73 to 5-105 SYS 5-99 to 5-100 device defaults, backslash character and, GEN 3-43 defined, GEN 3-35 methods, GEN 3-30 to 3-31 question mark character and, GEN 3-43 repeating a search, GEN 3-31 reverse order and, GEN 3-31 slashes and, GEN 3-39 Context search (edit) d command and, GEN 3-16 delete command and, GEN 3-16C move command and, GEN 3-15 repeating, GEN 3-20E reversing, GEN 3-20 s command and, GEN 3-20 continue command (C shell) defined, GEN 4-65 modifying system code, SYS 5-88 continue statement (awk) defined, PGM 3-9 modifying system configuration, Control character (C shell) files generated by, SYS 5-76 SYS 5-76 prerequisite information, SYS 5-74 
defined, GEN 4-65 Control character (nroff/troff) changing, GEN 5-67 profiled systems and, SYS 5-78 commands and, GEN 5-56 specifying options items, SYS Control character (vi) in text file, GEN 3-61 Control statement (BC), GEN 2-47E description, GEN 2-47 to 2-48 Cooper, E., & others 4.2BSD System Manual, PGM 5-75 Configuration clause description, SYS 5-80 Configuration file contents, SYS 5-76 creating, SYS 5-76 grammar, SYS 5-97 to 5-98 specifying devices, SYS 5-81 Index-12 4-15 to 4-52 copy command (C shell) See cp command (C shell) copy command (edit) See co command (edit) copy command (ex) See co command (ex) copy command (Mail) See also save command (Mail) description, GEN 2-29 using, GEN 2-23E copy program loading, SYS 5-24E mini-root file system and, SYS 5-24 Core dump file CSPACE operator (C compiler) defined, PGM 2-64 css network driver 4.2BSD improvement, SYS 1-15 ctags 4.2BSD improvement, SYS 1-5 ctime library 4.2BSD improvement, SYS 1-14 CTRL-B defined, GEN 3-75 description, GEN 3-56 CTRL-C ULTRIX-32 and, GEN 2-1 CTRL-D See also CTRL-U defined, GEN 4-65 defined, GEN 3-75 program faults and, GEN 1-31 description, GEN 3-56 terminating a program and, GEN 4-37 Cover sheet entering in text file, GEN 5-5 formatting commands, GEN 5-5E cp command (C shell) 4.2BSD improvement, SYS 1-5 CTRL-E defined, GEN 3-75 description, GEN 3-56 CTRL-F defined, GEN 3-75 description, GEN 3-56 CTRL-G copying a file, GEN 2-7E, 3-47 defined, GEN 3-75 defined, GEN 4-65 vi and, GEN 3-57 saving a file, GEN 3-47E cpu type parameter (config) defined, SYS 5-79 CR key See RETURN key Crash recovering files after, GEN 3-22 creat function description, PGM 1-10 creat system call obsolete in 4.2BSD, SYS 1-10 cref program defined, GEN 2-13 crmode routine defined, PGM 4-84 crt option (Mail) paging mail, GEN 2-20 type command and, GEN 2-32 crt0.ex file 4.2BSD improvement, SYS 5-13 cs command (troff) defined, GEN 5-58 csh program See C shell cshre file defined, GEN 4-65 logging in and, GEN 4-39 
CTRL-H See also At sign See also u command (edit) defined, GEN 3-75 deleting characters, GEN 3-7 CTRL-J defined, GEN 3-75 CTRL-L defined, GEN 3-75 CTRL-M defined, GEN 3-75 CTRL-N defined, GEN 3-75 CTRL-P defined, GEN 3-76 CTRL-R defined, GEN 3-76 CTRL-U See also CTRL-D defined, GEN 3-76 description, GEN 3-56 ULTRIX-32 and, GEN 2-1 CTRL-Y defined, GEN 3-76 description, GEN 3-56 CTRL-Z defined, GEN 3-76 Index-13 d flag (make) cu command (nroff) defined, PGM 3-17 defined, GEN 5-67 d operator (vi) cu program defined, GEN 3-80 See tip program d option (inv) Current line defined, GEN 5-147 printing, GEN 3-11E curses library 4.2BSD improvement, SYS 1-14 Cursor motion optimization stand alone, PGM 4-78 to 4-80 Cursor positioning key terminals and, GEN 3-55 d option (uucico) defined, SYS 5-135 d option (uuclean) defined, SYS 5-137 d option (uucp) defined, SYS 5-131 DA command (ms) Cut mark specifying for troff, GEN 5-74E specifying date on text page, GEN Cutting and pasting See cp command (ed) 5-9 da command (nroff/troff) See m command (ed) defined, GEN 5-65 See mv program (ed) Daisy wheel printer with ed, GEN 3-49 to 3-51 with UNIX commands, GEN 3-47 to 3-49 cwd variable (C shell) defined, GEN 4-65 working directory and, GEN 4-41 setting for 12-pitch, GEN 5-39 DARPA File Transfer Protocol server program See ftpd program DARPA Internet network architecture support, SYS Cylinder group description, SYS 1-26, 2-8 Czech See Metacharacters 1-15 DARPA Internet protocol support, SYS 5-47 DARPA Request For Comments #833 sendmail and, SYS 1-4 D DARPA Simple Mail Transfer Protocol d command (DC) description, GEN 2-58 d command (ed) defined, GEN 3-34 using, GEN 3-29 d command (edit) context search and, GEN 3-16 description, GEN 3-15 d command (ex) description, GEN 3-88 d command (me) defined, GEN 5-43 d command (sed) defined, GEN 3-108 D command (vi) defined, GEN 3-78 d escape (Mail) sendmail and, SYS 1-4 DARPA TELNET protocol See telnetd server program DARPA Trivial File 
Transfer Protocol See tftpd server program Dash specifying em dash, GEN 5-47 Data block kinds of, SYS 2-12 Data file defined, SYS 5-131 DATA operator (C compiler) defined, PGM 2-64 Data segment (as) description, GEN 6-54 data statement See also debug option defined, GEN 6-59 Data Translation A/D converter See ad driver debugging information and, GEN Datagram socket description, GEN 2-24 d flag (Mail) 2-36 Index-14 See also Raw socket Datagram socket (Cont.) delete command (edit) creating for on-machine use, SYS 3-7E defined, SYS 3-6 description, SYS 3-10 sending broadcast packets on See d command (edit) delete command (ex) See d command (ex) delete command (Mail) See also autoprint option (Mail) See also dt command (Mail) networks, SYS 3-27 Date See also undelete command specifying with -me, GEN 5-47 (Mail) specifying with -ms, GEN 5-9 abbreviating, GEN 2-20 date command (C shell) description, GEN 2-29 defined, GEN 4-65 keeping message from mbox, GEN using, GEN 2-4 dbx symbolic debugger description, 2-20E DELETE key SYS 1-4 defined, GEN 4-65 Pascal compiler pc and, SYS 1-8 description, GEN 3-55 DC program See also BC language ULTRIX-32 and, GEN 2-1 deleteln routine defined, GEN 2-57 description, GEN 2-57 to 2-64 defined, PGM 4-82 delivermail program internal arithmetic and, GEN 2-60 See sendmail program delwin routine programming, GEN 2-62 de command (nroff/troff) See also ig command (nroff/troff) defined, GEN 5-64 defining macros, GEN 5-89E defined, PGM 4-85 DES encryption algorithm chips and, SYS 4-11 Description file (make), PGM 3-14E See also -f flag (make) Dead.letter file, GEN 2-24 canceling mail and, GEN 2-18 description, PGM 3-15 to 3-16 Detached command debug option (Mail) See also -d flag defined, GEN 4-65 Device driver defined, GEN 2-34 converting local to 4.2BSD, SYS Debugging 5-4 defined, GEN 4-65 DecWriter III printer setting for serial lines, PGM 4-101E Default defined, GEN 4-65 define command (M4) description, PGM 2-393 to 2-395 define keyword 
(BC), GEN 2-46E define program (EQN) description, GEN 5-100 define statement (BC) forming, GEN 2-55 delay routine description, PGM 2-76 Delayed text defined, GEN 5-28 delch routine defined, PGM 4-82 delete command (ed) See d command (ed) CSR value list, SYS 5-61 I/0 system and, PGM 4-67 to 4-73 installing new, SYS 5-119 prerquisites, SYS 5-89 Device name convention, SYS 5-19 devices.vax file 4.2BSD improvement, SYS 5-11 df reporting disk space in kilobytes, SYS 1-5 dh.c device driver 4.2BSD improvement, SYS 5-12 di command (nroff/troff) defined, GEN 5-64 diverting output to a macro, GEN 5-94 Diacritical marks available reference list, GEN 5-19 Index-15 Diacritical marks (Cont.) entering with EQN, GEN 5-100 Diagnostic defined, GEN 4-65 Diagnostic output dirs command (C shell) See also pwd command (C shell) compared with pwd, GEN 4-49 defined, GEN 4-66 saving name of previous directory, GEN 4-49 redirecting, GEN 4-44K Dial-up network Disk description, SYS 5-123 to 5-129 balancing load, SYS 5-39 operation, SYS 5-124 configuring load, SYS 5-37 to 5-43 processing, SYS 5-125 to 5-126 defined, GEN 3-4 protocol and, SYS 5-124, 5-126 security, SYS 5-125 dividing into partitions, SYS 5-38 formatting, SYS 5-22 to 5-24 starting your network, SYS 5-128 reporting space in kilobytes, SYS transmission speed, SYS 5-127 uses, SYS 5-126 Diction program See also Style program description, GEN 5-163 to 5-177 diff utility comparing files, GEN 2-13 1-5 reporting usage in kilobytes, SYS 1-5 space limits, SYS 4-3 space per device, SYS 5-38, 5-39T Disk bandwith 4.2BSD improvement, SYS 1-3 Disk driver dir 4.2BSD improvement, SYS 1-16 dir.h file 4.2BSD improvement, SYS 5-6 directories command See dirs command (C shell) Directory See also Home directory UNIX implementation and, PGM 4-9 Disk partition description, SYS 5-19 sizes, SYS 5-38 Disk quota 4.2BSD improvement, SYS 1-18 See also Root directory disabling, SYS 2-4 See also Working directory enabling, SYS 2-4 allocating, SYS 1-33 alternate 
name for, GEN 2-10 enforcing, SYS 5-57 per filesystem, SYS 1-4 changing, GEN 2-10 per user, SYS 1-4 changing working directory, GEN recovering from over quota 2-10 condition, SYS 2-3 creating, GEN 2-10 restricting, SYS 1-35 defined, GEN 4-66, PGM 4-10 setting, SYS 2-4 description, GEN 1-21, 2-9 determining, GEN 2-10 listing basic, GEN 2-9 moving up one level, GEN 2-10E organization changes for 4.2BSD, SYS 5-4 types of, SYS 2-3 Disk quota system configuration requirement, SYS 5-57 description, SYS 2-3 to 2-5 establishing, SYS 2-4 project-related, GEN 4-48 history, SYS 2-5 removing, GEN 2-10E including, SYS 2-4K security of, SYS 4-4 programs, SYS 5-57 Directory data block defined, SYS 2-12 directory library 4.2BSD improvement, SYS 1-14 directory option (ex) description, GEN 3-98 Directory stack defined, GEN 4-66 Index-16 diskpart program 4.2BSD improvement, SYS 1-19 disktab file 4.2BSD improvement, SYS 1-16 Display (nroff) defined, GEN 5-25, 5-42 description, GEN 5-25 to 5-27 specifying in fill mode, GEN 5-26 Display (nroff) (Cont.) 
text formatting commands for, GEN 5-15E distrib routine description, PGM 2-68 Distribution tape constructing, SYS 5-59 to 5-61 contents, SYS 5-59T Diversion (troff) description, GEN 5-94 divert command (M4) description, PGM 2-396 Division DC and, GEN 2-61 divhum command (M4) description, PGM 2-396 DL-11W See kg driver dmc network interface driver 4.2BSD improvement, SYS 1-15 DMC-11/DMR-11 point-to-point communications device See dmc network interface driver dmf.c device driver 4.2BSD improvement, SYS 5-12 dnl command (M4) description, PGM 2-397 Document preparation description, GEN 2-12 to 2-14 hints, GEN 2-13 to 2-14 reading list, GEN 2-16 DOD Standard TCP/IP network communication protocols support for, SYS 1-3 Dollar sign character (ed) Dot character (ed) determining value, GEN 3-29E equal sign and, GEN 3-35 line number defaults and, GEN 3-44 to 3-45 meaning, GEN 3-38, 3-39 meaning for context searching, GEN 3-33 p command and, GEN 3-28 printing, GEN 3-39 s command and, GEN 3-29 setting with semicolon, GEN 3-45 to 3-46 using, GEN 3-28, 3-33 Dot character (edit) equal sign and, GEN 3-17 uses, GEN 3-17 Dot character (nroff/troff) See Control character (nroff/troff) specifying lines of, GEN 5-88 dot option (Mail) See also ignoreof option defined, GEN 2-34 Doublespacing specifying, GEN 5-23 drtest program 4.2BSD improvement, SYS 1-19 DS command (ms) specifying line breaks, GEN 5-8 ds command (nroff/troff) defined, GEN 5-64 defining strings, GEN 5-89 DSTFLAG parameter description, SYS 5-122 dt command (Mail) end of line and, GEN 3-39 description, GEN 2-29 meaning, GEN 3-33, 3-40 dt command (nroff/troff) p command and, GEN 3-28 printing value, GEN 3-35 Dollar sign character (edit) equal sign and, GEN 3-17 printing last buffer line, GEN 3-17 searching and, GEN 3-20 domain.h file 4.2BSD improvement, SYS 5-5 don’t command (sed) defined, GEN 3-113 Dot character (C shell) at beginning of file, GEN 4-34 defined, GEN 4-63 separating filename components, GEN 4-33 defined, GEN 
5-65 du command (C shell) defined, GEN 4-66 reporting disk usage in kilobytes, SYS 1-5 du program See du command (C shell) dump program See also rdump program 4.2BSD improvement, SYS 1-16, 1-19 using, SYS 5-53 dumpdef command (M4) description, PGM 2-397 dumpfs program 4.2BSD improvement, SYS 1-19 Index-17 Dungeons of doom See Rogue game Dynamic string storage allocator See Allocator ed line editor (Cont.) copying lines, GEN 3-51 creating text, GEN 3-25 deleting text, GEN 3-29 description, GEN 2-6 E escaping to use UNIX command, e command (ed) defined, GEN 3-34 global commands, GEN 3-32 GEN 3-51 using, GEN 3-27, 3-49E e command (edit) copying a file, GEN 3-14 r option and, GEN 3-23 u command and, GEN 3-16 e command (ex) description, GEN 3-88 E command (vi) defined, GEN 3-79 e command (vi) defined, GEN 3-80 e escape (Mail) description, GEN 2-24 e flag (sed) defined, GEN 3-106 e modifier (C shell) extracting filename extension, GEN 4-57E e option (nroff) defined, GEN 5-50 ec command (nroff/troff) defined, GEN 5-66 ec network interface driver 4.2BSD improvement, SYS 1-15 echo command (C shell) defined, GEN 4-66 echo routine defined, PGM 4-84 ed line editor See also edit line editor See also ex line editor accessing, GEN 3-25 adding text, GEN 3-25 addressing lines, GEN 3-43 to 3-46 advanced editing, GEN 3-37 to 3-52 backslash character and, GEN 3-33 breaking lines, GEN 3-42 CAI script for, GEN 6-7 changing text, GEN 3-31 to 3-32 command summary, GEN 3-34 context searching, GEN 3-30 to 3-31 Index-18 inserting text, GEN 3-31 to 3-32 interrupting, GEN 3-46 introduction, GEN 3-25 to 3-35 joining lines, GEN 3-42 line number defaults, GEN 3-44 to 3-45 marking a line, GEN 3-50 moving text, GEN 3-32, 3-50 printing a file, GEN 2-7 printing lines, GEN 3-27 reading a file, GEN 3-27 rearranging a line, GEN 3-43 repeating searches, GEN 3-44 searching for first occurrence of text string, GEN 3-46 sed and, GEN 3-105 setting dot, GEN 3-45 to 3-46 specifying lines with text patterns, 
GEN 3-46 to 3-47 specifying the second occurrence of text string, GEN 3-46 substituting text, GEN 3-29 supporting tools, GEN 3-51 to 3-52 using special characters, GEN 3-33 writing a file, GEN 3-26 ed.hup file saving text, GEN 2-6 edcompatible option (ex) description, GEN 3-98 edit command (ed) See e command (ed) edit command (edit) See e command edit command (ex) See e command (ex) edit command (Mail) See also visual command (Mail) description, GEN 2-29 edit line editor See also ed line editor See also ex line editor accessing, GEN 3-5 to 3-6 adding text, GEN 3-9 correcting text, GEN 3-9 edit line editor (Cont.) else command (Mail) current line and, GEN 3-11 See also if/endif commands (Mail) defined, GEN 3-3 description, GEN 2-30 entering text, GEN 3-6 ex editor and, GEN 3-23 finding a line, GEN 3-11E issuing UNIX command from, GEN 3-21 messages, GEN 3-6 moving around in the buffer, GEN 3-17 opening a file, GEN 3-9E, 3-14E prerequisites, GEN 3-3 printing current line number, GEN 3-11 printing nonprinting characters, else statement (awk) defined, PGM 3-9 Elz, R. 
disk quota system, SYS 2-3 to 2-5 em | defined, GEN 5-86 em command (nroff/troff) defined, GEN 5-65 Em dash in nroff/troff output, GEN 5-19 Emphasis See Boldface See Italic GEN 3-10 See Overstriking quitting, GEN 3-8 See Underlining reversing last command, GEN 3-16 saving modified text, GEN 3-13 searching for characters, GEN 3-10, 3-10E tutorial, GEN 3-3 to 3-23 Editing hints for, GEN 2-13 Editor See ed editor See edit editor See ex editor See Screen editor en network interface driver 4.2BSD improvement, SYS 1-16 enable/disable command (Ipc) description, PGM 4-103 endif command (C shell) See if/fendif commands (C shell) endif command (Mail) endif statement (as) See if/endif statement (as) endwin routine defined, PGM 4-85 Entry file | See sed stream editor defined, GEN 5-145 See vi screen editor Environment (C shell) EDITOR option (Mail) displaying, GEN 4-51E defined, GEN 2-33 Environment (nroff/troff) setting, GEN 2-33 specifying an editor, GEN 2-24 edquota program 4.2BSD improvement, SYS 1-19 ef command (me) defined, GEN 5-41 efftab table defined, PGM 2-68 EFL programming language description, PGM 2-123 to 2-157 eh command (me) defined, GEN 5-41 el command (nroff/troff) defined, GEN 5-71 else command (C shell) See also if/endif commands (C shell) See also then command (C shell) defined, GEN 4-66 | See if/endif commands (Mail) description, GEN 5-71, 5-94 eo command (nroff/troff) ‘defined, GEN 5-66 EOF (End of File) defined, GEN 2-5, 4-66 EOF operator (C compiler) defined, PGM 2-64 EOF value defined, PGM 1-21 description, PGM 1-4 ep command (me) defined, GEN 5-42 EQ command (EQN) specifying continuation, GEN 5-35 specifying equations, GEN 5-34 supplementing with troff commands, GEN 5-101 EQ command (me) defined, GEN 5-45 Index-19 EQ command (ms) specifying equations, GEN 5-10 Escape character(C shell) defined, GEN 4-66 See also NEQN program escape command See ! 
command (ed) CAI script for, GEN 6-7 ESCAPE key EQN program connecting output to troff, GEN 5-101 deficiencies, GEN 5-102 defined, GEN 5-1056 description, GEN 5-33, 5-97 to 5-104 forcing extra white space, GEN 5-99 formatting mathematics, GEN 2-13 grammar, GEN 5-101 language design, GEN 5-98 language theory, GEN 5-101 quoting an input string, GEN 5-100 Equal sign (ed) dot character and, GEN 3-35 description, GEN 3-55 escape option (Mail) changing escape character, GEN 2—-26 defined, GEN 2-34 Escape sequence (nroff/troff) reference list, GEN 5-54 ev command (nroff/troff) changing environment, GEN 5-94 description, GEN 5-72 eval command (M4) description, PGM 2-396 Evans and Sutherland Picture | System 2 See ps.c device driver EVEN operator (C compiler) defined, PGM 2-64 formatting, GEN 5-33 numbering, GEN 5-34 even statement (as) defined, GEN 6-59 ex command (ex) See e command (ex) setting with -ms, GEN 5-10 text formatting commands for, GEN 5-16E ex line editor Equation continuing, GEN 5-35E Erase character See also Backspace character default, GEN 4-30 erase routine defined, PGM 4-82 errno cell description, PGM 1-12 errno.h file 4.2BSD improvement, SYS 5-5 error troff messages and, SYS 1-5 error bells option (ex) description, GEN 3-98 Error condition (fsck) conventions, SYS 2-14 Error log file examining, SYS 5-53 Error message (ed) description, GEN 3-26 errprint command (M4) description, PGM 2-397 Escape character (Mail) changing, GEN 2-26 Escape character (nroff/troff) description, GEN 5-66 Index-20 ex command (nroff/troff) defined, GEN 5-72 See also ed line editor See also edit line editor See also sed stream editor See also vi screen editor 3.5 changes, GEN 3-102 command line format, GEN 3-83 editing modes, GEN 3-85 encryption code and, GEN 3-102 entering multiple commands on a line, GEN 3-86 errors and, GEN 3-85 file manipulation, GEN 3-84 to 3-85 limitations, GEN 3-101 printing current line number, GEN 3-95 printing version number, GEN 3-94 recovering from 
crash, GEN 3-85 recovering work, GEN 3-85K reference manual, GEN 3-83 to 3-104 starting, GEN 3-83 vi and, GEN 3-73 Ex Reference Manual, GEN 3-83 to Expression (C shell) 3-104 See also ex line editor Examples entering with troff, GEN 5-89 Exception word list (nroff/troff) specifying, GEN 5-69 evaluating, GEN 4-55 Expression operator (as) reference list, GEN 6-57 Expression statement (as) defined, GEN 6-55 Expression statement (BC) Exclamation mark (C shell) using in command arguments, description, GEN 2-54 Extended Fortran Language GEN 4-35 Exclamation mark character (ed) See EFL programming language Extension shell command and, GEN 3-35 defined, GEN 4-67 Exclamation mark character (edit) External security code shell command and, GEN 3-21 Exclusive lock password security and, SYS 4-12 eyacc process and, SYS 1-3 4.2BSD improvement, SYS 1-5 execl function See also execv F See also fork function description, PGM 1-13 Execute file defined, SYS 5-133 to 5-134 execv routine description, PGM 1-13 exit command (C shell) defined, GEN 4-66 exit command (Mail) description, GEN 2-30 exit function error handling and, PGM 1-8 exit statement (awk) defined, PGM 3-9 exit status defined, GEN 4-66 exp function (awk) defined, PGM 3-8 Expansion defined, GEN 4-67 Exponentiation F argument (nroff) specifying fill mode, GEN 5-26 f command (ed) defined, GEN 3-34 determining the filename, GEN 3-49 f command (edit) description, GEN 3-21 f command (ex) description, GEN 3-89 f command (me) defined, GEN 5-43 entering, GEN 5-28 f command (troff) mixing fonts within a line, GEN 5-86 5-86 F command (vi) DC and, GEN 2-61 defined, GEN 3-79 using, GEN 3-61 description, GEN 2-52 defined, PGM 2-65 Expression defined, GEN 4-67 Expression (as) defined, GEN 6-56 types of reference list, GEN 6-57 Expression (BC) See also Primitive expression defined, GEN 2-50 to 2-53 length, GEN 2-51 mixing fonts within a word, GEN Exponentiation operator EXPR operator (C compiler) renaming a file, GEN 3-49E f command 
(vi) defined, GEN 3-80 using, GEN 3-61 f flag (Mail) defined, GEN 2-36 reading mail from specified file, GEN 2-21 f flag (make) defined, PGM 3-17 f flag (mkey) reading file list, GEN 5-147 f flag (sed) defined, GEN 3-106 feof macro f flag (su) breakpoints and, PGM 1-21 fast su and, SYS 1-9 ferror macro f macro (me) breakpoints and, PGM 1-21 defined, GEN 5-42 fflush function F option (hunt) description, PGM 1-8 defined, GEN 5-148 fg command (C shell) f option (troff) defined, GEN 4-67 defined, GEN 5-50 f77 I/O library running background job in foreground, GEN 4-47E 4.2BSD improvement, SYS 1-6 description, PGM 2-79 to 2-88 running suspended job in foreground, GEN 4-47 error messages, PGM 2-85 to 2-87 exceptions to ANSI standard, PGM 2-88 Fabry, R., & others 4.2BSD System Manual, PGM 4-15 to 4-52 Fabry, R.S., & others 4.2BSD Interprocess Communication Primer, SYS 3-5 to 3-28 fast file system, SYS 1-23 to 1-38 networking implementation notes, SYS 3-29 to 3-57 fgets function description, PGM 1-8 fgrep hunt program and, GEN 5-148 fi command (nroff/troff) defined, GEN 5-61 Field (awk) description, PGM 3-8 Field (nroff/troff) defined, GEN 5-66 Figure specifying blank page for, GEN 5-44 factor program 4.2BSD improvement, SYS 1-17 fastboot script See also fasthalt script 4.2BSD improvement, SYS 1-19C fasthalt script specifying ruling for, GEN 5-45 specifying space for, GEN 5-44 FILE defined, PGM 1-21 File See also fastboot script See also File system 4.2BSD improvement, SYS 1-19 See also specific files fc command (nroff/troff) defined, GEN 5-66 fchmod system call 4.2BSD improvement, SYS 1-10 fchown system call 4.2BSD improvement, SYS 1-10 fclose function description, PGM 1-7 fcntl system call 4.2BSD improvement, SYS 1-10 FCON operator (C compiler) defined, PGM 2-66 fed font editor value of, SYS 1-6 Feldman, S.I. EFL programming language, PGM 2-123 to 2-157 Make program, PGM 3-13 to 3-21 Feldman, S.I., & Weinberger, P.J.
Fortran 77 compiler, PGM 2-89 to 2-109 Index-22 advisory locking and, SYS 1-3 appending, GEN 3-48 appending contents to mail, GEN 2-24 | arranging, GEN 2-10 CAI script for, GEN 6-7 combining, GEN 2-10, 3-48, 3-49 comparing, GEN 2-13 copying, GEN 2-TE, 3-47 copying from other directories, GEN 2-9 creating, GEN 2-6 defined, GEN 2-6, 3-3, PGM 4-10 description, GEN 1-20 displaying, GEN 2-10 handling multiple, GEN 2-8 - I/O device and, GEN 1-21 marking executable, GEN 2-12 merging multiple, GEN 2-14E open limit, PGM 1-11 opening with edit, GEN 3-14 optimal size, SYS 1-28 File (Cont.) file command (ex) paging, GEN 2-7 See f command (ex) printing, GEN 2-7 file command (Mail) printing from other directories, See folder command (Mail) GEN 2-9 File descriptor printing merged, GEN 2-11 changing assignments, GEN 1-28 printing multiple, GEN 2-7, 2-8, description, PGM 1-8 2-11 File locking printing on high-speed printer, description, SYS 1-33 GEN 2-7 File pointer programs executed by the shell and, GEN 1-27 defined, PGM 1-5 File system protection information, SYS 4-3 recovering with edit, GEN 3-22 accessing directories on old and new systems, SYS 1-33 removing, GEN 3-48 block size, SYS 2-8 removing multiple from directory, checking structural integrity, SYS GEN 2-10E 2-10 renaming, GEN 2-7 data structure, PGM 4-12F replacing the terminal, GEN 2-10 defined, PGM 4-10 to 4-13 sending to several people, GEN 2-11 description, GEN 1-20 to 1-24 fixing corrupted, SYS 2-10 to 2-13 size of, GEN 1-23, 2-13 fragmentation of, splitting, GEN 2-13 implementation, PGM 4-11 truncating to specific length, SYS 1-4 viewing in other directories, GEN 2-9 SYS 2-9 implementing, GEN 1-24 to 1-26 overview, SYS 2-8 to 2-9 protecting, GEN 1-22 removable volume and, GEN 1-22 writing part of, GEN 3-49 updating, SYS 2-9 writing to disk, GEN 3-8 File system (4.2BSD) File (C shell) See also specific files accessing from other directories, GEN 4-34 directing input from, GEN 4-32E to 4-33K inputting to, GEN 4-31 
maintaining related, GEN 4-53 outputting from, GEN 4-31 redirecting terminal output to, GEN 4-31E terminating a command, GEN 4-36E File (line printer system) reference list, PGM 4-99 File (M4) manipulating, PGM 2-396 File (vi) See also File system (Bell) allocating data blocks, SYS 1-30 allocating directories, SYS 1-30 allocating new blocks, SYS 1-29 allocation strategy, SYS 1-30 block size, SYS 1-26 block size and wasted space, SYS 1-27T compared to previous file system, SYS 1-23 to 1-38 creating file versions, SYS 1-35 fragments and, SYS 1-27 free blocks and, SYS 1-28 hardware parameters and, SYS 1-28 to 1-29 implementing layout, SYS 5-42 layout policies, SYS 1-29 to 1-30 locking files, SYS 1-33 quitting, GEN 3-63 moving, SYS 5-54 recovering, GEN 3-66 optimizing storage, SYS 1-26 writing, GEN 3-63 organization, SYS 1-26 to 1-30 file command symbolic links and, SYS 1-6 file command (edit) See f command (edit) performance, SYS 1-31 to 1-32 quotas and, SYS 2-4 reading rates, SYS 1-31T restricting quota, SYS 1-35 Index-23 File system (4.2BSD) (Cont.) 
selecting parameters, SYS 5-40 to 5-41 software engineering, SYS 1-36 find finding symbolic links, SYS 1-6 Find key defined, GEN 5-144 space overhead, SYS 1-28 First page writing rates, SYS 1-31T entering in text file, GEN 5-5 fl command (nroff/troff) File system (Bell) description, SYS 1-25 File System Check Program See fsck program file.h file 4.2BSD improvement, SYS 5-6 Filelist file creating, GEN 2-10 Filename 4.2BSD changes, SYS 5-4 arbitrary length and, SYS 1-3 changing, GEN 3-47, 3-4TW restriction, GEN 3-47 conventions for, GEN 2-8 description, GEN 1-21 edit editor and, GEN 3-21 folder name and, GEN 2-23 maximum length, SYS 1-33 renaming in same file system, SYS 1-4 defined, GEN 5-73 Flag (C shell) purpose of, GEN 4-31 Flag (ex) description, GEN 3-86 Flag (Mail) reference list, GEN 2-41T Flag option (C shell) defined, GEN 4-67 Flag option (Mail) defined, GEN 2-38 flags field (config description, SYS 5-82 Floating keep, GEN 5-26F defined, GEN 5-26 flock system call 4.2BSD improvement, SYS 1-10 fmt command formatting outgoing mail, GEN 2-26 specifying, GEN 3-8 suggestions, GEN 2-7 Filename (C shell) base part and, GEN 4-63 characters in, GEN 4-33 defined, GEN 4-67 fo command (me) defined, GEN 5-41 entering, GEN 5-23 Foderaro, J.K., & others Franz Lisp Manual, The, PGM 2-211 to 2-358 Filename expansion defined, GEN 4-67 FILENAME variable (awk) determining current input file, PGM 3-6 files file 4.2BSD improvement, SYS 5-11 Folder specifying for file, GEN 2-23 folder command (Mail) See also folders command (Mail) description, GEN 2-30 directing Mail to a folder, GEN 2-23 adding device driver and, SYS 5-89 files.vax file 4.2BSD improvement, SYS 5-11 Fill mode specifying, GEN 5-26 Filling (nroff/troff) description, GEN 5-60 to 5-61 filsys.h file See fs.h file Filter Folder directory specifying, GEN 2-23 Folder facility description, GEN 2-23 folder option (Mail) defined, GEN 2-34 Folders maintaining, GEN 2-23 folders command (Mail) See also folder command (Mail) 
calling, PGM 4-103E description, GEN 2-30 creating for printers, PGM 4-102 listing folder set, GEN 2-23 defined, GEN 4-4 description, GEN 1-28 Font changing, GEN 5-58, 5-86 Font (Cont.) command list, GEN 5-51 default, GEN 5-58 defined, GEN 5-36 description, GEN 5-36 to 5-37 mixing within a line, GEN 5-86 mixing within a word, GEN 5-37, 5-86 setting, GEN 5-39 foreach command (C shell) (Cont.) performing similar commands, GEN 4-60E Foreground defined, GEN 4-67 Foreground job continuing, GEN 4-46 description, GEN 4-45 to 4-48 suspending, GEN 4-46 specifying, GEN 5-44, 5-85 fork function specifying for a word, GEN 5-36E specifying for more than one word, Form feed character GEN 5-36 style examples, GEN 5-78T switching, GEN 5-36 Font library installing, SYS 5-31 Footer See also Header formatting, GEN 5-41 to 5-42 specifying, GEN 5-23 Footnote See also Delayed text entering, GEN 5-8, 5-28, 5-43 entering with a macro, GEN 5-76E numbered automatically, GEN 5-17 resetting the numbering, GEN 5-46 separating footnotes, GEN 5-43 specifying point size, GEN 5-8 text formatting commands for, GEN 5-15E fopen function description, PGM 1-14 printing, GEN 3-37 Form letter using with nroff/troff, GEN 5-72 format program 4.2BSD improvement, SYS 1-18, 1-19, 5-15 formatting disks, SYS 5-22 to 5-24 loading, SYS 5-23 Fortran See f77 I/O library See Fortran 77 See Ratfor language Fortran 77 C and, GEN 2-15 running old programs, PGM 2-83 Fortran 77 compiler 4.2BSD improvement, SYS 1-4 description, PGM 2-89 to 2-109 Fortran I/O See also f77 I/O library constraints, PGM 2-80 to 2-82 execution, PGM 2-80 See also fclose function forms of, PGM 2-79 to 2-80 See also open function general concepts, PGM 2-79 to calling, PGM 1-5E description, PGM 1-5 for loop description, GEN 4-7 form, GEN 4-8E for statement (awk) defined, PGM 3-9 for statement (BC) 2-80 logical units and, PGM 2-80 unit numbers and, PGM 2-80 fortune game 4.2BSD improvement, SYS 1-17 Forward slash searching for, GEN 3-39 fp
command forming, GEN 2-54 specifying fonts on the typesetter, process, GEN 2-47 GEN 5-86 fp compiler/interpreter Functional Programming language writing, GEN 2-47 For system call description, GEN 1-26 foreach command (C shell), GEN 4-56E defined, GEN 4-67 exiting loop, GEN 4-58 and, SYS 1-6 FP programming language description, PGM 2-359 to 2-391 fpr program printing Fortran files, SYS 1-6 Index-25 fprintf function fsync system call description, PGM 1-7 Fraction 4.2BSD improvement, SYS 1-11 ft command (troff) setting with troff, GEN 5-86K defined, GEN 5-59 specifying with EQN, GEN 5-99 specifying fonts, GEN 5-86 Fragment size FTP server selecting, SYS 5-41 frame.h file description, SYS 5-50 ftp server program 4.2BSD improvement, SYS 5-13 ARPA file transfer protocol and, Franz Lisp Manual, The, PGM SYS 1-6 ftpd server program 2-211 to 2-358 See also Franz Lisp system Franz Lisp system 4.2BSD improvement, SYS 1-19 ftpusers file user manual, PGM 2-211 to 2-358 from command (Mail) description, SYS 5-50 ftruncate system call 4.2BSD improvement, SYS 1-11 description, GEN 2-30 message lists and, GEN 2-28 Function (BC) from keyword (EQN), GEN 5-100E description, GEN 2-45 to 2-46 Front matter number permitted, GEN 2-45 specifying, GEN 5-33 Function call 4,2BSD improvement, SYS 1-16 Function identifier fs defined, GEN 2-51 FS command (ms) description, GEN 2-50 specifying footnotes, GEN 5-8 fz command (nroff/troff) specifying font size, GEN 5-81 F'S variable (awk) defined, PGM 3-6 G fs.h file 4.2BSD improvement, SYS 5-5 fscanf function g command (ed) See also sscanf function defined, GEN 3-34 description, PGM 1-8 process, GEN 3-46 fsck program s command and, GEN 3-46E See also badsect program 4.2BSD improvement, SYS 1-19 checking connectivity, SYS 2-12 checking directory data blocks, SYS 2-12 s command restriction and, GEN 3-47 specifying line numbers, GEN 3-47 specifying lines with text patterns, checking free blocks, SYS 2-10 GEN 3-46 to 3-47 checking inode block count, SYS 
specifying more than one command, GEN 3-47 2-12 checking inode links, SYS 2-11 using, GEN 3-32 checking inode state, SYS 2-11 g command (edit) | checking super-block, SYS 2-10 description, GEN 3-19 description, SYS 2-7 to 2-25 p command and, GEN 3-19 error conditions, substitute command and, GEN SYS 2-14 to 2-25 rebuilding block allocation maps, SYS 2-11 fsplit program splitting multi-function Fortran files, SYS 1-6 fstab library 4.2BSD improvement, SYS 1-15 fstat system call 4.2BSD improvement, SYS 1-11 Index-26 3-19 uppercase letters and, GEN 3-19 using, GEN 3-19E g command (ex) description, GEN 3-89 G command (sed) defined, GEN 3-113 g command (sed) defined, GEN 3-113 G command (vi) gettimeofday system call (Cont.) defined, GEN 3-79 specifying value, SYS 5-74 finding text lines, GEN 3-57 g flag (sed) gettmode routine defined, PGM 4-88 defined, GEN 3-110 variables set by, PGM 4-90T g option (hunt) getty program See also gettytab file defined, GEN 5-148 4.2BSD improvement, SYS 1-18, g option (troff) defined, GEN 5-50 1-19 g option (uucp) gettytab file defined, SYS 5-132 4.2BSD improvement, SYS 1-16 getwd library gcore program creating a core dump of running process, SYS 1-6 getyx routine defined, PGM 4-85 genassym.c file 4.2BSD improvement, SYS 5-14 GID description, SYS 4-4 getc macro defined, PGM 1-6 getch routine 4.2BSD improvement, SYS 1-15 global command (ed) ' defined, PGM 4-84 getchar macro input and, PGM 1-4 getdtablesize system call 4.2BSD improvement, SYS 1-11 getgroups system call 4.2BSD improvement, SYS 1-11 gethostbynameandnet routine, SYS 3-13E gethostid system call 4.2BSD improvement, SYS 1-11 gethostname system call 4.2BSD improvement, SYS 1-11 getitimer system call 4.2BSD improvement, SYS 1-11 getpagesize system call 4.2BSD improvement, SYS 1-11 getpass library 4.2BSD improvement, SYS 1-14 getpriority system call 4.2BSD improvement, SYS 1-11 getrlimit system call 4.2BSD improvement, SYS 1-11 getservbyname routine specifying a protocl, SYS 3-14 
getsockopt system call 4.2BSD improvement, SYS 1-11 getstr routine defined, PGM 4-84 gettable program 4.2BSD improvement, SYS 1-19 retrieving NIC host data base, SYS 5-48 gettimeofday system call 4.2BSD improvement, SYS 1-11 See g command (ed) See v command (ed) global command (edit) See g command (edit) global command (ex) See g command (ex) globl statement (as) defined go flag accessing sdb symbol information, SYS 1-5 goto command (C shell) defined, GEN 4-67 form of, GEN 4-58E gprof command profiled systems and, SYS 5-78 gprof program See also gprof.h file displaying execution time, SYS 1-6 gprof.h file 4.2BSD improvement, SYS 5-5 Graham, S.L., & others Berkeley Pascal User Manual, PGM 2-159 to 2-209 Grave accent See Metacharacters Greek letters setting with -ms, GEN 5-10 setting with troff, GEN 5-86E troff command list, GEN 5-96 grep command (C shell) defined, GEN 4-67 grep program finding lines with combinations of text patterns, GEN 3-51 Index-27 grep program (Cont.) finding lines without specified text, GEN 3-51E finding specified text in a set of Hard limit defined, SYS 2-3 Hard lock compared to advisory lock, SYS files, GEN 3-51, 3-51E nonalphabetic characters and, GEN 3-51 spell and, GEN 2-13 using, GEN 2-13E Grep program searching for text patterns, GEN 2-13 Group Identification Number See GID Group set description, SYS 1-3 1-33 Hardcopy terminal vi and, GEN 3-73 hardtabs option (ex) description, GEN 3-98 Hash character See Sharp character Hat See Circumflex character (ed) hec command (nroff/troff) defined, GEN 5-69 he command (me) grouping command (sed) defined, GEN 5-41 defined, GEN 3-113 entering, GEN 5-23 groups program display access list for user’s group, SYS 1-6 head command (C shell) defined, GEN 4-68 Header See also Footer formatting, GEN 5-41 to 5-42 H specifying, GEN 5-23 suppressing, GEN 2-36 H command (sed) defined, GEN 3-113 h command (sed) defined, GEN 3-113 h command (troff) moving text backwards on a line, GEN 5-87 specifying horizontal 
motion, GEN 5-68 H command (vi) defined, GEN 3-79 h escape (Mail) Header field defined, GEN 2-38 headers command (Mail) See also ignore command (Mail) abbreviating, GEN 2-30 description, GEN 2-30 help command (Mail) description, GEN 2-30 restriction, GEN 2-30 using, GEN 2-22 Henry, R.R., & Reiser, J.F. Berkeley VAX/UNIX Assembler description, GEN 2-25 Reference Manual, PGM 4-53 h flag (Mail) defined, GEN 2-36 H macro (me) specifying column heads on continuing pages, GEN 5-42 h macro (me) defined, GEN 5-42 h option (inv) defined, GEN 5-147 h option (nroff) defined, GEN 5-81 Haley, C.B., & others Berkeley Pascal User Manual, PGM 2-159 to 2-209 hangman game 4.2BSD improvement, SYS 1-17 Index-28 to 4-65 Here document description, GEN 4-9 to 4-10 Hexadecimal notation BC language and, GEN 2-44 hier 4.2BSD improvement, SYS 1-17 history command (C shell) defined, GEN 4-68 repeating previous commands, GEN 4-43 History list description, GEN 4-41 to 4-43 using, GEN 4-42E hl command (me) defined, GEN 5-45 hl command (me) (Cont.) hy network interface driver 4.2BSD improvement, SYS 1-16 figures and, GEN 5-26 hold command (Mail) Hyphen entering with text, GEN 5-22 See also preserve command (Mail) description, GEN 2-31 Hyphenation (nroff/troff) automatic, GEN 5-69 hold option (Mail) command list, GEN 5-52 defined, GEN 2-34 storing mail, GEN 2-20 Hyphenation indicator character specifying, GEN 5-69 Home directory defined, GEN 4-68 HZ parameter description, SYS 5-122 returning to, GEN 4-49 HOME variable (Bourne shell) description, GEN 4-11 I home variable (C shell) displaying your home directory, i command (DC) changing the base of input GEN 4-41 numbers, GEN 2-62 Horizonal line description, GEN 2-59 See Ruling Horton, M., & Joy, W. 
i command (ed) editing with vi, GEN 3-53 to 3-82 defined, GEN 3-34 Ex Reference Manual, GEN 3-83 using, GEN 3-31 to 3-32 i command (ex) to 3-104 description, GEN 3-89 Host name represented by hostent structure, defined, GEN 5-44 SYS 3-12K Hostent structure getting for host, SYS 3-13E specifying italic font, GEN 5-36 I command (ms) specifying italic, GEN 5-8 hostid program displaying system unique identifier, i command (me) SYS 1-6 hostname program setting host name, SYS 1-6 hosts database 4.2BSD improvement, SYS 1-16 hosts.equiv file description, SYS 5-49 hp.c device driver 4.2BSD improvement, SYS 5-14 htable program converting NIC host data base, SYS 5-48 hunt program defined, GEN 5-146 description, GEN 5-148 ferep and, GEN 5-148 options list, GEN 5-148 timing, GEN 5-149 hw command (nroff/troff) defined, GEN 5-69 hx command (me) defined, GEN 5-41 hy command (nroff/troff) defined, GEN 5-69 i command (sed) See also a command (sed) defined, GEN 3-109 I command (vi) defined, GEN 3-79 i command (vi) defined, GEN 3-81 description, GEN 3-58 i flag (Mail) See also ignore option defined, GEN 2-36 i flag (make) defined, PGM 3-17 i flag (mkey) ignoring lines, GEN 5-147 I option changed to -1, SYS 1-6 i option specifying directory search paths, SYS 1-6 i option (hunt) defined, GEN 5-148 i option (inv) defined, GEN 5-148 i option (nroff/troff) defined, GEN 5-49 Index-29 i-list if statement (BC) description, GEN 1-24 i-node defined, PGM 4-10 file description and, GEN 1-24 i-number defined, GEN 1-24 I/0 forming, GEN 2-54 restriction, GEN 2-47 writing, GEN 2-47 ifdef command (M4) description, PGM 2-395 ifelse command (M4) description, PGM 2-397 essentials of, GEN 1-23 to 1-24 I/0 request multiplexing among sockets and files, SYS 3-11 I/O system description, PGM 4-8 to 4-10 overview, PGM 4-67 to 4-73 ibase defined, GEN 2-44, 2-51 icheck program 4.2BSD improvement, SYS 1-19 ident parameter (config) defined, SYS 5-79 Identifier IFS variable defined, GEN 4-12 ig command (nroff/troff) 
defined, GEN 5-73 ignore command (Mail) description, GEN 2-31 ignore option (Mail) See also i flag (Mail) defined, GEN 2-34 ignorecase option (ex) description, GEN 3-98 ignoreeof variable (C shell) defined, GEN 4-68 setting, GEN 4-41E defined, GEN 2-51 ignoreeof option (Mail) kinds of, GEN 2-50 See also dot option Identifier (as) defined, GEN 6-53 ie command (nroff/troff) defined, GEN 5-71 if command (Bourne shell) description, GEN 4-13 to 4-14 if command (C shell) See if/endif commands (C shell) if command (Mail) See if/endif commands (Mail) if command (nroff/troff) defined, GEN 5-71 if/endif commands (C shell) See also else command (C shell) See also then command (C shell) defined, GEN 4-66, 4-68 forms of, GEN 4-56 to 4-57 if/endif commands (Mail) description, GEN 2-31 restriction, GEN 2-31 if/endif commands (nroff/troff) description, GEN 5-93 to 5-94 reference list, GEN 5-52 if/endif statement (as) defined, GEN 6-59 if statement (as) See if/endif statement (as) if statement (awk) defined, PGM 3-9 defined, GEN 2-34 ik driver 4.2BSD improvement, SYS 1-16 ik.c device driver 4.2BSD improvement, SYS 5-12 Ikonas frame buffer graphics device interface See ik driver Ikonas frame buffer graphics interface See ik.c device driver il network interface driver 4.2BSD improvement, SYS 1-16 Image defined, GEN 1-26 imp network interface driver 4.2BSD improvement, SYS 1-16 IMP-11A LH/DH IMP interface See css network driver in command (me) See also ix command (me) entering, GEN 5-24 in command (nroff/troff) defined, GEN 5-62 in_cksum.c file 4.2BSD improvement, SYS 5-13 include command (M4) description, PGM 2-396 incr command (M4) description, PGM 2-395 insertln routine indent program formatting C program source, SYS defined, PGM 4-82 install command, SYS 5-556K 1-6 install script Indention installing software, SYS 1-6 command list, GEN 5-51 int function (awk) resetting base, GEN 5-45 defined, PGM 3-8 specifying, GEN 5-24 specifying with nroff/troff, GEN Interlan Ethernet
interface See il network interface driver 5-62 Intermediate language (C compiler) description, PGM 2-63 to 2-66 Index See Table of contents Internet address index command (M4) binding, SYS 3-24 to 3-26 description, PGM 2-397 binding in Internet domain, SYS 3-8E Index entry specifying, GEN 5-43 binding with wildcard address, Indexing SYS 3-25E description, GEN 5-143 to 5-155 Internet port Indirect block printing, SYS 3-16E inode and, SYS 2-8 Interprocess communication init program 4.2BSD improvement, SYS 1-19 description, SYS 3-5 to 3-28 description, GEN 1-30 transferring data, SYS 3-9E Interprocess communication init_main.c file facilities contents, SYS 5-8 4.2BSD improvement, SYS 1-3 init_sysent.c file Interrupt message contents, SYS 5-8 description, GEN 3-9 initscr routine Interrupt signal defined, PGM 4-86 inode allocations states, SYS 2-11 defined, SYS 2-8 See also onintr command (C shell) See also stty command (C shell) disk space and, SYS 2-8 creating, GEN 1-31 types of, SYS 2-11 defined, GEN 4-68 ignoring, GEN 2-36 Inode table scripts and, GEN 4-59 setting size, SYS 5-121 intro system call inode.h file 4.2BSD improvement, SYS 5-6 input defined, GEN 4-68 Input base DC and, GEN 2-62 Input mode description, GEN 3-7 Input/output See I/O insch routine defined, PGM 4-82 Insert command (ed) See i command (ed) insert command (ex) See i command (ex) insert command (vi) See i command (vi) 4.2BSD improvement, SYS 1-10 inv program defined, GEN 5-146 description, GEN 5-147 options list, GEN 5-147 Inverted indexes See Indexing I/O library restriction, GEN 2-15 ioctl system call 4.2BSD improvement, SYS 1-11 ioctl.h file 4.2BSD improvement, SYS 5-6 iostat reporting kilobytes per second transferred for each disk, SYS 1-6 ip command (me) join command (ex) See also np command defined, GEN 5-40 See j command (ex) Joy, W. specifying with label, GEN 5-30 C shell introduction, GEN 4-29 to IP command (ms) indenting paragraphs, GEN 5-7 4-74 Joy, W., & Horton, M.
references and, GEN 5-7E editing with vi, GEN 3-53 to 3-82 isprint library Ex Reference Manual, GEN 3-83 4.2BSD improvement, SYS 1-14 it command (nroff/troff) to 3-104 Joy, W., & Leffler, S.dJ. defined, GEN 5-65 4.2BSD on VAX/VMS, SYS 5-17 Italic See also Underlining to 5-71 Joy, W., & others bolding, GEN 5-44 4.2BSD Interprocess specifying, GEN 5-8 Communication Primer, troff and, GEN 5-66 ix command (me) SYS 3-5 to 3-28 4.2BSD System Manual, PGM defined, GEN 5-44 4-15 to 4-52 Berkeley Pascal User Manual, PGM 2-159 to 2-209 J fast file system, SYS 1-23 to 1-38 networking implementation notes, j command (ed) joining lines, GEN 3-42, 3-43E j command (ex) description, GEN 3-90 J command (vi) defined, GEN 3-79 SYS 3-29 to 3-57 Joyce, J., & Blau, R. Edit tutorial, GEN 3-3 to 3-23 Justifying (nroff/troff) command list, GEN 5-51 description, GEN 5-60 to 5-61 j number register (nroff/troff) defined, GEN 5-81 K Job defined, GEN 4-45, 4-69 determining current job, GEN 4-46 suspending, GEN 4-46 Job control command See also bg command (C shell) See also fg command (C shell) See also kill command (C shell) See also stop command (C shell) defined, GEN 4-69 Job name - beginning character, GEN 4-46 k command (DC) description, GEN 2-59 scale value and, GEN 2-60 k command (ed) marking a line, GEN 3-50E k command (ex) See also mark command (ex) description, GEN 3-90 k escape sequence (nroff/troff) description, GEN 5-68 k flag (mkey) specifying number of keys, GEN Job number defined, GEN 4-69 description, GEN 4-45 jobs command (C shell) defined, GEN 4-69 displaying jobs, GEN 4-47E Johnson, S.C. Lint command, PGM 3-39 to 3-50 tour through portable C compiler, PGM 2-37 to 2-61 Yacc, PGM 3-79 to 3-111 Index-32 5-147 k number register (nroff/troff) defined, GEN 5-81 Keep See also Floating keep defined, GEN 5-26 footnotes and, GEN 5-35 to 5-36 index entries and, GEN 5-35 to 5-36 ~ text formatting commands for, GEN 5-15E keep option (Mail) defined, GEN 2-34 Kernighan, B.W., & Cherry, L.L. 
typesetting mathematics, GEN 5-97 to 5-104 keepsave option (Mail) See also nosave option defined, GEN 2-35 kern_acct.c file contents, SYS 5-8 kern_clock.c file 4.2BSD improvement, SYS 5-8 kern_descrip.c file Typesetting Mathematics - User’s Guide, GEN 5-105 to 5-114 Kernighan, B.W., & Lesk, M.E. computer-aided instruction for UNIX, GEN 6-3 to 6-16 Kernighan, B.W., & others awk programming language, PGM 3-5 to 3-12 contents, SYS 5-8 kern_exec.c file contents, SYS 5-8 Kernighan, B.W., & Ritchie, D.M. M4 macro processor, PGM 2-393 to 2-398 kern_exit.c file contents, SYS 5-8 programming UNIX, PGM 1-3 to 1-24 kern_fork.c file contents, SYS 5-8 kern_mman.c file contents, PGM 2-159 to 2-209 SYS 5-8 kern_proc.c file contents, SYS 5-8 kern_prot.c file contents, SYS 5-8 kern_resource.c file contents, SYS 5-8 kern_sign.c file contents, SYS 5-8 kern_subr.c file contents, SYS 5-8 kern_synch.c file contents, SYS 5-8 kern_time.c file contents, SYS 5-8 kern_xxx.c file contents, SYS 5-8 Kernel 4.2BSD improvement, SYS 5-3 to 5-15 configuration, SYS 5-36 to 5-37 implementation, PGM 4-5 to 4-8 implementing devices, SYS 5-37 kernel.h file 4.2BSD improvement, SYS 5-5 Kernighan, B.W. advanced editing with ed, GEN 3-37 to 3-52 introduction to ed, GEN 3-25 to 3-35 Ratfor language, PGM 2-111 to 2-122 troff tutorial, GEN 5-83 to 5-96 UNIX for beginners, GEN 2-3 to 2-16 Kessler, P.B., & others Berkeley Pascal User Manual, Key defined, GEN 5-147 selected by program, GEN 5-145 Key file defined, GEN 5-145 Key letters reference list, GEN 5-152 Key-making program format used, GEN 5-145 Keyword supplementing, GEN 5-150 Keyword (BC) reserved reference list, GEN 2-50 Keyword parameter description, GEN 4-17 to 4-25 Keyword statement (as) defined, GEN 6-56 reference list, GEN 6-59 to 6-60 KF command (ms) moving blocks of text, GEN 5-9 kg driver 4.2BSD improvement, SYS 1-16 kgclock.c device driver 4.2BSD improvement, SYS 5-12 kgmon program See also gmon.out file 4.2BSD improvement, SYS
1-19 Kill character default, GEN 4-30 kill command (C shell) background commands and, GEN 4-37 background jobs and, GEN 4-47E defined, GEN 4-69 kill command (C shell) (Cont.) killing processes, GEN 2-11 Label (as) See Name label; Numeric label suspended jobs and, GEN 4-47 killpg library routine label command (sed) defined, GEN 3-114 See killpg system call LABEL operator (C compiler) defined, PGM 2-65 killpg system call 4.2BSD improvement, SYS 1-11 KL-11 last displaying remote host, See kg driver SYS 1-6 lastcomm Kowalski, T.J., & McKusick, M.K. fsck, SYS 2-7 to 2-25 indicating program activity, SYS 1-7 KS command (ms) Layer, K., & others keeping text blocks together, GEN 5-9, 5-94E Franz Lisp Manual, The, PGM 2-211 to 2-358 lc command (nroff/troff) L defined, GEN 5-66 LCK file L argument (nroff) description, SYS 5-143 centering and, GEN 5-27 Leader character (nroff/troff) specifying, GEN 5-27 setting, GEN 5-66 l command (DC) uninterpreted, GEN 5-66 Leadering programming DC, GEN 2-62 l command (ed) specifying with troff, GEN 5-88 backspaces and, GEN 3-37 Leading description, GEN 3-37 long lines and, GEN 3-37 LEARN driver program p command and, GEN 3-37 defined, GEN 6-3 tabs and, GEN 3-37 description, GEN 2-6 l command (me) directory structure, GEN 6-8 centering list elements, GEN 5-27 defined, GEN 5-42 entering, GEN 5-25 experience with students, GEN 6-8 specifying fill mode, GEN 5-26 specifying left justification, GEN introduction to UNIX, GEN 6-3 to 6-16 sequence of events, GEN 6-9 5-27 vi and, SYS 1-7 L command (vi) leaveok routine defined, PGM 4-86 Leffler, S.J. defined, GEN 3-79 l flag (mkey) specifying items to be ignored, GEN 5-147 defined, GEN 5-81 l option (C shell) description, GEN 2-6 l option (hunt) improvements in 4.2BSD, SYS 1-3 to 1-21 kernel and 4.2BSD, SYS 5-3 to 5-15 Leffler, S.J., & Joy, W.N.
defined, GEN 5-148 L-devices file 4.2BSD on VAX/VMS, SYS 5-17 to 5-71 defined, SYS 5-139 L-dialcodes file Leffler, S.J., & others 4.2BSD Interprocess defined, SYS 5-139 L.sys file Communication Primer, SYS 3-5 to 3-28 contents, SYS 5-135 defined, SYS 5-141 building 4.2BSD systems with config, SYS 5-73 to 5-105 L number register (nroff/troff) ownership of, See Vertical spacing SYS 5-138 4.2BSD System Manual, PGM 4-15 to 4-52 fast file system, SYS 1-23 to 1-38 Leffler, S.J., & others (Cont.) networking implementation notes, SYS 3-29 to 3-57 Line drawing (nroff/troff) description, GEN 5-68 Line length (nroff/troff) left keyword (EQN), GEN 5-100E len command (M4) specifying, GEN 5-62, 5-86 Line printer description, PGM 2-397 setting for serial lines, PGM 4-101 length function (awk) setting remote, PGM 4-101 defined, PGM 3-8 Line printer control program See lpc program Leres, C., & Shoens, K. Mail Reference Manual, GEN Line Printer Daemon See lpd program 2-17 to 2-41 Lesk, M.E. Line Printer Queue program See lpq program formatting tables, GEN 5-115 to 5-131 Line printer spooling system inverted indexes, GEN 5-143 to devices supported, PGM 4-99, SYS 5-44 5-155 preparing documents with -ms, file list, GEN 5-13 to 5-16 updating publication lists, GEN Line printer spooling system (4.2BSD) 5-155 to 5-162 using -ms macros with troff and See also lpc program; pac program nroff, GEN 5-5 to 5-12 4.2BSD improvement, SYS 1-4, Lesk, M.E., & Kernighan, B.W. computer-aided instruction for UNIX, GEN 6-3 to 6-16 1-7, 1-18 controlling access, PGM 4-100 to 4-101 Lesk, M.E., & Nowitz, D.A. error messages, PGM 4-103 to a dial-up network of UNIX 4-105 systems, SYS 5-123 to 5-129 filters and, PGM 4-102 setting up, PGM 4-101 to 4-102 Lesk, M.E., & Schmidt, E.
user manual, PGM 4-99 to 4-105 Lex program generator, PGM 3-113 to 3-125 Line spacing Linking LG command (ms) increasing type size, GEN 5-8 Lint command checking C programs, PGM 3-39 defined, GEN 5-66 to 3-50 lint command remaking, SYS 5-120 C and, GEN 2-15 libI77.a library See f77 I/O library Life game program for, PGM 4-94E Ligature (troff) types available, GEN 5-66 limit command (C shell) displaying current limitations, GEN 4-51E setting limits, GEN 4-51E Line See Line drawing (nroff/troff) Line dot See Dot character (ed) description, GEN 1-21 lg command (troff) libc.a library See Vertical spacing Lex program generator description, PGM 3-113 to 3-125 SYS 5-44 setting up, SYS 5-44 creating libraries from C source code, SYS 1-7 LINT configuration file using, SYS 5-88K LINT file 4.2BSD improvement, SYS 5-11 LINTRUP request See fcntl system call lisp option (ex) description, GEN 3-99 lisp option (vi) setting, GEN 3-68 Lisp program See also vlp program 4.2BSD improvement, SYS 1-7 Lisp program (Cont.)
editing with vi, GEN 3-68 log function (awk) defined, PGM 3-8 Logging in List defined, GEN 5-25 description, GEN 2-3 to 2-4 specifying in text, GEN 5-25 prerequisites, GEN 2-3 text formatting commands for, GEN 5-15E text formatting commands for nested, GEN 5-15E list command See ls command (C shell) List command (ed) See l command (ed) list command (ex) description, GEN 3-90 list command (Mail) description, GEN 2-31 list files command procedure, GEN 3-5 recording attempts, SYS 4-12 Logging out, GEN 3-8E description, GEN 2-5 Login directory startup file and, GEN 2-12 login file See also logout file background jobs and, GEN 4-48E defined, GEN 4-69 logging in and, GEN 4-39, 4-39E rlogin server and, SYS 1-7 telnetd server program and, SYS See ls command (C shell) list option (ex) description, GEN 3-99 listen system call 4.2BSD improvement, SYS 1-11 incoming requests and, SYS 3-9E ll command (me) See also xl command (me) defined, GEN 5-45 ll command (nroff/troff) defined, GEN 5-62 resetting line length, GEN 5-86KE 1-7 Login shell See also Script file defined, GEN 4-69 logging in and, GEN 4-39 logout command exiting from UNIX, GEN 3-8 logout command (C shell) defined, GEN 4-69 logout file See also login file C shell and, GEN 4-39 defined, GEN 4-69 ln creating symbolic links, SYS 1-7 lo command (me) London, T.B., & Reiser, J.F.
regenerating system software, SYS defined, GEN 5-45 lo network interface 5-117 to 5-122 setting up UNIX/32V V1.0, SYS 4.2BSD improvement, SYS 1-16 load command (DC) See l command (DC) local command (Mail) description, GEN 2-31 Local motion defined, GEN 5-67 Location counter (as) See also bss segment 5-107 to 5-115 longjmp library old semantics and, SYS 1-15 longjump library 4.2BSD improvement, SYS 1-15 longname routine defined, PGM 4-86 lookbib command checking the data base, GEN defined, GEN 6-55 Locore.c file 4.2BSD improvement, SYS 5-13 locore.s file 4.2BSD improvement, SYS 5-14 installing device driver and, SYS 5-119 LOG file description, SYS 5-142 5-150 Loop variables and, GEN 4-60 Low-level I/O description, PGM 1-8 to 1-12 lp command (me) defined, GEN 5-40 entering, GEN 5-29 LP command (ms) specifying block paragraphs, GEN m command (ed) caution, GEN 3-50 defined, GEN 3-34 5-5 lp.c device driver 4.2BSD improvement, SYS 5-12 lpc program 4.2BSD improvement, SYS 1-4, 1-18, 1-19 description, PGM 4-100 lpd program moving text, GEN 3-50E using, GEN 3-32 m command (edit) context search and, GEN 3-15 moving text, GEN 3-14 m command (ex) description, GEN 3-90 description, PGM 4-99 M command (vi) requests understood reference list, PGM 4-100 m command (vi) lpd server program 4.2BSD improvement, SYS 1-20 lpq program 4.2BSD improvement, SYS 1-7 description, PGM 4-100 lpr command (C shell) defined, GEN 4-69 lpr program lpd and, PGM 4-100 lprm program 4.2BSD improvement description, PGM 4-100 lq command (me) specifying quotation marks, GEN defined, GEN 3-79 defined, GEN 3-81 m escape (Mail) description, GEN 2-25 m option (nroff/troff) defined, GEN 5-49 m option (uuclean) defined, SYS 5-137 m option (uucp) defined, SYS 5-132 m1 command (me) defined, GEN 5-41 m2 command (me) defined, GEN 5-41 m3 command (me) defined, GEN 5-42 5-38 ls command (C shell) 4.2BSD improvement, SYS 1-7 defined, GEN 4-69 description, GEN 2-6 listing files in three columns, GEN 2-11
specifying numeric sort, GEN 4-32E ls command (Mail) displaying files on your terminal, GEN 2-10 m4 command (me) defined, GEN 5-42 M4 macro processor arguments, PGM 2-395 arithmetic built-ins, PGM 2-395 command line format, PGM 2-393 conditionals, PGM 2-397 defining macros, PGM 2-393 to 2-395 description, PGM 2-393 to 2-398 manipulating files, PGM 2-396 ls command (me) manipulating strings, PGM 2-397 entering, GEN 5-23 ls command (nroff/troff) operation, PGM 2-393 defined, GEN 5-61 lseek system call 4.2BSD improvement, SYS 1-11 description, PGM 1-11 lt command (nroff/troff) defined, GEN 5-70 printing, PGM 2-397 m4 macro processor 4.2BSD improvement, SYS 1-7 machdep.c file 4.2BSD improvement, SYS 5-14 machine file 4.2BSD improvement, SYS 5-4 Machine instruction statement (as) M syntax, GEN 6-60 to 6-63 machine type parameter (config) m command (ed) reversing two adjacent lines, GEN 3-50E defined, SYS 5-79 Macro (M4) defining, PGM 2-393 to 2-395 Macro (nroff) Mail (Cont.) defined, GEN 5-35 line width, GEN 2-37 defining, GEN 5-35E maintaining groups of mail, GEN naming, GEN 5-35 using, GEN 5-35E Macro (nroff/troff) arguments, GEN 5-63 2-23 message lists and user names, GEN 2-28 notification of, GEN 2-17 defined, GEN 5-62 paging, GEN 2-20 description, GEN 5-62 to 5-65 process, GEN 2-17 diversions, GEN 5-63 protecting, GEN 2-34E printing, GEN 5-73 reading, GEN 2-18 to 2-19 traps, GEN 5-64 reading in home directory, GEN Macro (troff) arguments and, GEN 5-92 to 5-93 2-21 reading next, GEN 2-19 arguments and blanks, GEN 5-93 reading other people’s, GEN 2-36 arguments and trailing recovering deleted, GEN 2-30 punctuation, GEN 5-92 Macro (vi) saving related in a file, GEN 2-32 searching for subjects, GEN 2-28 See also Word abbreviation sending, GEN 2-18 types of, GEN 3-68 sending multiple messages, GEN Macro definition (make), PGM 3-15E defined, PGM 3-15 Macro-invocation trap (nroff/troff) description, GEN 5-64 magic option (ex) description, GEN 3-96 magic
option (ex) description, GEN 3-99 Magnetic tape FORTRAN-77 and, PGM 2-84 2-28 sending remote, SYS 5-126 sending source program text, GEN 2-33 sending to file, GEN 2-27 sending to folder, GEN 2-27 sending to list, GEN 2-21 sending to multiple users, GEN 2-18 sending to other machines, GEN 2-26 to 2-27 sending to programs, GEN 2-27 Mail adding to mail list, GEN 2-25 sending to user name, GEN 2-27 answering, GEN 2-19 to 2-20 specifying mailbox, GEN 2-36 C shell watching for, GEN 4-39E terms defined, GEN 2-38 canceling, GEN 2-18 changing the subject line, GEN 2-25 commands to be executed by the shell, GEN 2-28 defined, GEN 2-38 deleting, GEN 2-20 description, GEN 2-5 filing, GEN 2-24 format, GEN 2-37 writing to others online, GEN 2-5 mail command abbreviating, GEN 2-20 description, GEN 2-31 uses of, GEN 2-18 Mail list editing, GEN 2-25 Mail program setting up, SYS 5-44 mail program forwarding, GEN 2-25 4.2BSD improvement, SYS 1-7 holding in system mailbox, GEN defined, GEN 4-69 escaping temporarily to command 2-31 including in other mail, GEN 2-25 indicating indirect recipients, GEN 2-25 mode, GEN 2-26 escaping temporarily to shell, GEN 2-25 keeping, GEN 2-35 reading folders, GEN 2-23 keeping outgoing, GEN 2-35 reference manual, GEN 2-17 to length restricted, GEN 2-37 2-41 mail program (Cont.)
sending source program text, GEN 2-33 makelinks command source modules and, SYS 5-78 maketemp command (M4) shell and, GEN 2-32 suspending, GEN 4-37E using, GEN 2-17 to 2-41 Mail Reference Manual See also Mail program Mail routing facility See sendmail mail system See also sendmail MAIL variable description, GEN 4-11 mailaddr description, PGM 2-396 man command (Bourne shell) printing the UNIX manual, GEN 4-15 printing UNIX manual, GEN 4-16F man command (C shell) accessing online programmer’s manual, GEN 4-63E, 4-69E using, GEN 2-6 Manual defined, GEN 4-69 4.2BSD improvement, SYS 1-17 Mailbox map command (ex) See also unmap command (ex) defined, GEN 2-38 description, GEN 3-90 mailrc file, GEN 2-21E Maranzano, J.F., & Bourne, S.R. defined, GEN 2-21 ADB debugging program, PGM specifying folder directory, GEN 2-23 make command command line format, PGM 3-16 operation, PGM 3-16 to 3-17 make depend command system source code and, SYS 5-77 make directory command See mkdir command (C shell) make program See also makefile 4.2BSD improvement, SYS 1-7 C and, GEN 2-15 defined, GEN 4-69 description, PGM 3-13 to 3-21 description file for, PGM 3-18 to 3-20 setting, GEN 5-44 mark command (ex) See also k command (ex) description, GEN 3-90 Mass storage UNIX interfaces, SYS 1-36 MASSBUS description, SYS 5-18 specifying, SYS 5-19 MASTER mode description, SYS 5-135 Mathematics text formatting commands for, GEN 5-14E typesetting, GEN 5-97 to 5-104, maintaining related files, GEN 4-53 5-105 to 5-114 MAXMEM parameter operation, PGM 3-13 to 3-15 suffix list, PGM 3-17 transformation paths summary, PGM 3-17 warnings, PGM 3-20 MAKEDEV script See also 3-51 to 3-77 Margin number MAKEDEV.local file 4.2BSD improvement, SYS 1-20 makefile See also make program defined, GEN 4-69 description, SYS 5-121 MAXUMEM parameter See also MAXMEM parameter description, SYS 5-121 MAXUPRC parameter description, SYS 5-121 maxusers parameter (config) defined, SYS 5-79 mba.c device driver 4.2BSD improvement, SYS
5-14 mbox command (Mail) description, GEN 4-53 abbreviating, GEN 2-22 modifying for uucp, SYS 5-139 description, GEN 2-31 makefile.vax file saving unread mail, GEN 2-22 contents, SYS 5-11 Metacharacters (ed) (Cont.) mbox file delimiting text for s command, mail and, GEN 2-20 system mailbox and, GEN 2-20 4.2BSD improvement, SYS 5-5 mc command (nroff/troff) Metacharacters (ed) combining, GEN 3-40 fsck, SYS 2-7 to 2-25 McKusick, M.K., & others 4.2BSD System Manual, PGM 4-15 to 4-52 entering, GEN 3-33 reference list, GEN 3-33 searching for, GEN 3-39, 3-41 defined, GEN 5-72 McKusick, M.K., & Kowalski, T.J. Berkeley Pascal User Manual, PGM 2-159 to 2-209 fast file system, SYS 1-23 to 1-38 McMahon, L.E. sed stream editor and, GEN 3-105 description, GEN 3-38 to 3-42 Metacharacters (ex) X and, GEN 3-96 Metacharacters (me) reference list, GEN 5-47 Metacharacters (nroff/troff) specifying, GEN 5-79 Metacharacters (troff) automatically translated, GEN to 3-114 me macro package initializing, GEN 5-40 naming convention, GEN 5-39 predefined strings, GEN 5-47 reference manual, GEN 5-39 to 5-86 command list, GEN 5-96 entering, GEN 5-86 metoo option (Mail) defined, GEN 2-35 MFLAGS macro 5-48 Me Reference Manual, GEN 5-39 See also me macro package supplying flags to make, SYS 1-7 mille game 4.2BSD improvement, SYS 1-17 mem.c file 4.2BSD improvement, SYS 5-14 Mini-root file system booting from, SYS 5-25 Memorandum text formatting commands for, GEN 5-14E mesg option (ex) description, GEN 3-99 Message GEN 3-39 editing with, GEN 3-37 to 3-43 mbuf.h file See also Mail defined, GEN 2-38 Message list defined, GEN 2-28, 2-38 Metacharacters (Bourne shell) defined, GEN 4-5 copying, SYS 5-24 Minus sign translating for troff, GEN 5-86 mk command (nroff/troff) See also rt command (nroff/troff); sp command (nroff/troff) defined, GEN 5-60 mkdir command 4.2BSD improvement, SYS 1-7 creating directories, GEN 2-10 mkdir command (C shell) quoting, GEN 4-5 creating a directory, GEN 4-48
quoting a string, GEN 4-5E defined, GEN 4-70 quoting mechanisms, GEN 4-20F reference list, GEN 4-27 Metacharacters (C shell) mkdir system call 4.2BSD improvement, SYS 1-11 mkey program defined, GEN 4-69 defined, GEN 5-146 description, GEN 4-32 description, GEN 5-147 reference list, GEN 4-62 using with command arguments, GEN 4-35 Metacharacters (ed) character classes and, GEN 3-41 deleting, GEN 3-38 mkfs program See newfs program 4.2BSD improvement, SYS 1-20 mman.h file future plans and, SYS 5-5 Modifier (C shell) See also Command substitution Modifier (C shell) (Cont.) ms macro package (Cont.) defined, GEN 4-70 printing files on the terminal, description, GEN 4-57 GEN 5-9E restriction, GEN 4-57n register name reference list, GEN 5-11 more program defined, GEN 4-70 revised version, GEN 5-17 to 5-19 paging mail, GEN 2-20 specifying column format, GEN terminal screen and, GEN 4-37 5-6 Morris, R., & Cherry, L. using with troff and nroff, GEN BC and, GEN 2-43 to 2-55 5-5 to 5-12 DC and, GEN 2-57 to 2-64 ms package Morris, R., & Thompson, K.
description, GEN 2-12 password system, SYS 4-7 to 4-12 formatting a document with nroff, GEN 2-13 mos old version of -ms, GEN 5-17 formatting a document with troff, Mosher, D., & others GEN 2-12 4.2BSD System Manual, PGM MSGBUFS parameter 4-15 to 4-52 description, SYS 5-122 mount command mt unprivileged users and, SYS 4-5 showing state of tape drive, SYS 1-7 mount program 4.2BSD improvement, SYS 1-20 mtab mount.h file 4.2BSD improvement, SYS 1-16 4.2BSD improvement, SYS 5-6 Multiplication Move command (ed) DC and, GEN 2-61 See m command (ed) Multiplicative operator move command (edit) description, GEN 2-52 See m command Multitasking move command (ex) description, GEN 1-29 See m command (ex) MV command move routine renaming a file, GEN 2-7 defined, PGM 4-83 mv program mpx system call 4.2BSD improvement, SYS 1-7 See socket system call and related system calls mv program (ed) renaming a file, GEN 3-47 ms macro package mvcur routine See also -mos defined, PGM 4-88 4.2BSD improvement, SYS 1-18 mvwin routine CAI script for, GEN 6-7 defined, PGM 4-86 command reference list, GEN N 5-11 default settings, GEN 5-9 entering cover sheet, GEN 5-5 n command (ex) entering first page, GEN 5-5 description, GEN 3-90 entering page footer, GEN 5-6 n command (sed) entering page heading, GEN 5-6 defined, GEN 3-108 entering paragraphs, GEN 5-5 N command (vi) entering section heads, GEN 5-6 See also n command (vi) keeping text blocks together, GEN 5-9 order for input commands, GEN 5-12F preparing documents, GEN 5-13 to 5-16 defined, GEN 3-79 n command (vi) See also N command (vi) defined, GEN 3-81 N flag (Mail) See also noheader option N flag (Mail) (Cont.)
defined, GEN 2-36 n flag (Mail) defined, GEN 2-36 n flag (make) defined, PGM 3-17 n flag (mkey) ignoring words, GEN 5-147 n flag (sed) defined, GEN 3-106 n option specifying numeric sort, GEN 4-32 n option (inv) defined, GEN 5-148 n option (nroff/troff) defined, GEN 5-49 n option (uuclean) defined, SYS 5-137 n1 command (me) defined, GEN 5-44 n2 command (me) defined, GEN 5-44 Name label (as) defined, GEN 6-55 NAME operator (C compiler) defined, PGM 2-66 Named expression defined, GEN 2-51 nami routine See also nami.h file nami.h file 4.2BSD improvement, SYS 5-5 NBUF parameter description, SYS 5-121 NCALL parameter description, SYS 5-122 NCARGS parameter description, SYS 5-122 NCLIST parameter description, SYS 5-122 ND command (ms) netstat program displaying network statistics, SYS 1-7, 5-51E displaying routing table contents, SYS 5-51E Network See Dial-up network See uucp system troubleshooting, SYS 5-57 Network data base SYS 5-48 files list, Network library routines description, SYS 3-12 to 3-16 Network name represented by netent structure, SYS 3-13E Network server program included with system, SYS 5-50T started up automatically at boot time, SYS 5-49T network server program reference list, SYS 5-49 Network Systems Hyperchannel Adapter See hy network interface driver Networking implementation, SYS 3-29 to 3-57 networks database 4.2BSD improvement, SYS 1-16 newfs program 4.2BSD improvement, SYS 1-18, 1-20 newgrp command See Group set newwin routine defined, PGM 4-86 next command (ex) See n command (ex) next command (Mail) abbreviating, GEN 2-31 See also EQN program description, GEN 2-31 next statement (awk) defined, PGM 3-9 NF variable (awk) determining number of fields, description, GEN 5-33 formatting mathematics, GEN NFILE parameter cover sheet and, GEN 5-9 ne command (nroff/troff) defined, GEN 5-59 NEQN program 2-13 net library 4.2BSD improvement, SYS 1-15 net program UNIX distribution and, SYS 1-7 See also mkfs program PGM 3-6 description, SYS 5-121
NH command (ms) entering section heads, GEN 5-6E specifying numbered section heads, GEN 5-6 nh command (nroff/troff) defined, GEN 5-69 NIC host data base retrieving, SYS 5-48E NINODE parameter description, SYS 5-121 nl routine defined, PGM 4-87 NLABEL operator (C compiler) defined, PGM 2-64 nm command (nroff/troff) defined, GEN 5-70 NMOUNT parameter description, SYS 5-121 nn command (nroff/troff) defined, GEN 5-70 Nobreak control character changing, GEN 5-67 noclobber variable (C shell) defined, GEN 4-70 protecting files and, GEN 4-41 NOFILE parameter description, SYS 5-121 noglob variable (C shell), GEN 4-56E defined, GEN 4-70 noheader option (Mail) See also -N flag nr command (me) (Cont.) specifying with li, GEN 5-30 nr command (nroff/troff) defined, GEN 5-65 NR variable (awk) determining current record number, PGM 3-5 nroff text processor See also nroff/troff text processor See also troff text processor calling, GEN 5-21E defined, GEN 2-12 device resolution and, GEN 5-56 entering text, GEN 5-22 formatting a document with -ms, GEN 2-13 function, GEN 5-22 invoking, GEN 5-49 stopping printer to change paper, GEN 5-49 writing papers using -me, GEN 5-21 to 5-38 nroff/troff text processor See also -ms macros See also nroff text processor See also troff text processor -ms macros and, GEN 5-5 to 5-12 boxing words, GEN 5-69 See also quiet option breaking a line, GEN 5-60 defined, GEN 2-35 character set, GEN 5-57 nosave option (Mail) See also keepsave option defined, GEN 2-35 notify command (C shell) See also notify variable defined, GEN 4-70 reporting job complete, GEN 4-47 notify variable (C shell) See also notify command (C shell) background jobs and, GEN 4-45 Nowitz, D.A. implementing uucp, SYS 5-131 to 5-144 Nowitz, D.A., & Lesk, M.E.
a dial-up network of UNIX systems, SYS 5-123 to 5-129 np command (me) character translation, GEN 5-66 concealed newlines and, GEN 5-67 control characters beginning lines, GEN 5-60 defined, GEN 5-49 description, GEN 2-12 error messages, GEN 5-73 input, GEN 5-56 justifying text, GEN 5-61 marking horizontal space, GEN 5-68 numbering output lines, GEN 5-70 numerical expressions, GEN 5-57 numerical parameters, GEN 5-56 post processors and, GEN 5-50 defined, GEN 5-40 preprocessors and, GEN 5-50 numbering paragraphs specifying conditional input, GEN automatically, GEN 5-31E NPROC parameter description, SYS 5-121 nr command (me) indenting sections, GEN 5-32E 5-71 specifying indention, GEN 5-62 specifying line length, GEN 5-62 specifying page margins, GEN 5-74E nroff/troff text processor (Cont.) specifying vertical spacing, GEN nx command (nroff/troff) defined, GEN 5-72 5-61 switching environment, GEN 5-71 transparent throughput, GEN O 5-67 transposing characters, GEN 5-67 underlining words, GEN 5-69 user’s manual, GEN 5-49 to 5-81 writing paragraph macros, GEN 5-75E Nroff/Troff User’s Manual update, GEN 5-81 Nroff/Troff User’s Manual, GEN 5-49 to 5-81 See also nroff/troff text processor ns command (nroff/troff) defined, GEN 5-62 NTEXT parameter description, SYS 5-122 nu command (edit) printing text with line numbers, GEN 3-11 nu command (ex) description, GEN 3-91 NULL defined, PGM 1-21 NULL operator (C compiler) defined, PGM 2-66 Null statement (as) defined, GEN 6-55 Number internal representation in DC, GEN 2-59 right justifying with troff, GEN 5-87 number command (DC) description, GEN 2-57 number command (edit) See nu command (edit) number command (ex) See nu command (ex) number option (ex) description, GEN 3-99 Number register (nroff/troff) See also nr command (nroff/troff) See also specific registers command list, GEN 5-52, 5-55 description, GEN 5-65 to 5-66 Number register (troff) description, GEN 5-91 to 5-92 predefined, GEN 5-91 Numeric label (as) defined,
GEN 6-55 o command (DC) changing the output base, GEN 2-62 description, GEN 2-59 o command (ex) See also open option description, GEN 3-91 line editing and, GEN 3-85 o command (nroff/troff) description, GEN 5-68 O command (Rogue) using, GEN 6-23 O command (vi) See also o command (vi) See also slowopen option defined, GEN 3-79 o command (vi) See also O command (vi) defined, GEN 3-81 o option (hunt) defined, GEN 5-148 o option (nroff/troff) defined, GEN 5-49 obase defined, GEN 2-44, 2-51 Octal converting to decimal, GEN 2-44 od 4.2BSD improvement, SYS 1-7 of command (me) defined, GEN 5-41 of filter calling, PGM 4-102E printers and, PGM 4-102 OF macro specifying page footers, GEN 5-19 OFS variable defined, PGM 3-6 oh command (me) defined, GEN 5-41 OH macro specifying page headings, GEN 5-19 oldcsh 4.2BSD and, SYS 1-7 onintr command (C shell) See also Interrupt signal defined, GEN 4-70 open command (ex) over keyword (EQN) See o command (ex) open function specifying fractions, GEN 5-99F, overlay routine See also open function description, PGM 1-10 defined, PGM 4-83 Overstrike command (nroff/troff) open option (ex) description, GEN 3-99 See o command (nroff/troff) Overstriking open system call 4.2BSD improvement, SYS 1-11 creating with troff, GEN 5-88 overwrite routine Operators defined, PGM 4-83 available, GEN 2-43 optim routine (C compiler) P description, PGM 2-66 to 2-67 optim routine (C shell) See also unoptim routine (C shell) optimize option (ex) description, GEN 3-99 Option (C shell) combining, GEN 2-6 Option (ex) See also specific options reference list, GEN 3-97 to 3-101 Option (Mail) See also specific options defined, GEN 2-38 reference list, GEN 2-33 to 2-36, 2-40T setting, GEN 2-32, 2-32E Option (nroff/troff) invoking, GEN 5-50 reference list, GEN 5-49 to 5-50 Option (vi) See also specific options listing values, GEN 3-65 reference list, GEN 3-65 p command (DC) description, GEN 2-58 p command (ed) defined, GEN 3-34 printing a line, GEN 3-28 printing
all lines, GEN 3-28 printing last line, GEN 3-28 printing lines, GEN 3-27 stopping, GEN 3-28 using, GEN 3-27 to 3-28 p command (edit) printing buffer contents, GEN 3-10 u command and, GEN 3-16 p command (ex) description, GEN 3-91 P command (me) defined, GEN 5-46 specifying front matter, GEN 5-33 p command (sed) defined, GEN 3-111 P command (vi) setting, GEN 3-65 See also p command (vi) setting automatically, GEN 3-65 defined, GEN 3-79 options parameter (config) defined, SYS 5-79 ORS variable defined, PGM 3-6 os command (nroff/troff) defined, GEN 5-62 Ossanna, J.F. Nroff/Troff User’s Manual, GEN 5-49 to 5-81 Out of band data description, SYS 3-23 flushing I/O on receipt, SYS 3-23F Output defined, GEN 4-70 Output base DC and, GEN 2-62 p command (vi) See also P command (vi) defined, GEN 3-81 p escape (Mail) description, GEN 2-24 p flag (make) defined, PGM 3-17 p flag (sed) defined, GEN 3-110 p macro (me) defined, GEN 5-41 P number register (nroff/troff) defined, GEN 5-81 p option (hunt) defined, GEN 5-149 p option (inv) defined, GEN 5-148 p option (troff) defined, GEN 5-50 p option (uuclean) defined, SYS 5-137 pa command (me) defined, GEN 5-44 pac program 4.2BSD improvement, SYS 1-18, 1-20 Paging defined, GEN 3-13 versus scrolling, GEN 3-56 Paper formatting, GEN 5-34F Paragraph, GEN 5-40 -me restrictions, GEN 5-40 creating decorative initial capital with troff, GEN 5-86 editing with vi, GEN 3-61 Page command list, GEN 5-51 formatting the last page with a macro, GEN 5-77E printing specific, GEN 5-49 setting margins with nroff/troff, GEN 5-74E specifying blank, GEN 5-44 specifying new, GEN 5-23 Page commands description, GEN 5-59 Page footer entering in text file, GEN 5-6 specifying, GEN 5-70 specifying for multiple columns with a macro, GEN 5-75E specifying with troff, GEN 5-91 varying on alternate pages, GEN 5-19 Page header entering in text file, GEN 5-6 specifying for multiple columns with a macro, GEN 5-75E specifying formats for alternating, GEN 5-71
specifying with troff, GEN 5-90 Page heading specifying, GEN 5-70 varying on alternate pages, GEN 5-19 Page layout specifying, GEN 5-23 Page number setting arabic, GEN 5-44 setting roman, GEN 5-44 specifying, GEN 5-59, 5-91 specifying for appendix, GEN 5-46 specifying for chapter, GEN 5-46 Page offset (nroff/troff) specifying, GEN 5-59 Page trap (nroff/troff) description, GEN 5-64 pagesize program printing system page size, SYS 1-7 entering in text file, GEN 5-5 indenting, GEN 5-7 to 5-8 numbering automatically, GEN 5-31 specifying, GEN 5-22 specifying block format, GEN 5-29 specifying hanging indent format, GEN 5-29 specifying hanging indent format with a macro, GEN 5-75E specifying indention, GEN 5-30 specifying indention amount, GEN 5-39E vi definition, GEN 3-61 writing a macro for, GEN 5-75E paragraph option (ex) description, GEN 3-99 param.c file contents, SYS 5-11, 5-103 param.h file See also kernel.h file 4.2BSD improvement, SYS 5-6, 5-13 Parentheses (BC) primitive expression and, GEN 2-51 Parentheses (EQN) typesetting in proper size, GEN 5-100E Pascal programming language See Berkeley Pascal programming language Passive system defined, SYS 5-123 passwd concurrent updates to password file and, SYS 1-8 Password entering, GEN 3-5 Password entry program predictable passwords and, SYS 4-10 random numbers and, SYS 4-11 Password file Phototypesetting restricting users, GEN 1-31 security and, SYS 4-8 Password system See nroff/troff text processor PHYSPAGES parameter description, SYS 5-121 history, SYS 4-7 to 4-12 Pasting and cutting pi command (nroff) defined, GEN 5-72 See m command (ed) PATH variable (Bourne shell) description, GEN 4-11 to 4-12 path variable (C shell) See also rehash command (C shell) Picture System 2 graphics device See ps driver piles program (EQN) description, GEN 5-100 Pipe defined, GEN 1-26, 2-11, PGM default value, GEN 4-40 defined, GEN 4-40, 4-70 1-14 description, GEN 2-11, PGM 1-14 Pathname to 1-17 See also Absolute pathname
defined, GEN 2-9, 4-71 description, GEN 4-33 Pattern (awk) optimal size, SYS 1-28 programs and, GEN 2-11 pipe system call description, PGM 1-15 to 1-17 description, PGM 3-6 to 3-7 Pattern space defined, GEN 3-106 pc 4.2BSD improvement, SYS 1-8 pc command (nroff/troff) defined, GEN 5-70 Pipeline, GEN 4-4E combining command input/output, GEN 4-32 defined, GEN 2-11, 4-4, 4-71 description, GEN 4-32 to 4-33 elements in, GEN 2-11 files read from terminal and, GEN pc/pi 2-11 4.2BSD improvement, SYS 1-8 pcb.h file defined, GEN 5-59 4.2BSD improvement, SYS 5-14 pcl network interface driver 4.2BSD improvement, SYS 1-16 pd command (me) Plain data block defined, SYS 2-12 pm command (nroff/troff) defined, GEN 5-73 pn command (nroff/troff) defined, GEN 5-43 pdx debugger defined, GEN 5-59 pi and, SYS 1-8 po command (nroff/troff) defined, GEN 5-59 Period See Dot character (ed) perror function setting left margin, GEN 5-86E Point size description, PGM 1-12 perror library changing, GEN 5-38, 5-58 defaults, GEN 5-38 4.2BSD improvement, SYS 1-15 pg flag setting, GEN 5-84 pop directory command collecting information for gprof, SYS 1-5 See popd command (C shell) popd command (C shell) See also pushd command (C shell) pg option creating images for gprof, SYS 1-6 phones database pl command (nroff/troff) See also tip program 4.2BSD improvement, SYS 1-17 Phototypesetter defined, GEN 4-71
without argument, GEN 4-49 Port defined, GEN 4-71 Port number defined, GEN 5-98 algorithm for selecting, SYS 3-26 stopping automatically to reload, overriding selection algorithm, GEN 5-49 SYS 3-26E Portable C Compiler printcap file description, PGM 2-37 to 2-61 Posting file 4.2BSD improvement, SYS 1-17 creating, PGM 4-101 printenv command (C shell) See also setenv command (C defined, GEN 5-145 Pound sign shell) See Sharp character defined, GEN 4-71 pp command (me) printf function See also ip command (me) See also fprintf function output and, PGM 1-4 See also lp command (me) defined, GEN 5-40 printf statement (awk) formatting output, PGM 3-6 printw routine defined, PGM 4-83 proc.h file description, GEN 5-22 meaning of, GEN 2-12 pr command (C shell) defined, GEN 4-71 printing files, GEN 2-7 printing files in three columns, See also ps command (C shell) See also System process See also User process defined, GEN 1-26, 4-71 maximum active, SYS 5-121 maximum per user, SYS 5-121 pre command (edit) recovering files, GEN 3-22 Preface formatting, GEN 5-34F Preliminary text See Front matter setting maximum files for, SYS preserve command (edit) See pre command (edit) 5-121 preserve command (ex) description, GEN 3-91 preserve command (Mail) See also hold command (Mail) abbreviating, GEN 2-22 description, GEN 2-31 keeping mail in your system mailbox, GEN 2-21 primes program 4.2BSD improvement, SYS 1-17 Primitive expression description, GEN 2-51 Print command See p command print command (awk) description, PGM 3-6 print command (edit) See p command (edit) print command (ex) See p command (ex) print command (Mail) See also ignore command (Mail) description, GEN 2-29 ignored fields and, GEN 2-31 Print file UNIX and, PGM 2-83 print working directory command See pwd command (C shell) 4.2BSD improvement, SYS 5-7 Process GEN 2-11 space for, SYS 5-121 stopping, GEN 2-11 synchronizing, GEN 1-27 terminating, GEN 1-27 Process control data structure, PGM 4-6F description, PGM
4-5 to 4-6 Process number defined, GEN 2-11 determining, GEN 2-11 Process stack setting growth increment, SYS 5-121 setting initial size, SYS 5-121 Process time accounting summarizing, SYS 5-56 PROFIL operator (C compiler) defined, PGM 2-65 profil system call 4.2BSD improvement, SYS 1-12 profile file login and, GEN 4-6 shell and, GEN 2-12 Profiled system description, SYS 5-78 PROG operator (C compiler) defined, PGM 2-64 Program See also Command (C shell) Program (Cont.) ps command (troff) (Cont.) defined, GEN 3-3, 4-71 setting point size, GEN 5-84 editing with vi, GEN 3-67 ps driver executing, GEN 1-26 4.2BSD improvement, SYS 1-16 ps.c device driver executing from another, PGM 1-12 4.2BSD improvement, SYS 5-12 maintaining with make, PGM PS1 variable 3-13 to 3-21 defined, GEN 4-12 running simultaneously, GEN PS2 variable 2-11 defined, GEN 4-12 running two with one command Pseudo device line, GEN 2-11 specifying, SYS 5-82 saving output, GEN 2-11 Pseudo terminal setting maximum executing, SYS creating, 5-122 stopping, GEN 2-4, 2-11 remote login sessions and, SYS Programmer’s manual 3-24 See Manual Pseudo-font Programming description, GEN 5-37 reading list, GEN 2-16 restriction, GEN 5-37 tools for, GEN 2-14 to 2-15 psignal library translating a language, GEN 2-15 4.2BSD improvement, SYS 1-15 Prompt pstat program defined, GEN 4-71 4.2BSD improvement, SYS 1-20 Prompt character ptx program defined, GEN 2-4 defined, GEN 2-13 prompt option (ex) pty driver description, GEN 3-99 4.2BSD improvement, SYS 1-16 Protection mode pu command (ex) description, PGM 1-10 description, GEN 3-91 Proteon proNET ring network Publication list controller indexing, GEN 5-143 to 5-155 See vv network interface driver Protocol name represented by protoent structure, SYS 3-13, 3-14E protocol switch table See also protosw.h file protocols database 4.2BSD improvement, SYS 1-17 protosw.h file 4.2BSD improvement, SYS 5-5 ps command (C shell) See also Process 4.2BSD improvement, SYS 1-8
defined, GEN 4-72 determining the process number, GEN 2-11 displaying all programs running, GEN 2-11 displaying unstarted background jobs, GEN 4-48 ps command (troff) SYS 5-48E description, SYS 3-24 updating, GEN 5-155 to 5-162 pup_cksum.c file 4.2BSD improvement, SYS 5-13 putchar function output and, PGM 1-4 push directory command See pushd command (C shell) push directory command (C shell) See pushd command pushd command (C shell) See also cd command (C shell) See also popd command (C shell) defined, GEN 4-70 saving name of previous directory, GEN 4-49 without argument, GEN 4-49 put command (ex) See pu command (ex) putc macro See also fflush function defined, PGM 1-6 defined, GEN 5-58 pwd command (C shell) See also dirs command (C shell) 4.2BSD improvement, SYS 1-8 defined, GEN 4-72 print your directory name, GEN 2-9 working directory pathname and, GEN 4-48E PX macro description, GEN 5-18 quit command (edit) See q command (edit) quit command (ex) See q command (ex) quit command (Mail) abbreviating, GEN 2-22 description, GEN 2-31 saving typed mail, GEN 2-22 Quit signal defined, GEN 4-72 terminating a program, GEN 4-37 quit statement (BC) Q description, GEN 2-55 quot program Q command quitting ed, GEN 2-6 q command (DC) description, GEN 2-58 q command (ed) 4.2BSD improvement, SYS 1-20 Quota exceeding, GEN 3-22 Quota file comparing with allocated disk defined, GEN 3-34 space, SYS 2-4 using, GEN 3-26 description, SYS 2-5 q command (edit) exiting without saving edits, GEN 3-13 using, GEN 3-8 q command (ex) See also wq command (ex) description, GEN 3-91 q command (me) defined, GEN 5-42, 5-44 entering, GEN 5-25 specifying quoted text, GEN 5-38 q command (sed) defined, GEN 3-114 Q command (vi) defined, GEN 3-79 q flag (make) Quota system See Disk quota system quota system call 4.2BSD improvement, SYS 1-12 quota.h file 4.2BSD improvement, SYS 5-5 quota_kern.c file contents, SYS 5-9 quota_subr.c file contents, SYS 5-9 quota_sys.c file contents, SYS 5-9
quota_ufs.c file contents, SYS 5-9 quotacheck program 4.2BSD improvement, SYS 1-20 defined, PGM 3-17 quotaon program q option (nroff/troff) See also quotaoff defined, GEN 5-49 qsort library 4.2BSD improvement, SYS 1-15 Question mark character (C shell) description, GEN 4-34 Question mark character (DC) description, GEN 2-59 pattern matching and, GEN 2-8 Question mark character (ed) context search and, GEN 3-43 quiet option (Mail) See also noheader option defined, GEN 2-35 Quit command (ed) See q command (ed) 4.2BSD improvement, SYS 1-20 Quotation defined, GEN 4-72 setting apart, GEN 5-25 Quotation marks (C shell) using metacharacters in command arguments, GEN 4-35 Quotation marks (me) making compatible for printers and typesetters, GEN 5-38 translating for typesetter, GEN 5-38 Quotation marks (ms) translating for typesetter, GEN 5-19 Quotation marks (nroff) RA81 disk drive specifying font, GEN 5-36 Quotation marks (troff) See uda driver Rand MH system translating, GEN 5-86 Quoted string statement (BC) mail program and, SYS 1-7 random library forming, GEN 2-54 4.2BSD improvement, SYS 1-15 Ratfor language See also EFL programming R language See also M4 macro processor r command (ed) defined, GEN 3-34 using, GEN 3-27 without line address, GEN 3-49 r command (edit) description, GEN 3-22 r command (ex) description, GEN 3-91 r command (me) defined, GEN 5-44 specifying roman font, GEN 5-36 R command (ms) restoring regular font, GEN 5-8 r command (sed), GEN 3-112E defined, GEN 3-112 R command (vi) See also r command (vi) defined, GEN 3-79 r command (vi) See also R command (vi) defined, GEN 3-81 r escape (Mail) description, GEN 2-24 r flag (cp) file system tree and, SYS 1-5 r flag (Mail) defined, GEN 2-36 C and, GEN 2-15 description, PGM 2-111 to 2-122 Raw device description, SYS 5-20 raw routine defined, PGM 4-85 Raw socket See also Datagram socket defined, SYS 3-6 rb command (me) defined, GEN 5-44 RC command (me) defined, GEN 5-46 rc program 4.2BSD improvement,
SYS 1-20 rcexpr routine arguments, PGM 2-68 rcp program cp support and, SYS 1-8 rd command (nroff/troff) defined, GEN 5-72 rdump program See also rmt program 4.2BSD improvement, SYS 1-18, 1-20 re command (me) defined, GEN 5-45 r flag (make) Read command (ed) defined, PGM 3-17 r modifier (C shell) read command (edit) extracting filename root, GEN 4-57E r option (edit) recovering files, GEN 3-23 r option (nroff/troff) defined, GEN 5-49 r option (uucp) defined, SYS 5-132 r option (uux) description, SYS 5-133 RA60 disk drive See uda driver RA80 disk drive See r command (ed) See r command (edit) read command (ex) See r command (ex) read function description, PGM 1-9 Read only mode (ex) description, GEN 3-85 read system call 4.2BSD improvement, SYS 1-12 Read-ahead description, GEN 2-4 readlink system call 4.2BSD improvement, SYS 1-12 See uda driver rehash command (C shell) readv system call 4.2BSD improvement, SYS 1-12 and, GEN 4-40 defined, GEN 2-35 defined, GEN 4-72 recover command (edit) required for current path, GEN description, GEN 3-22 4-51 recover command (ex) description, GEN 3-92 recv system call 4.2BSD improvement, SYS 1-12 Reiser, J.F., & Henry, R.R. Berkeley VAX/UNIX Assembler Reference Manual, PGM 4-53 to 4-65 previewing data, SYS 3-10 Reiser, J.F., & London, T.B. 
transferring data, SYS 3-9E recvfrom system call See also path variable adding commands to directory record option (Mail) 4.2BSD improvement, SYS 1-12 receiving data, SYS 3-10E recvmsg system call See also sendmsg system call 4.2BSD improvement, SYS 1-12 Redirection defined, GEN 4-72 redraw option (ex) description, GEN 3-99 refer program See also Refer system.if ref output, GEN 5-152E placing a reference in a paper, GEN 5-150 Refer system See also addbib utility See also Indexing 4.2BSD improvement, SYS 1-8 regenerating system software, SYS 5-117 to 5-122 setting up UNIX/32V V1.0, SYS 5-107 to 5-115 Relational operator description, GEN 2-53 form, GEN 2-47 Relative pathname See also Absolute pathname defined, GEN 4-72 Reliably delivered message socket (unsupported) defined, SYS 3-6 Remainder DC and, GEN 2-61 remap option (ex) description, GEN 3-99 remote database See also tip program description, GEN 5-133 to 5-142 4.2BSD improvement, SYS 1-17 formatting bibliographic citations, Remote login program, SYS 3-15F GEN 2-13 Reference formatting, GEN 5-151 overriding numbering, GEN 5-155 private file of, GEN 5-155 Reference file defined, GEN 5-151 refresh routine defined, PGM 4-83 Register changing for text formatting, GEN 5-16 used by -ms reference list, GEN 5-11 regtab table defined, PGM 2-68 Remote login server program main loop, SYS 3-18F pseudo terminals and, SYS 3-24 Remote system calling, SYS 5-125 rename system call 4.2BSD improvement, SYS 1-12 description, SYS 1-35 renice program 4.2BSD improvement, SYS 1-20 reorder routine description, PGM 2-76 to 2-77 repeat command (C shell) defined, GEN 4-72 repeating a command, GEN 4-51 Reply command (Mail) description, GEN 3-96 to 3-97 See also reply command (Mail) abbreviating, GEN 2-20 answering mail, GEN 2-19 reference list, GEN 3-96 answering the sender only, GEN Regular expression (ex) defined, GEN 3-96 2-20 Reply command (Mail) (Cont.) 
definition, GEN 2-29 reply command (Mail) See also Reply command (Mail) description, GEN 2-32 report option (ex) description, GEN 3-100 repquota program 4.2BSD improvement, SYS 1-20 Request (nroff) See Command (nroff) Reserved word reference list, GEN 4-27 reset command include file and, SYS 1-8 resource.h file 4.2BSD improvement, SYS 5-5 restart command (lpc) description, PGM 4-103 restor program See restore program restore program See also rrestore 4.2BSD improvement, SYS 1-18 restore server program See also tar program RETRN operator (C compiler) defined, PGM 2-65 RETURN key commands and, GEN 2-4 description, GEN 3-55 moving the cursor in vi, GEN 3-57 return statement (BC) form of, GEN 2-46 forming, GEN 2-55 rew command (ex) description, GEN 3-92 rewind command (ex) See rew command (ex) rexecd server program 4.2BSD improvement, SYS 1-20 rhosts file description, SYS 5-49 Ritchie, D.M. C Programming Language Reference Manual, The, PGM 2-5 to 2-35 I/O system, PGM 4-67 to 4-73 standard I/O library, PGM 1-21 to 1-24 system security, SYS 4-3 to 4-5 tour through C compiler, PGM 2-63 to 2-77 Ritchie, D.M. (Cont.) UNIX Assembler Reference Manual, GEN 6-53 to 6-64 Ritchie, D.M., & Kernighan, B.W. M4 macro processor, PGM 2-393 to 2-398 programming UNIX, PGM 1-3 to 1-24 Ritchie, D.M., & Thompson, K. 
implementation of file system and user command interface, GEN 1-19 to 1-34 rk.c device driver 4.2BSD improvement, SYS 5-12 RK07 disk See va driver rl option (uucico) defined, SYS 5-135 rl.c device driver 4.2BSD improvement, SYS 5-12 RL11 controller See rl.c device driver RLABEL operator (C compiler) defined, PGM 2-65 rlogin server program .login file and, SYS 1-7 cu program and, SYS 1-8 description, SYS 1-8 rlogind server program 4.2BSD improvement, SYS 1-20 rm command (nroff/troff) defined, GEN 5-64 rm command (shell) deleting files, GEN 2-7 recover command (edit) and, GEN 3-22 removing a file, GEN 3-48E rmdir command 4.2BSD improvement, SYS 1-8 rmdir system call 4.2BSD improvement, SYS 1-12 rmt program 4.2BSD improvement, SYS 1-20 rn command (nroff/troff) defined, GEN 5-64 RNAME operator (C compiler) defined, PGM 2-65 ro command (me) defined, GEN 5-44 roffbib program bibliographic databases and, SYS 1-8 rogue game 4.2BSD improvement, SYS 1-17 rsh server program rogue game (Cont.) 
executing remote commands, SYS command reference list, GEN 1-8 6-19 to 6-21 displaying top players, GEN 6-25 rshd server program 4.2BSD improvement, SYS 1-20 fighting, GEN 6-21 objects you can find, GEN 6-21 rsp.h file option reference list, GEN 6-24 4.2BSD improvement, SYS 5-13 rt command (nroff/troff) See also mk command (nroff/troff); sp command (nroff/troff) | playing, GEN 6-17 to 6-25 rooms, GEN 6-21 sample screen, GEN 6-18F scoring, GEN 6-24 screen layout, GEN 6-18 to 6-19 defined, GEN 5-60 screen symbol reference list, GEN RUBOUT character ignoring while sending mail, GEN 6-19 2-34 setting options, GEN 6-23 RUBOUT key ROGUEOPTS variable See DELETE key using, GEN 6-23 Ruling Roman number setting page number, GEN 5-44 specifying, GEN 5-88 specifying for front matter, GEN specifying for figure, GEN 5-45 specifying in text, GEN 5-26 5-33 with tab character, GEN 5-87E Root directory defined Ruling (nroff/troff) outside text margin, GEN 5-72 description, GEN 1-21 Running foot Root file system See Page footer block size, SYS 5-40 Running head dump and, SYS 5-54 See Page header rebuilding, SYS 5-32 Runtime routine (C) handling network addresses and restoring, SYS 5-26 route program 4.2BSD improvement, SYS 1-20 description, SYS 5-51 See also rwhod server program routed server program 4.2BSD improvement, SYS 1-20 output, SYS 3-20E RP command (ms) specifying cover sheet, GEN 5-5 displaying users on clusters, SYS bad block forwarding support, | rr command (nroff/troff) defined, GEN 5-66 rrestore program See also rmt program 4.2BSD improvement, SYS 1-20 RS command (ms) specifying indention level, GEN rs command (nroff/troff) defined, GEN 5-62 RS variable (awk) defined, PGM 3-6 rsh command See also rshd server program Index-54 rwho program See also rwhod server program RPO06 disk 5-17 displaying status for cluster, SYS 1-8 description, SYS 5-51 SYS 1-18 values, SYS 3-15T ruptime program 1-8 rwho server program description, SYS 3-20 to 3-22 simplified form, SYS 3-21F rwhod 
server program 4.2BSD improvement, SYS 1-21 rx driver 4.2BSD improvement, SYS 1-16 rx.c device driver 4.2BSD improvement, SYS 5-12 RX02 floppy disk unit See rx driver rx1 flag (me) setting 12 pitch, GEN 5-39 RX211 floppy disk controller See rx.c device driver rxformat program 4.2BSD improvement, SYS 1-21 s option (nroff/troff) defined, GEN 5-49 s option (uucico) defined, SYS 5-135 s option (uucp) S defined, SYS 5-132 s option (uulog) s command (DC) affecting register content, GEN 2-62 descripton, GEN 2-58 destructive, GEN 2-63 programming DC, GEN 2-62 s command (ed) ampersand character and, GEN 3-34 breaking lines, GEN 3-42 changing all occurrences, GEN defined, SYS 5-137 sail game 4.2BSD improvement, SYS 1-17 save command (Mail) See also write command (Mail) abbreviating, GEN 2-32 system mailbox and, GEN 2-23 SAVE operator (C compiler) defined, PGM 2-65 savehist variable saving history across terminal 3-30 changing every occurrence, GEN 3-38K defined, GEN 3-34 deleting text, GEN 3-30 delimiters, GEN 3-30 sessions, SYS 1-5 savetty routine defined, PGM 4-88 sc command (me) defined, GEN 5-47 Scale description, GEN 3-37 to 3-38 defined, GEN 2-45, 2-51 g command and, GEN 3-46E increasing value, GEN 2-45E g command restriction and, GEN 3-47 limits, GEN 2-45 printing current value, GEN rearranging a line, GEN 3-43 undoing the last substitution, GEN 3-38 using, GEN 3-29 s command (edit) replacing text, GEN 3-11 2-45K rules for, GEN 2-45 Scale factor defined, GEN 2-59 Scale indicator attaching to numbers for troff, uppercase letters and, GEN 3-19 s command (ex) See also & command (ex) description, GEN 3-92 S command (vi) defined, GEN 3-79 s command (vi) defined, GEN 3-81 s escape (Mail) description, GEN 2-25 s flag (In) creating symbolic links, SYS 1-7 s flag (Mail) defined, GEN 2-36 s flag (make) defined, PGM 3-17 s flag (mkey) ignoring labels, GEN 5-147 s macro (me) defined, GEN 5-43 GEN 5-92 Scale register description, GEN 2-60 Scaling BC language and, GEN 2-45 scanf 
function See also fscanf function input and, PGM 1-4 scanw routine defined, PGM 4-85 SCCS introduction, PGM 3-23 to 3-37 Schmidt, E., & Lesk, M.E. Lex program generator, PGM 3-113 to 3-125 Scratch character creating a scratch file, GEN 4-31 Scratch file creating, GEN 4-31 v defined, GEN 4-72 Index-55 scrollok routine Scratch file (Cont.) Fortran and, PGM 2-83 Screen (Screen package) ~ defined, PGM 4-87 sdb symbolic debugger defined, PGM 4-75 See also dbx symbolic debugger updating, PGM 4-92K accessing symbol information, updating, PGM 4-76 to 4-77 SYS 1-5 locating, SYS 1-8 Screen (vi) breaking lines at right margin, GEN 3-67 controlling window size, GEN support, SYS 1-6 search command (edit) See Context search (edit) Search path 3-65 See PATH variable refreshing, GEN 3-64 Section Screen editor invoking from Mail, GEN 2-24 editing with vi, GEN 3-61 screen option (Mail) indenting, GEN 5-32E defined, GEN 2-35 vi definition, GEN 3-62 Section head Screen package description, PGM 4-75 to 4-98 input functions, PGM 4-78 reference list, PGM 4-84 to 4-85 chapter numbers, GEN 5-41 entering in text file, GEN 5-6 indenting, GEN 5-7E miscellaneous functions reference list, PGM 4-85 to 4-88 output functions, PGM 4-78 reference list, PGM 4-80 to 4-84 prerequisites, PGM 4-75 starting, PGM 4-77 terminal information and, PGM 4-T79 coordinating numbers with : Script See also Script file script 4.2BSD improvement, SYS 1-8 Script file, GEN 4-55E See also Login shell See also make command (C shell) break statement and, GEN 4-58 commands useful to writers of, GEN 4-53 comments in, GEN 4-59 creating, GEN 2-10, 3-52E defined, GEN 3-51, 4-53, 4-72 interrupts and, GEN 4-59 invoking, GEN 4-53 numbering automatically, GEN 5-31 to 5-32, 5-40 to 5-41 numbering automatically with a macro, GEN 5-75E specifying beginning number, GEN 5-32K specifying unnumbered, GEN 5-32K text formatting commands for, GEN 5-14E sections option (ex) description, GEN 3-100 Security dial-up network and, SYS 5-125 UNIX 
and, SYS 4-3 to 4-5 uucp system and, SYS 5-138 sed stream editor address types, GEN 3-107 to 3-108 command line format, GEN 3-1056E defined, GEN 2-13, 3-52 making executable, GEN 4-53 description, GEN 3-105 to 3-114 preventing variable substitution ed and, GEN 3-105 by the shell, GEN 4-59 functions, GEN 3-108 to 3-114 shell input and, GEN 4-58 operation, GEN 3-105 to 3-106 taking commands from a file, Script.out file creating, GEN 2-11 scroll routine defined, PGM 4-88 Scrolling versus paging, GEN 3-56 Index-56 GEN 3-52E uses, GEN 3-105 seek function See also lseek description, PGM 1-12 services database select system call 4,2BSD improvement, SYS 1-12 multiplexing I/O requests, SYS 4.2BSD improvement, SYS 1-17 set command (C shell) C shell variables and, GEN 4-40K 3-11E defined, GEN 4-72 Semicolon character (ed) compared with comma, GEN 3-45 setting dot, GEN 3-45 to 3-46 set command (ex) description, GEN 3-92 set command (Mail) send system call 4.2BSD improvement, SYS 1-12 See also unset command (Mail) transferring data, SYS 3-9E forms of, GEN 2-20 options and, GEN 2-32 sendbug program restriction, GEN 2-21 See also bugfiler program submitting 4.2BSD bug reports, Set-GID bit sendmail installation and operation guide, description, SYS 4-4 security and, SYS 4-5 SYS 2-27 to 2-60 Sendmail Installation and Operation Set-UID bit description, SYS 4-4 Guide, SYS 2-27 to 2-60 security and, SYS 4-5 See also sendmail setbuf library routine sendmail option (Mail) See also setbuffer library routine defined, GEN 2-35 setbuffer library routine sendmail program See also setbuf library routine See also mailaddr 4.2BSD improvement, SYS 1-14 See also sendmail option See also syslog server program 4.2BSD improvement, SYS 1-4, setenv command (C shell) See also printenv command (C shell) 1-21 implementing aliases, GEN 2-21 defined, GEN 4-73 setting variables in environment, sendmsg system call See also recvmsg system call 4.2BSD improvement, SYS 1-12 sendto primitive / sending data, SYS 
3-10E GEN 4-51E setgid system call See setregid system call Sethi-Ullman algorithm C compiler and, PGM 2-69 to sendto system call 4.2BSD improvement, SYS 1-12 Sentence Set terminal options command See stty command (C shell) SYS 1-8 editing with vi, GEN 3-61 vi definition, GEN 3-61 Sequenced packet socket (unsupported) defined, SYS 3-6 Server process 2-170 setifaddr program 4.2BSD improvement, SYS 1-21 setlinebuf library routine 4.2BSD improvement, SYS 1-14 setquota system call 4.2BSD improvement, SYS 1-12 SETREG operator (C compiler) See also Client process defined, PGM 2-65 description, SYS 3-17 setregid system call Service name represented by the servent structure, SYS 3-14 Service process See also Service server Service server See also Xerox Courier protocol description, SYS 3-17 4.2BSD improvement, SYS 1-12 setreuid system call 4.2BSD improvement, SYS 1-12 setterm routine defined, PGM 4-88 setuid system call See setreuid system call SFCON operator (C compiler) defined, PGM 2-66 SG command (ms) specifying signature line, GEN 5-9 sh command (ex) description, GEN 3-92 sh command (me) See also uh command (me) defined, GEN 5-40 numbering section heads, GEN 5-31 to 5-32 SH command (ms) specifying unnumbered section head, GEN 5-6 sh program See Bourne shell Shared lock multiple processes and, SYS 1-3 Sharp character printing, GEN 3-39 Shell program (Cont.) reading a file for commands, GEN 2-12 specifying for Mail, GEN 2-20 Shell script See Script file shiftwidth option (ex) description, GEN 3-100 Shoens, K., & Leres, C. 
Mail Reference Manual, GEN 2-17 to 2-41 showmatch option (ex) description, GEN 3-100 showmatch option (vi) lisp and, GEN 3-68 shutdown system call 4.2BSD improvement, SYS 1-12 data pending and, SYS 3-10K entering in text, GEN 2-4 sigblock system call 4.2BSD improvement, SYS 1-12 erasing last character typed, GEN SIGCHLD signal Sharp character (#) 2-4 shell comments and, GEN 4-57 Shell See also C shell See Bourne shell defined, GEN 4-73 description, GEN 1-27 to 1-31 implementing, GEN 1-29 shell command (ex) See sh command (ex) shell command (Mail) See also SHELL option description, GEN 2-32 executing Shell command from Mail, GEN 2-22 shell option (ex) description, GEN 3-100 SHELL option (Mail) defined, GEN 2-33 setting, GEN 2-32 specifying, GEN 2-20 Shell procedure debugging, GEN 4-15 defined, GEN 4-7 description, GEN 4-7 to 4-16 Shell program definition, GEN 2-11 description, GEN 2-11 to 2-12 escaping to from Mail, GEN 2-25 profile file and, GEN 2-12 programming aids, GEN 2-14 as programming language, GEN 2-14 constructing server processes, SYS 3-27 reaping child processes, SYS 3-28E SIGIO signal 4.2BSD improvement, SYS 1-13, 5-7 interrupt-driven I/O and, SYS 3-27 Signal defined, GEN 4-73 description, PGM 1-17 to 1-20 handling methods, GEN 4-22 Signal facilities 4.2BSD improvement, SYS 1-3 signal function description, PGM 1-17 to 1-20 signal.h file 4.2BSD improvement, SYS 5-7 signals and, PGM 1-17 Signature line specifying, GEN 5-9 sigpause system call 4.2BSD improvement, SYS 1-12 SIGPROF signal 4.2BSD improvement, SYS 1-13, 5-17 sigsetmask system call 4.2BSD improvement, SYS 1-12 sigstack system call 4.2BSD improvement, SYS 1-12 sigsys system call See signal facilities SIGTINT signal See SIGIO signal SIGURG signal 4.2BSD improvement, SYS 1-13, Socket (Cont.) 
optimal size, SYS 1-28 process group and, SYS 3-23 types of, SYS 3-6 Socket name 5-7 out of band data and, SYS 3-27 binding to UNIX domain socket, SYS 3-8E sigvec system call 4.2BSD improvement, SYS 1-13 SIGVTALRM signal 4.2BSD improvement, SYS 1-13, Socket system call creating a socket, SYS 3-7E socket system call 5-7 sinclude command (M4) description, PGM 2-396 SINCR parameter description, SYS 5-121 Singlespacing specifying, GEN 5-23 size keyword (EQN) changing point size, GEN 5-100 sk command (me) ~defined, GEN 5-44 Sklower, K.L., & others Franz Lisp Manual, The, PGM 2-211 to 2-358 | Slash description, SYS 3-7 See Backslash Slow terminal editing on, GEN 3-64 4.2BSD improvement, SYS 1-13 failure, SYS 3-7 socket.h file 4.2BSD improvement, SYS 5-5 socketpair system call 4.2BSD improvement, SYS 1-13 socketvar.h file 4.2BSD improvement, SYS 5-5 Soft limit defined, SYS 2-3 Software maintenance using network for, SYS 5-127 SOH See Leader character (nroff/troff) sort program defined, GEN 2-13, 4-73 specifying numeric sort, GEN vi and, GEN 3-74 4-32K slowopen option (ex) sortbib command description, GEN 3-100 SM command (ms) decreasing type size, GEN 5-8 SMAPSIZ parameter ~description, SYS 5-122 SMTP See DARPA Simple Mail Transfer Protocol SNAME operator (C compiler) defined, PGM 2-65 so command (ex) See so command (ex) description, GEN 3-92 so command (nroff/troff) defined, GEN 5-72 interpolating file name, GEN 5-81 SO_DEBUG option network and, SYS 5-57 Socket sorting bibliographic databases and, SYS 1-9 Source Code Control System See SCCS source command description, GEN 2-32 source command (C shell) defined, GEN 4-73 effecting changes to .chshrc immediately, GEN 4-51 Source file locating reference list, SYS 5-117 Source management system defined, PGM 3-23 sp command (me) See also bl command (me) entering, GEN 5-23 sp command (nroff/troff) binding, SYS 3-7 defined, GEN 5-62 creating, SYS 3-7 setting, GEN 5-84 description, SYS 3-6 to 3-11 discarding, SYS 3-10, 3-10E 
Space character edit and, GEN 3-7 naming, SYS 3-6 Index-59 Special character Standard output file See Metacharacters description, PGM 1-6 searching, GEN 3-21 standout routine Spell defined, PGM 4-84 defined, GEN 2-13 Star See Asterisk character detecting spelling errors, GEN start command (Ipc) 2-13 sprintf function description, PGM 4-103 See also fprintf function Startup file description, PGM 1-8 running, GEN 2-12 stat system call sprintf function (awk) defined, PGM 3-8 4.2BSD improvement, SYS 1-13 stat.h file sptab table defined, PGM 2-68 4.2BSD improvement, Statement (as) SQFILE description, SYS 5-142 description, GEN 6-55 to 6-56 sqrt function (awk) Statement (BC) defined, PGM 3-8 See also specific statements sqrt keyword, GEN 2-44E description, GEN 2-54 to 2-55 typing several on one line, GEN defined, GEN 2-51 sqrt operator (EQN) 2-48 creating square roots, GEN 5-100 Status Square root defined, GEN 4-73 creating with EQN, GEN 5-100 status command (mt) DC and, GEN 2-61 showing state of tape drive, SYS Square root (BC), GEN 2-44 1-7 ss command (troff) stderr file pointer defined, GEN 5-58 description, PGM 1-6 error handling and, PGM 1-7 sscanf function description, PGM 1-8 stdin file pointer SSIZE parameter description, PGM 1-6 description, SYS 5-121 stdio library SSPACE operator (C compiler) 4.2BSD improvement, SYS 1-14 defined, PGM 2-64 stdout file pointer Stack command (DC) description, PGM 1-6 stop command (C shell) description, GEN 2-62 Standalone 1/0 library background jobs and, GEN 4-46E 4.2BSD improvement, SYS 5-15 defined, GEN 4-73 Standard error output file stop command (ex) description, PGM 1-6 Berkeley TTY driver and, GEN Standard 1/0 library 3-102 description, GEN 3-93 call formats, PGM 1-21 to 1-24 defined, PGM 1-5 | description, PGM 1-5 to 1-8, 1-21 to 1-24 Standard input See Input typing form letters or text with nroff/troff, GEN 5-72 Standard input file description, PGM 1-6 Standard output See Output stop command (lpc) description, PGM 4-103 
Stopped message suspending jobs and, GEN 4-46 Storage class description, GEN 2-53 store command (DC) See s command (DC) Stream socket See also Datagram socket creating in Internet domain, SYS 3-TE Index-60 SYS 5-7 | Stream socket (Cont.) defined, SYS 3-6 String (C shell) defined, GEN 4-73 String (nroff/troff) defined, GEN 5-62 description, GEN 5-62 to 5-65 substitute command (sed), GEN 3-111E description, GEN 3-110 to 3-111 special characters and, GEN 3-110 Substitution See also Expansion String statement (as) defined, GEN 4-73 defined, GEN 6-56 substr command (M4) description, PGM 2-397 strip 4.2BSD improvement, SYS 1-9 substr function (awk) defined, PGM 3-8 STST file description, SYS 5-143 stterm routine variables set by, PGM 4-89T to Subtraction DC and, GEN 2-60 subwin routine defined, PGM 4-87 4-90T stty command DEC standard values and, SYS Suffix list (make), PGM 3-17 description, PGM 3-21 Summary information 1-9 stty command (C shell) background jobs and, GEN 4-48 defined, GEN 4-73 Style program See also Diction program description, GEN 5-163 to 5-177 contents, SYS 2-8 sup keyword (EQN) specifying superscripts, GEN 5-99 Super user security and, SYS 4-4 Super-block description, SYS 2-8 su 4.2BSD improvement and, SYS specifying, GEN 5-47 1-9 sub keyword (EQN) specifying subscripts, GEN 5-99 subr__mcount.c file contents, SYS 5-9 subr__prf.c file contents, SYS 5-9 subr__rmap.c file contents, Superscript SYS 5-9 subr_xxx.c file contents, SYS 5-9 Subscript specifying, GEN 5-47 Subscript (EQN) specifying, GEN 5-99 Subscript (nroff/troff) specifying, GEN 5-68 Subscript (troff) specifying, GEN 5-87E Subscripted variable defined, GEN 2-46 to 2-47 Substitute command See s command substitute command (edit) See s command (edit) substitute command (ex) Superscript (EQN) specifying, GEN 5-99 Superscript (nroff/troff) specifying, GEN 5-68 Superscript (troff) specifying, GEN 5-87E Suspended job defined, GEN 4-73 description, GEN 4-36 sv command (me) | specifying blank lines, 
GEN 5-44 sv command (nroff/troff) defined, GEN 5-62 Swap space configuration 4.2BSD improvement, SYS 1-4 swapgeneric.c file 4.2BSD improvement, SYS 5-14 swapon system call - 4.2BSD improvement, SYS 1-13 SWIT operator (C compiler) defined, PGM 2-65 switch command (C shell) defined, GEN 4-73 exiting from, GEN 4-58 forms of, GEN 4-58 See s command (ex) Index-61 systm.h file sx command (me) See also kernel.h file defined, GEN 5-41 4.2BSD improvement, SYS 5-7 Symbolic link sz command (me) description, SYS 1-3, 1-34 changing point size, GEN 5-38W Symbolic link data block defined, GEN 5-44 defined, SYS 2-12 SYMDEF operator (C compiler) defined, PGM 2-64 T symlink system call 4.2BSD improvement, SYS 1-13 t command (ed) Symmetric protocol compared with m command, GEN defined, SYS 3-17 3-51 sys directory file prefixes, creating a series of variable lines, GEN 3-51 t command (ex) See copy command (ex) SYS 5-8T SyS__errno printing, PGM 1-12 sys___generic.c file t command (sed) contents, SYS 5-9 defined, GEN 3-114 sys__inode.c file contents, T command (vi) SYS 5-9 defined, GEN 3-79 sys__machdep.c file 4.2BSD improvement, SYS 5-13 defined, GEN 3-81 sys__process.c file contents, t escape (Mail) SYS 5-9 description, GEN 2-25 sys__socket.c file ~ T flag (Mail) contents, SYS 5-9 syscmd command (M4) defined, GEN 2-36 t flag (make) description, PGM 2-396 defined, PGM 3-17 sysline program maintaining terminal status, SYS 1-9 syslog server program 4.2BSD improvement, SYS 1-21 System function description, PGM 1-12 System identifier defined, SYS 5-74 System mailbox file commands for folders and, GEN 2-23 hold option and, GEN 2-32 incoming mail and, GEN 2-17 mbox and, GEN 2-20 storing mail, GEN 2-20, 2-21 System management best reference, SYS System process defined, PGM 4-5 System time - 4.2BSD improvement, SYS 1-4 System-wide file defined, GEN 2-21 Systems Industries 9700 tape drive See ut.c device driver Index-62 t command (vi) T option (hunt) defined, GEN 5-149 t option (hunt) defined, 
GEN 5-149 T option (nroff) defined, GEN 5-50 t option (troff) defined, GEN 5-50 ta command (nroff/troff) defined, GEN 5-66 Tab resetting, GEN 5-45 setting multiple, GEN 5-87 Tab character printing, GEN 3-37 terminals without, GEN 2-4 Tab character (nroff/troff) setting, GEN 5-66 uninterpreted, GEN 5-66 Tab replacement character See t¢c command (troff), GEN 5-87 Tab stop setting, GEN 3-61n vi and, GEN 3-61 Technical memorandum Table breaking across pages, GEN 5-10 continuing, GEN 5-35 entering with -ms, GEN 5-8 floating, GEN 5-45 formatting, GEN 2-13, 5-33 keeping on one page, GEN 5-42 text formatting commands for, GEN 5-16E Table of contents entering, GEN 5-28 text formatting commands for, GEN 5-13E Tektronix 4025 terminal command character for, GEN 3-76 Tektronix 4027 terminal command character for, GEN 3-76 telnet program ARPA Telnet protocol and, SYS 1-9 telnetd server program formatting, GEN 5-34F Jlogin file and, SYS 1-7 producing, GEN 5-18, 5-18K 4.2BSD improvement, SYS 1-21 specifying multiple, GEN 5-29 specifying section titles for, GEN term option (ex) description, GEN 3-101 Terminal 5-41 specifying without leadering, GEN See also Hardcopy terminal See also Pseudo terminal 5-29 See also Screen (Screen package) Tables formatting, GEN 5-115 to 5-131 tabstop option (ex) description, GEN 3-100 See also Screen package See also Slow terminal See also Uppercase terminal configuring, SYS 5-42 Tag defined, GEN 5-145 tag command (ex) description, GEN 3-93 programs changing mode of, GEN 4-48 replacing with a file, GEN 2-10 specifying output type with nroff, Tag file defined, GEN 5-145 taglength option (ex) description, GEN 3-100 tags option (ex) GEN 5-50 specifying standard output with troff, GEN 5-50 specifying type, GEN 3-54E 3.5 changes, GEN 3-103 strange behavior, GEN 2-4 description, GEN 3-100 supported reference list, GEN 2-3 tail 4.2BSD improvement, SYS 1-9 type codes, GEN 3-53T talk program description, switch settings, GEN 2-3 SYS 1-9 tar program 4.2BSD 
improvement, SYS 1-9, 1-17 tbl program description, GEN 5-33, 5-115 to 5-131 formatting tables, GEN 2-13 tc command (nroff/troff) defined, GEN 5-66 tc command (troff) replacing tab character, GEN 5-87 TCP program See trpt program teachgammon program 4.2BSD improvement, SYS 1-17 without tabs, GEN 2-4 Terminal screen defined, PGM 4-75 Termination defined, GEN 4-73 terse option (ex) description, GEN 3-101 test command Bourne shell and, GEN 4-12 Text editor See ed editor defined, GEN 3-3, 3-25 See also Edit editor, GEN 3-3 Text Formatting See also nroff/troff text processor Text input mode (ex) defined, GEN 3-85 Index-63 Text segment (as) description, GEN 6-54 timezone parameter (config) defined, SYS 5-79 tip program text statement defined, GEN 6-59 tftpd server program 4.2BSD improvement, SYS 1-21 TH command (me) continuing a table, GEN 5-35E th command (me) defined, GEN 5-45 formatting a thesis, GEN 5-33 then command (C shell) See also else command (C shell) See also if/fendif commands (C cu program as front end, SYS 1-5 description, SYS 1-4, 1-9 Title page formatting informal, GEN 5-46 specifying, GEN 5-32, 5-45 TL command (ms) AE command and, GEN 5-6 tl command (nroff/troff) defined, GEN 5-70 tl command (troff) printing page numbers, GEN 5-91E shell) defined, GEN 4-73 Thesis tm command (nroff/troff) defined, GEN 5-73 formatting, GEN 5-18, 5-33, 5-45 - text formatting commands for, GEN 5-13E TMTM file description, SYS 5-142 TM macro description, GEN 5-18 Thompson, K. UNIX implementation, PGM 4-5 tm.c device driver 4.2BSD improvement, SYS 5-12 to 4-14 Thompson, K., & Morris, R. password system, SYS 4-7 to 4-12 Thompson, K., & Ritchie, D.M. 
implementation of file system and to keyword (EQN), GEN 5-100E Token defined, GEN 2-50 top command (Mail) user command interface, GEN See also toplines option 1-19 to 1-34 abbreviating, GEN 2-32 ti command (me) entering, GEN 5-24 ti command (nroff/troff) defined, GEN 5-62 ems and, GEN 5-86 Tilde character (C shell) accessing files from other directories, GEN 4-34 description, GEN 2-32 toplines option (Mail) defined, GEN 2-35 setting, GEN 2-32E topg command (Ipc) description, PGM 4-103 touchwin routine defined, PGM 4-87 Tilde character (me) Toy, M.C., & Arnold, K.C.R.C. See Metacharacters guide to the dungeons of doom, Tilde escape (Mail) defined, GEN 2-24 description, GEN 2-24 to 2-26 GEN 6-17 to 6-25 tp command (me) defined, GEN 5-45 lines beginning with, GEN 2-26 specifying a title page, GEN 5-32 printing summary of, GEN 2-26 specifying title page, GEN 5-33E reference list, GEN 2-40T time command (C shell) defined, GEN 4-74 timing a command, GEN 4-52E time.h file timeout option (ex) description, GEN 3-102 TIMEZONE parameter Index-64 defined, GEN 2-13, 5-67 using, GEN 2-13E transfer command See t command (ed) 4.2BSD improvement, SYS 5-7 description, tr command (nroff/troff) SYS 5-122 translit command (M4) description, PGM 2-397 Transparent throughput (nroff/troff) specifying, GEN 5-67 TS command (me) Trap description, GEN 1-31 trap command (Bourne shell) fault handling, GEN 4-21 to 4-23 trap.c file 4.2BSD improvement, SYS 5-14 trek game 4.2BSD improvement, SYS 1-17 troff text processor See also EQN program See also ms macro package See also nroff text processor See also nroff/troff text processor See also tbl program defined, GEN 2-12, 5-83 defining macros, GEN 5-89 to 5-90 defining strings, GEN 5-88, 5-89 device resolution and, GEN 5-56 drawing horizontal and vertical lines of characters, GEN 5-88 entering arithmetic expressions, GEN 5-92 ‘continuing tables, GEN 5-35 defined, GEN 5-45 formatting tables, GEN 5-35 ts driver 4.2BSD improvement, SYS 1-16 ts.c device 
driver 4.2BSD improvement, SYS 5-13 tset command (C shell) defined, GEN 4-74 using, GEN 4-30E tstp routine defined, PGM 4-88 tty See also ttydev.h file handling, SYS 5-6 tty character See also ttychars.h file handling, SYS 5-5 tty command (C shell) defined, GEN 4-74 tty.c file 4.2BSD improvement, SYS 5-9 entering commands, GEN 5-83 tty.h file environments, GEN 5-94 formatting a document with -ms, tty__bk.c file GEN 2-12 indenting lines, GEN 5-86 invoking, GEN 5-49 moving characters up and down, GEN 5-87 moving text backwards on a line, GEN 5-87 4.2BSD improvement, SYS 5-7 obsolete, SYS 5-9 tty__conf.c file contents, SYS 5-9 tty__pty.c file 4.2BSD improvement, SYS 5-9 tty__subr.c file contents, SYS 5-9 setting point sizes, GEN 5-84 tty__tb.c file setting tabs, GEN 5-86 setting vertical spacing, GEN 5-84 tty__tty.c file specifying cut mark, GEN 5-74K specifying fonts, GEN 5-85 ttychars.h file specifying fonts on the typesetter, GEN 5-86 specifying metacharacters, GEN 5-86 specifying page heading, GEN 5-90 specifying unpaddable characters, GEN 5-88 stopping phototypesetter to reload, GEN 5-49 tutorial, GEN 5-83 to 5-96 trpt program 4.2BSD improvement, SYS 1-21 truncate system call contents, SYS 5-9 contents, SYS 5-9 4.2BSD improvement, SYS 5-5 ttydev.h file 4.2BSD improvement, SYS 5-6 tu driver 4.2BSD improvement, SYS 1-16 tu.c file 4.2BSD improvement, SYS 5-14 TUS58 cartridge tape cassette See uu driver See uu.c device driver TUS8O tape drive See ts driver tunefs program 4.2BSD improvement, SYS 1-21 4.2BSD improvement, SYS 1-13 Index-65 uba__ctrl structure Tuthill, B. 
-ms revised version, GEN 5-17 to using refer, GEN 5-133 to 5-142 Twinkle program description, PGM 4-92E motion optimization and, PGM description, SYS 5-94 uba__driver structure description, SYS 5-90 ud__addr routine description, SYS 5-93 4-97E Two-column output ud__attach routine description, SYS 5-92 See Column type command (Mail) See print command (Mail) abbreviating, GEN 2-18 description, GEN 2-32 reading mail and, GEN 2-18 to ud__dgo routine description, SYS 5-93 ud__dinfo routine description, SYS 5-93 ud_dname routine description, SYS 5-93 2-19 Type-number (refer) reference list, GEN 5-152 Typesetting Mathematics - User’s Guide, GEN 5-105 to 5-114 ud_minfo routine description, SYS 5-93 ud_mname routine description, SYS 5-93 ud__probe routine Typing correcting mistakes, GEN 2-4 Typo description, SYS 5-93 uba__device structure 5-19 | defined, GEN 2-13 detecting spelling errors, GEN 2-13 U u command (ed) using, GEN 3-38 u command (edit) description, SYS 5-91 ud__slave routine description, SYS 5-91 ud__xclu routine description, SYS 5-93 uda driver 4.2BSD improvement, SYS 1-16 uda.c device driver 4.2BSD improvement, SYS 5-13 uf command (nroff/troff) defined, GEN 5-67 ufs__alloc.c file See also At sign contents, See also CTRL-H ufs__bio.c file description, GEN 3-16 recovering files, GEN 3-23 u command (ex) description, GEN 3-93 u command (me) defined, GEN 5-44 u command (troff) specifying superscripts and subscripts, GEN 5-87 U command (vi) defined, GEN 3-79 u command (vi) defined, GEN 3-81 u flag (Mail) defined, GEN 2-36 u option (uulog) defined, SYS 5-137 uba.c device driver 4.2BSD improvement, SYS 5-13 Index-66 SYS 5-9 contents, SYS 5-10 ufs__bmap.c file contents, SYS 5-10 ufs__dsort.c file | contents, SYS 5-10 ufs__fio.c file contents, SYS 5-10 ufs__inode.c file contents, SYS 5-10 ufs__machdep.c file 4.2BSD improvement, SYS 5-13 ufs_mount.c file contents, SYS 5-10 ufs_nami.c file contents, SYS 5-10 ufs__subr.c file contents, SYS 5-10 ufs__syscalls.c file 
contents, SYS 5-10 ufs__tables.c file contents, SYS 5-10 ufs__xxx.c file contents, SYS 5-10 uh command (me) defined, GEN 5-41 specifying unnumbered section heads, GEN 5-32E ui__addr routine description, SYS 5-95 ui__alive routine description, SYS 5-95 ui__ctir routine description, SYS 5-94 ui__dk routine description, SYS 5-95 ui__driver routine description, SYS 5-94 ui__flags routine description, SYS 5-95 ui__hd routine description, SYS 5-95 ui__intr routine description, SYS 5-95 ui__mi routine description, SYS 5-95 ui__physaddr routine description, SYS 5-95 ui__slave routine description, SYS 5-94 ui___type routine description, SYS 5-95 ui__ubanum routine description, SYS 5-94 ui__unit routine description, SYS 5-94 description, GEN 1-22, SYS 4-4 4.2BSD improvement, SYS 1-9 ul command (me) See also u command (me) entering, GEN 5-25 troff and, GEN 5-36 UL command (ms) underlining a word, GEN 5-8 ul command (nroff/troff) defined, GEN 5-67 ul command (troff) specifying italic lines, GEN 5-86 ULTRIX-32 See also UNIX ULTRIX-32 Operating System getting started, GEN 2-1 to 2-64 um__cmd routine description, SYS 5-94 um__ctrl routine description, SYS 5-94 um__driver routine description, SYS 5-94 um__hd routine description, SYS 5-94 um__intr routine description, SYS 5-94 um__tab routine description, SYS 5-94 um__ubinfo routine description, SYS 5-94 Umlat See Metacharacters un network interface driver 4.2BSD improvement, SYS 1-16 4.2BSD improvement, SYS 5-6 una command (ex) uio.h file 4.2BSD improvement, SYS 5-6 uipc__domain.c file contents, SYS 5-10 uipc__mbuf.c file SYS 5-10 uipc__pipe.c file contents, SYS 5-10 uipc__proto.c file contents, SYS 5-10 uipc__socket.c file contents, SYS 5-10 uipc__socket2.c file contents, contents, SYS 5-10 ul command un.h file UID contents, uipc__usrreq.c file SYS 5-10 uipc__syscalls.c file contents, SYS 5-10 See also abcommand (ex) description, GEN 3-93 unabbreviate command (ex) See una command (ex) unalias command (C shell) See also alias 
command (C shell) defined, GEN 4-74 Unary operator defined, GEN 2-52 Unary operator (C compiler) description, PGM 2-66 unctrl routine defined, PGM 4-87 undelete command (Mail) See also delete command (Mail) Index-67 undelete command (Mail) (Cont.) UNIX Operating System (Cont.) abbreviating, GEN 2-33 introduction, GEN 1-19 to 1-20 description, GEN 2-33 managing Underlining See SYS See also Italic nroff and, GEN 5-66 other operating systems and, PGM 4-13 on the typesetter, GEN 5-8 programming, PGM 1-3 to 1-24 specifying, GEN 5-8, 5-25 reading list, GEN 2-15 technique for, GEN 3-42 Undo command See u command undo command (edit) See u command (edit) undo command (ex) See u command (ex) Ungermann-Bass network interface unit software environment, GEN 1-20 UNIX Programmer’s Manual accessing on line, GEN 2-5 UNIX/32V Operating System hardware requirements, GEN 1-4 highlights, GEN 1-3 to 1-18 recreating, SYS 5-119 regenerating system software, SYS 5-117 to 5-122 See un network interface driver ungetc function description, PGM 1-8 setting up V1.0, SYS 5-107 to 5-115 tuning, SYS 5-121 to 5-122 UNIX/32V Programmer’s Manual UNIBUS device naming, SYS 5-20 UNIBUS device driver support routines, SYS 5-95 univec.c file online, GEN 1-11 unlink function description, PGM 1-11 unlink system call installing device driver and, SYS See mkdir command unmap command (ex) 5-119 UNIX Assembler Reference Manual, See also map command (ex) GEN 6-53 to 6-64 description, GEN 3-93 See also as assembler unoptim routine (C shell) UNIX Operating System See also 4.2BSD See also ULTRIX-32 See also VAX UNIX system bootstrapping and 4.2BSD, SYS (nroff/troff) defined, GEN 5-60, 5-88 specifying for spaces, GEN 5-88 unpcb.h file 5-78 building with config, SYS 5-73 to 4.2BSD improvement, SYS 5-6 unset command (C shell) 5-105 changes in 4.2BSD, SYS 1-3 to defined, GEN 4-74 unset command (Mail) 1-21 computer-aided instruction for, GEN 6-3 to 6-16 crashing, SYS 4-3 defined, GEN 3-3 design considerations, GEN 1-31 
device naming, SYS 5-19 distinguishing block and raw SYS 5-20 for beginners, GEN 2-3 to 2-16 getting started, GEN 6-15 to 6-16 hardware environment, GEN 1-20 implementation, PGM 4-5 to 4-14 Index-68 description, PGM 2-67 to 2-68 Unpaddable space character specifying for digits, GEN 5-88 5-15 building process, SYS 5-76 to devices, See also optim routine (C shell) See also set command (Mail) description, GEN 2-33 until statement (C shell) See also while statement (C shell) description, GEN 4-13 up driver 4.2BSD improvement, SYS 1-16 up.c device driver 4.2BSD improvement, SYS 5-13 Uppercase terminal vi and User ID uucp system (Cont.) See UID | administration, SYS 5-142 to User Identification Number 5-144 See UID defined, SYS 5-131 User identification number directory list, SYS 5-45 See UID file list, User process defined, PGM 4-5 installing, SYS 5-138 to 5-142 user.h file login entry and, SYS 5-144 security and, SYS 5-138 4.2BSD improvement, SYS 5-7 USERFILE setting up, SYS 5-45 to 5-46 defined, SYS 5-140 uucp.h file USR directory block size, SYS 5-45 to 5-46 implementing, SYS 5-131 to 5-144 modifying for uucp, SYS 5-138 SYS 5-40 uulog program description, GEN 2-9 defined, SYS 5-131 rebuilding, SYS 5-32 description, SYS 5-137 setting up, SYS 5-28 uusnap program ut.c device driver description, SYS 1-9 4.2BSD improvement, SYS 5-12 uux command command line format, SYS 5-133 utime system call See utimes system call defined, SYS 5-125 utimes system call description, SYS 5-133 to 5-134 4.2BSD improvement, SYS 1-13 providing remote output, SYS utmp file 5-127 See also wtmp file uux program 4.2BSD improvement, SYS 1-17 uu driver defined, SYS 5-131 uuxqt program 4.2BSD improvement, SYS 1-16 uu.c device driver defined, SYS 5-131 description, SYS 5-137 4.2BSD improvement, SYS 5-12 uucico program \Y defined, SYS 5-131 description, SYS 5-124, 5-134 to v command (DC) 5-137 functions, SYS 5-125 descripton, GEN 2-58 starting, SYS 5-125, 5-134 starting with shell file, SYS 5-143 
uuclean program defined, SYS 5-131 description, SYS 5-137 uucp command command line format, SYS 5-131 defined, SYS 5-125 description, SYS 5-131 to 5-133 transferring files between machines, SYS 5-132E UUCP network ARPANET and, GEN 2-26 uucp program defined, SYS 5-131 uucp system 4.2BSD improvement, SYS 1-4, 1-9, 5-45 v command (ed) defined, GEN 3-34 specifying line numbers, GEN 3-47 specifying lines without text patterns, GEN 3-46 to 3-47 using, GEN 3-33 v command (troff) creating decorative initial capital, GEN 5-87E moving characters up and down, GEN 5-8T7 specifying vertical motion, GEN 5-68 v escape (Mail) description, GEN 2-24 v flag (Mail) See also verbose option defined, GEN 2-36 Index-69 VAX/VMS Operating System v option (inv) autoconfiguration, SYS 5-89 to defined, GEN 5-148 5-95 va driver 4.2BSD improvement, SYS 1-16 data structure sizing rules, SYS 5-103 to 5-1056 va.c file 4.2BSD improvement, SYS 5-13 Valued option (Mail) VAX/VMS system sources SYS 5-4 directory list, ve command (ex) See also Option (Mail) description, GEN 3-94 defined, GEN 2-20 Variable (BC) declaring automatic, GEN 2-46 number permitted, GEN 2-45 verbose option (Mail) See also —v flag defined, GEN 2-35 verbose variable (C shell) Variable (Bourne shell) description, GEN 4-10 to 4-12 defined, GEN 4-74 Version reference list, GEN 4-11 suppressing for Mail, GEN 2-35 Variable (C shell) accessing components, GEN 4-54 checking for assigned value, GEN version command (ex) See ve command ex) Vertical bar (EQN) 4-53 typesetting in proper size, GEN defined, GEN 4-74 removing definition from shell, 5-100E Vertical spacing GEN 4-52 removing from environment, GEN 4-52 Variable (Screen package) setting with troff, GEN 5-84 Vesterman, W., & Cherry, L.L. 
style and diction programs, GEN 5-163 to 5-177 reference list, PGM 4-77 vfontinfo program Variable expansion font information and, SYS 1-9 See Expansion vfork system call See Variable future plans, SYS 1-13 Variable substitution vgrind description, GEN 4-53 4.2BSD improvement, SYS 1-9 VAX UNIX system vgrindefs file accounting, SYS 5-56 4.2BSD improvement, SYS 1-17 booting, SYS 5-52 booting for single user, SYS 5-52 changing from single user to multiuser status, SYS 5-52 changing to multiuser from single user status, SYS 5-52 checking file system, SYS 5-53 vi command (ex) 3.5 changes, GEN 3-102 description, GEN 3-94 screen editing and, GEN 3-85 vi screen editor file maintenance list, SYS 5-57 4.2BSD improvement, SYS 1-9 monitoring system performance, changing words, GEN 3-60 SYS 5-54 operating procedures, SYS 5-52 regenerating, SYS 5-55 resource control, SYS 5-56 tracking changes, SYS 5-56 VAX-11/750 configuration file, SYS 5-85 VAX-11/750 console cassette interface Index-70 3-61 character functions, GEN 3-75T characters for making corrections in input mode, GEN 3-72T commands for file manipulation, GEN 3-71T description, GEN 3-53 to 3-82 VAX-11/780 configuration file, character editing, GEN 3-59 character editing, low level, GEN deleting lines, GEN 3-60 deleting words, GEN 3-59 See tu driver SYS 5-84 | See also open option vi screen editor (Cont.) 
determining state of file, GEN 3-57 editing programs, GEN 3-67 ending a session, GEN 3-55 ex 3.5 changes and, GEN 3-103 to 3-104 ex and, GEN 3-73 executing shell command from, GEN 3-63 ignoring case, GEN 3-72 inserting text, GEN 3-58 invoking, GEN 3-54E line editing, GEN 3-60 manipulating files, GEN 3-70 marking return points, GEN 3-64 moving blocks of text, GEN 3-62 moving in the file, GEN 3-56 to 3-58 moving on the screen, GEN 3-57 moving to previous position, GEN 3-57 moving within a line, GEN 3-57 option list, GEN 3-65 presenting lines, GEN 3-69 recovering lost files, GEN 3-66 recovering lost lines, GEN 3-66 reversing your changes, GEN 3-60 saving changes automatically, GEN 3-63 searching for strings in text, GEN 3-56, 3-71 sentences and, GEN 3-61 view command (ex) description, GEN 3-102 view command (vi) reading a file, GEN 3-58 vipw program 4.2BSD improvement, SYS 1-21 vipw script vm__machdep.c file 4.2BSD improvement, SYS 5-13 vm__mem.c file contents, SYS 5-11 vm__mon.c file contents, SYS 5-11 vm__page.c file vm__proc.c file contents, SYS 5-11 vm__pt.c file contents, SYS 5-11 vm__sched.c file contents, SYS 5-11 vm__subr.c file contents, SYS 5-11 vm__sw.c file contents, SYS 5-11 vm_swap.c file contents, SYS 5-11 vm__swp.c file contents, SYS 5-11 vm__text.c file contents, SYS 5-11 vmmac.h file 4.2BSD improvement, SYS 5-7 vmparam.h file 4.2BSD improvement, SYS 5-7, 5-13 vmstat program 4.2BSD improvement, SYS 1-9 monitoring system activity, SYS 5-54 vmsystm.h file 4.2BSD improvement, SYS 5-7 vpr program shell scripts and, SYS 1-10 vread system call obsolete, SYS 1-13 vs command (nroff/troff) See vipw program defined, GEN 5-61 visual command (ex) setting, GEN 5-84 See vi command (ex) visual command (Mail) See also edit command (Mail) description, GEN 2-33 VISUAL option (Mail) defined, GEN 2-33 setting, GEN 2-33 specifying an editor, GEN 2-24 vlimit system call See getrlimit system call 4.2BSD improvement, SYS 5-11 vswapon system call See swapon system call 
vtimes system call See getrusage system call vv network interface driver 4.2BSD improvement, SYS 1-16 vwidth program troff width tables and, SYS 1-10 vwrite system call obsolete, SYS 1-13 vlp program printing lisp programs, SYS 1-9 Index-71 W Weinberger, P.J., & Feldman, S.I. Fortran 77 compiler, PGM 2-89 to 2-109 w command (ed) defined, GEN 3-34 Weinberger, P.J., & others awk programming language, PGM 3-5 to 3-12 e command and, GEN 3-27 entering text into a file, GEN 2-6 saving lines for input, GEN 3-50 using, GEN 3-26 w command (edit) description, GEN 3-22 u command and, GEN 3-16 using, GEN 3-8 w command (ex) See also wq command (ex) wh command (nroff/troff) defined, GEN 5-65 whereis 4.2BSD improvement, SYS 1-10 which 4.2BSD improvement, SYS 1-10 while statement (awk) defined, PGM 3-9 while statement (BC), GEN 2-47 description, GEN 3-94 forming, GEN 2-54 w command (nroff/troff) writing, GEN 2-47 description, GEN 5-68 w command (sed) defined, GEN 3-111 W command (vi) defined, GEN 3-80 w command (vi) defined, GEN 3-81 w escape (Mail) description, GEN 2-24 w flag (mkey) specifying a file, GEN 5-147 w flag (sed) defined, GEN 3-110 w option (troff) defined, GEN 5-50 wait function description, PGM 1-14 wait system call See also wait.h file 4.2BSD improvement, SYS 1-14 wait.h file 4.2BSD improvement, SYS 5-6 wait3 system call See also wait.h file 4.2BSD improvement, SYS 1-14 warn option (ex) description, GEN 3-101 Wasley, D.L. 
introduction to 77 I/O library, PGM 2-79 to 2-88 wc command (C shell) 4.2 BSD improvements, SYS 1-10 defined, GEN 2-13, 4-74 printing a list of files and, GEN 2-11 WDATA operator (C compiler) defined, PGM 2-64 Index-72 while statement (C shell) See also until statement (C shell) defined, GEN 4-74 description, GEN 4-12 to 4-13 exiting, GEN 4-58 form of, GEN 4-12E forms of, GEN 4-58 who command 4.2BSD improvement, SYS 1-10 printing list of people logged on, GEN 2-11E using, GEN 2-4 Width command (nroff/troff) See w command (nroff/troff) winch routine defined, PGM 4-86 Window defined, PGM 4-75 description, PGM 4-76 moving, GEN 2-33 window option (ex) description, GEN 3-101 window option (Mail) headers command and, GEN 2-30 WINDOW structure defined, PGM 4-91E description, PGM 4-76 Word (C shell) defined, GEN 4-74 Word (nroff/troff) defined, GEN 5-60 Word abbreviation See also Macro (vi) description, GEN 3-69 Word list specifying for hyphenation, GEN 5-69 Work file X option (uucico) defined, SYS 5-132 Working directory defined, SYS 5-135 x option (uuclean) changing, GEN 4-48 changing background job to defined, SYS 5-138 x option (uucp) foreground job and, GEN 4-50 changing with programs, GEN defined, SYS 5-132 X option (uux) 4-50 defined, GEN 4-74 description, SYS 5-133 Xerox Courier protocol description, GEN 4-48 to 4-50 wq command (ex) description, See also xit command (ex) controller description, GEN 3-94 wrapmargin option (ex) See en network interface driver Xerox NS Sequenced Packet 3.5 changes, GEN 3-102 protocol description, GEN 3-101 sequenced packet socket and, SYS wrapscan option (ex) description, GEN 3-101 3-6 Xerox Routing Information Protocol write command (C shell) defined, GEN 4-74 See routed program xit command (ex) write command (ed) See also wq command (ex) See w command (ed) write command (edit) description, GEN 3-94 x] command (me) See w command (edit) write command (ex) defined, GEN 5-45 xp command (me) See w command (ex) write command 
(Mail) defined, GEN 5-43 XP macro See also save command (Mail) description, GEN 2-33 description, GEN 5-18 XS macro write function description, PGM 1-9 SYS 3-17 Xerox experimental Ethernet description, GEN 5-18 xtr script file write system call running, SYS 5-26E 4.2BSD improvement, SYS 1-14 writeany option (ex) Y description, GEN 3-101 writev system call Y command (vi) 4.2BSD improvement, SYS 1-14 defined, GEN 3-80 wtmp file See also utmp file using, GEN 3-62 y operator 4.2BSD improvement, SYS 1-17 See also Y command (vi) moving blocks of text, GEN 3-62 X ya command (ex) x command (Mail) Yacc description, GEN 3-95 exiting Mail, GEN 2-22 See also Lex program generator x command (me) defined, GEN 5-43 description, PGM 3-79 to 3-111 yank command (ex) entering, GEN 5-29 See ya command (ex) X command (sed) defined, GEN 3-113 X command (vi) Z defined, GEN 3-80 x command (vi) defined, GEN 3-81 z command (DC) description, GEN 2-59 Index-73 z command (nroff/troff) (Cont.) z command (edit) description, GEN 5-68 printing a screen of text, GEN z command (vi) 3-12, 3-13E defined, GEN 3-81 z command (ex) positioning screen text, GEN 3-64 description, GEN 3-95 z option (nroff/troff) z command (Mail) defined, GEN 5-81 description, GEN 2-33 Zero z command (me) as legal line number, GEN 3-46 defined, GEN 5-42 entering, GEN 5-26 specifying fill mode, GEN 5-26 z command (nroff/troff) creating overstruck characters, GEN 5-88 Index-74 ZZ command (vi) defined, GEN 3-80 description, GEN 3-55