Chapter B17: Text processing in Linux
 
RPM packages covered in this chapter:
  • ispell (ispell)
  • netscape-communicator (netscape)
  • sgml-tools (sgml2html, sgml2info, sgml2latex, sgml2lyx, sgml2rtf, sgml2txt)
  • kdegraphics (kpaint)
  • words (/usr/dict/linux.words)
  • ghostscript (ps2pdf, gs)
  • man  (man)
    • man-pages
  • groff (groff, eqn)
    • groff-gxditview
  • teTeX (tex)
    • tetex-afm
    • tetex-doc
    • tetex-dvilj
    • tetex-dvips
    • tetex-latex
    • tetex-xdvi
  • texinfo  (makeinfo)
  • transfig (pic2tpic)
  • flex (flex)
  • gnuplot (gnuplot)
  • xfig (xfig)
  • xpdf (xpdf)
  • fmlinux.tar.gz (Adobe FrameMaker Beta)
 

Troff and its macros

Text processing in UNIX starts with the manual pages for the operating system, the "man" pages, written in the "troff" language. A troff macro package, "man", provides the macros for creating manual pages.

This is the manual page for the "arch" command:

.\" arch.1 --
.\" Copyright 1993 Rickard E. Faith (faith@cs.unc.edu)
.\" Public domain: may be freely distributed.
.TH ARCH 1 "4 July 1997" "Linux 2.0" "Linux Programmer's Manual"
.SH NAME
arch \- print machine architecture
.SH SYNOPSIS
.B arch
.SH DESCRIPTION
.B arch
is equivalent to
.B uname -m

On current Linux systems,
.B arch
prints things such as "i386" or "i486".
.SH SEE ALSO
.BR uname (1) ", " uname (2)

(To see this code, we used Midnight Commander, "mc" - "cd /usr/man/man1", "arch.1.gz", "F4")

The page looks as follows :

In this case we used the program "xman", but we could also have used "man", "mc" (press F3 on the manual page), "tkman" or "hman".

We note that this page uses macros from the "man" macro package, the special words for writing manual pages, such as ".TH" (Title Heading), ".SH" (Section Heading) and ".B" (Bold). A new paragraph is started with ".P".

A standard manual page has the following model:

.TH COMMAND (section number)
.SH NAME
command \- brief description of function.
.SH SYNOPSIS
.B command
options
.SH DESCRIPTION
Detailed explanation of programs and options.
Paragraphs are introduced by .P
.P
This is a new paragraph
.SH FILES
Files used by the command, e.g. passwd(1) mentions /etc/passwd
.SH "SEE ALSO"
References to related documents, including other manual pages
.SH DIAGNOSTICS
Description of any unusual output (e.g. see cmp(1))
.SH BUGS
Surprising features (not always bugs; see below)

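In Linux, troff is replaced by the GNU version, "groff". The companion program "nroff", which formats the same input for terminals and line printers rather than for typesetters, is also included; in the GNU package it is a small script that calls groff.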

"groff" macros are included in the directory "/usr/lib/groff/tmac":

[root@heaven tmac]# pwd
/usr/lib/groff/tmac
[root@heaven tmac]# ls tmac.*
tmac.X         tmac.doc.old   tmac.gs        tmac.pic       tmac.pspic
tmac.Xps       tmac.dvi       tmac.latin1    tmac.ps        tmac.s
tmac.an        tmac.e         tmac.lj4       tmac.psatk     tmac.safer
tmac.andoc     tmac.gm        tmac.m         tmac.psnew     tmac.tty
tmac.doc       tmac.gmse      tmac.mse       tmac.psold     tmac.tty-char
[root@heaven tmac]#

Among the files in this directory, "tmac.an" holds the macros used for manual pages and "tmac.s" holds the "ms" macros for general documents.

UNIX text processing also includes the preprocessors "pic" (pictures), "tbl" (tables) and "eqn" (equations). These programs were developed as extensions to the "troff" program and can be chained in a single command:
 

pic file | tbl | eqn | troff | spooler
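
With groff the same chain can be requested through options instead of an explicit pipe: "-p", "-t" and "-e" run pic, tbl and eqn before the formatter. The sketch below assumes a hypothetical input file "report.ms" written with the "ms" macros; the file names and the table contents are invented for the example.

```shell
# A tiny document containing a tbl table (between .TS and .TE).
cat > report.ms <<'EOF'
.LP
The troff preprocessors:
.TS
tab(;);
l l.
pic;pictures
tbl;tables
eqn;equations
.TE
EOF

# -p runs pic, -t runs tbl, -e runs eqn; -ms loads the "ms" macros
# and -Tps selects PostScript output.
if command -v groff >/dev/null 2>&1; then
    groff -p -t -e -ms -Tps report.ms > report.ps
fi
```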

These packages continue to be included in UNIX (even after 30 years) and now in Linux. However, today in Linux with StarOffice or Corel WordPerfect, it's possible to create a table or write a mathematical formula simply and quickly.

TeX and its evolution

In the late 1970s Dr. Donald Knuth started the development of TeX for writing technical and scientific documents, but also for writing manuals, letters, books or simple memos. Some years later, Leslie Lamport developed the LaTeX macro package, which lets you create TeX documents in a simpler and more convenient way.

Other TeX macro packages are AMS-TeX, which lets you write mathematical formulas that conform to the standards of the American Mathematical Society (AMS), and PiCTeX, for drawing geometrical figures and graphs with special TeX macros. The book "Maximum RPM" included in this course was written in LaTeX.

In RedHat Linux, "tex" is included in the package "tetex-0.9-17", installed in the directory "/usr/bin/".

In this section we will present an introduction to these programs. The sources of the examples that we will introduce here in our course are available in the directory "FTContribs/Files/".

TeX works with primitives, like "troff" (or groff), but in a simpler way. TeX itself has about 300 primitives, and the "plain" TeX format adds about 600 macros built on them, for different purposes: international accents, accurate spacing, and mathematical constructs such as matrices, series, integrals and other operators.

A classical TeX example is the file "story.tex" :

[root@heaven TeX]# more story.tex
\hrule
\vskip 1in
\centerline{\bf A SHORT STORY}
\vskip 6pt
\centerline{\sl    by A. U. Thor} % !`?`?! (modified)
\vskip .5cm
Once upon a time, in a distant
  galaxy called \"O\"o\c c,
there lived a computer
named R.~J. Drofnats.

Mr.~Drofnats---or ``R. J.,'' as
he preferred to be called---% error has been fixed!
was happiest when he was at work
typesetting beautiful documents.
\vskip 1in
\hrule
\vfill\eject
\bye
[root@heaven TeX]#

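This file uses a few special TeX control sequences: "\hrule" (horizontal rule), "\vskip" (vertical space), "\centerline" (centered line), "\bf" and "\sl" (bold and slanted fonts), "\vfill\eject" (fill and end the page) and "\bye" (end of the job);

the rest is simple text!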

To "compile" this file we have to run the command:

[root@heaven TeX]# tex story
This is TeX, Version 3.14159 (C version 6.1)
(story.tex
Babel <v3.6h and hyphenation patterns for american, german, loaded.
[1] )
Output written on story.dvi (1 page, 668 bytes).
Transcript written on story.log.
[root@heaven TeX]#

Don't worry if some messages about font processing appear!

This "compilation" builds a DVI file that can be printed, displayed or distributed. DVI means Device Independent; that is, it doesn't depend on the device and can thus be printed equally well on any device: laser printer, postscript printer, standard printers.

To display this file you can run the command :

[root@heaven TeX]# xdvi story
 
 



To print this file you simply run :

[root@heaven TeX]# dvips story
This is dvipsk 5.58f Copyright 1986, 1994 Radical Eye Software
' TeX output 1998.11.20:1728' - |lpr
<tex.pro. [1]
[root@heaven TeX]#

that sends the output (in postscript) directly to the printer.
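
If a file is wanted instead of a printout, "dvips" can write the PostScript to disk with "-o", and "ps2pdf" (from the ghostscript package) can turn it into PDF. The helper below is only a sketch; the function name "dvi2pdf" is invented for this example.

```shell
# Convert NAME.dvi to NAME.ps and then to NAME.pdf.
# Accepts the name with or without the ".dvi" extension.
dvi2pdf() {
    name=${1%.dvi}
    [ -f "$name.dvi" ] || { echo "no such file: $name.dvi" >&2; return 1; }
    dvips -o "$name.ps" "$name.dvi" && ps2pdf "$name.ps" "$name.pdf"
}
```

After running "tex story", the call "dvi2pdf story" would leave "story.ps" and "story.pdf" next to "story.dvi".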

A plain TeX file must end with the characters "\bye". Even a simple ASCII file included in RedHat Linux can be processed as a TeX file without primitives; when TeX reaches the end of the file it shows the "*" prompt, where we type "\bye":

[root@redhead /root]# cp /usr/doc/tree-1.2/README .
[root@redhead /root]# tex README
This is TeX, Version 3.14159 (Web2C 7.3)
(README)
*\bye
[1]
Output written on README.dvi (1 page, 1152 bytes).
Transcript written on README.log.
[root@redhead /root]#
[root@heaven TeX]# xdvi !$

The only inconvenience is that TeX gives a special meaning to some characters: for example, "%" starts a comment and "&" is the alignment tab. This problem can be resolved by adding the character "\" before characters like "&" that cause error messages.
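
For a long ASCII file the backslashes can be added automatically. The following sketch uses "sed"; the function name "escape_tex" is invented for this example, and it handles only the five characters that plain TeX prints literally when preceded by "\".

```shell
# Put a backslash in front of the characters & % $ # _ so that
# plain TeX prints them literally. Other special characters
# (\ { } ~ ^) need different treatment and are left untouched here.
escape_tex() {
    sed 's/[&%$#_]/\\&/g'
}

# prints: Profit \& loss: 50\% of \$10
printf '%s\n' 'Profit & loss: 50% of $10' | escape_tex
```

A whole file could be prepared with "escape_tex < README > README.tex" before running "tex README".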

Writing mathematical formulas with the sum or the integral is a simple task.

The fragment:

$$
\sum_{n=1}^m
$$

lets you write the sum symbol. The symbol for the integral is "\int".
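
As a small worked example, a complete plain TeX file that uses both symbols could look like this. The two identities are standard results chosen for illustration and are not taken from the course files.

```tex
% Sum of the first m integers, and a simple definite integral.
$$
\sum_{n=1}^{m} n = {m(m+1) \over 2}
$$
$$
\int_0^1 x \, dx = {1 \over 2}
$$
\bye
```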

The basic set of macros loaded by the "tex" command is called "plain" TeX; that is, minimal TeX.

With TeX it's possible, for example, to write formulas like the following :
 
 

or to set text in the shape of a triangle, square or circle using simple TeX commands for alignment and line skipping, such as "\leftskip".

This is a memorable example about what we propose in Appendix T :

A document written with the set of TeX macros called "LaTeX" starts by declaring the document style, for example "article" when we write a scientific publication:

[root@heaven LaTeX]# more 199506-81-001.tex

\documentstyle[12pt]{article}
\addtolength{\evensidemargin}{-0.125\textwidth}
\addtolength{\oddsidemargin}{-0.125\textwidth}
\addtolength{\textwidth}{0.21\textwidth}
\begin{document}
\title{Stability of Matter in Magnetic Fields}
\author{Elliott~H.~Lieb$^{1,2}$, Michael Loss$^3$ and Jan Philip Solovej$^2$\\
\footnotesize \it $^1$Department of Physics, Jadwin Hall, Princeton University,
P.~O.~Box 708, Princeton, New Jersey 08544\\ \footnotesize \it
$^2$Department of Mathematics, Fine Hall, Princeton University,
Princeton, New Jersey 08544\\ \footnotesize \it
$^3$School of Mathematics, Georgia Institute of Technology, Atlanta,
Georgia 30332}
\date{April 11 (revised June 11)}
\maketitle

\begin{abstract}
In the presence of arbitrarily large magnetic fields, matter
composed of electrons and nuclei was known to be unstable if $\alpha$
or $Z$ is too large. ...

To "compile" this "LaTeX" file we have to run the command below (both TeX and LaTeX files have the same extension, ".tex"):

[root@heaven LaTeX]# latex 199506-81-001

The generated file is a DVI file and therefore can be visualized or printed with the normal methods.

The directory "/usr/share/texmf/tex/latex/amslatex/" contains the style files (".sty"), the definition files (".def") and other important LaTeX files.

LaTeX also implements "hyphenation" according to the document's language, so that a word is split only at a valid syllable division when it has to be broken at the end of a line. To have this function for Italian, Spanish or French you have to load the appropriate hyphenation file. Hyphenation patterns for English and German are included by default.

To load this file you only have to include the TeX command "\input".

The sets of macros for AMS-TeX and PiCTeX are included in the same way:

[root@heaven AMS-TeX]# more 1.tex
% Example oversetbrace with AMS-TeX
%
\input amstex
%\documentstyle{amsppt}
\document
$$\oversetbrace \text{$k$ time} \to {x+\dots+x}$$
\enddocument
\bye
[root@heaven AMS-TeX]#

The "\input" command is also very useful for including chapters in a book, so that we can modify each chapter separately before it is included in the final work.

PiCTeX lets you build graphical figures like the following with simple TeX commands :

To learn TeX, LaTeX, AMS-TeX or PiCTeX in detail, please see the bibliography.
 
 

"lyx" or WYSIWYG for [La]TeX

A few years ago a group of people organized by Matthias Ettrich (also the coordinator of KDE) started to develop a program for writing and displaying formulas for TeX and LaTeX.

In this section we will show some examples.

The "klyx" program is a useful tool for users that don't know the TeX primitives and want to have good results in a short time.

To use "klyx" just run it from the command line:

[root@heaven /root]# klyx

Once in "klyx" you first have to create a new file for your work.

To insert a formula with the sum, choose "Math" and then "Sum" in the pop-up menus. Then, by pressing the keys "_", "^" and "Enter", it's possible to obtain output as in the figure:
 
 

With "lyx" we can write entire mathematical books that use [La]TeX.

Plot of mathematical figures

[Open]Linux also includes some tools for plotting mathematical functions and data over chosen intervals. One of these programs is "gnuplot":

[root@heaven TeX]# gnuplot

        G N U P L O T
        Linux version 3.5 (pre 3.6)
        patchlevel beta 242
        last modified Thu Dec 7 22:45:00 GMT 1995

        Copyright(C) 1986 - 1995
        Thomas Williams, Colin Kelley and many others

        Send comments and requests for help to info-gnuplot@dartmouth.edu
        Send bugs, suggestions and mods to bug-gnuplot@dartmouth.edu

Terminal type set to 'x11'
gnuplot> ?
 `gnuplot` is a command-driven interactive function and data plotting program.

 For help on any topic, type `help` followed by the name of the topic.  If the
 precise name of the topic is not known, type `help` and a menu will be given.
 Typing a question mark `?` after any `help` prompt will cause the menu to be
 listed again.

 The new `gnuplot` user should begin by reading the `introduction` topic (see
 `help introduction`) and the `plot` topic (see `help plot`).  Additional help
 can be obtained from the USENET newsgroup comp.graphics.apps.gnuplot.
 

Help topics available:
      autoscale      binary         bugs           call
      cd             clear          co-ordinates   comments
      copyright      environment    exit           expressions
      fit            help           if             introduction
      line-editing   load           pause          plot
      print          pwd            quit           replot
      reread         reset          save           seeking-assistance
      set            shell          show           splot
      startup        substitution   syntax         test
      update         userdefined
Press return for more:

At the gnuplot prompt, run the command:

gnuplot> plot sin(x)

and you will see the function "sin(x)"!
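
gnuplot can also be driven from a script file instead of the interactive prompt, which is convenient when the plot should go to a file. The following is a minimal sketch; the file names "sine.gp" and "sine.ps" and the plotting interval are chosen only for this example.

```shell
# Write a gnuplot script that plots sin(x) from -pi to pi
# into a PostScript file instead of an X11 window.
cat > sine.gp <<'EOF'
set terminal postscript
set output "sine.ps"
plot [-pi:pi] sin(x)
EOF

# Run the script, if gnuplot is installed.
if command -v gnuplot >/dev/null 2>&1; then
    gnuplot sine.gp
fi
```

The resulting "sine.ps" can then be displayed with "gs" or converted with "ps2pdf".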




As we can see, TeX and its different sets of macros and programs like gnuplot are very useful in schools and universities.

Another program that has been available for a long time is "xfig", an interactive drawing tool.

