CTAN Comprehensive TeX Archive Network

What are TeX and its friends?

TeX, together with associated programs such as LaTeX, is a system for computer typesetting of documents. It is well known for its skill with mathematical and scientific text and with other difficult typesetting jobs: long or complicated documents and multilingual documents.

TeX systems produce output, on paper or on the computer screen, of the highest typographic quality. This quality is crucial for complex texts, where the reader's ability to understand the material depends on the clarity with which it is presented. TeX is Free software. It is available on almost every computer that people are using today. For the many advantages of TeX, see below.

Due to these advantages, TeX systems are now the standard communication tool in the sciences. For instance, TeX has been adopted by the American Mathematical Society and many other professional societies as their preferred format. It is also widely used in other academic areas, in the humanities, and the social sciences.


Don­ald E. Knuth

The TeX project was started in 1978 by Donald E. Knuth, while revising the second volume of his Art of Computer Programming. When he got the galleys back, he saw that the publisher had switched to a new digital typesetting system and was shocked at the poor quality.

He rea­soned that be­cause dig­i­tal type­set­ting meant ar­rang­ing 1's and 0's (ink and no ink) in the proper pat­tern, as a com­puter sci­en­tist he should be able to make a bet­ter job of it. He orig­i­nally es­ti­mated that this would take six months but ul­ti­mately it took nearly ten years. He had to han­dle not only the chal­lenges of rou­tine type­set­ting such as right-jus­ti­fi­ca­tion and page for­mat­ting flex­i­ble enough to al­low for dif­fer­ent out­put styles, but also the ad­di­tional de­mands of aca­demic pub­lish­ing – foot­notes, float­ing fig­ures and ta­bles, etc. And, be­yond that, he had to tell the com­puter how to type­set for­mu­las and other tech­ni­cal ma­te­rial.

A year after he had begun, Knuth was invited to give one of the principal lectures at the AMS's annual meeting. He spoke on his work on TeX, presenting not only the typographical aspects but also the mathematical ideas behind the programs. TeX's popularity took off from there.

An important boost to that popularity came in 1985 with the introduction of LaTeX by Leslie Lamport. This is a set of commands that allows authors to interact with the system at a higher level than Knuth's original command set (called Plain TeX). Another boost came in 1990, when Hans Hagen introduced ConTeXt to the world; ConTeXt started with aims similar to those of LaTeX (though with a different focus), and is also very widely used. Both LaTeX and ConTeXt have large and active user communities.

Today, TeX systems continue to be a well-known standard. They are used every day for research preprints, drafts of textbooks, and conference proceedings. Active development of the software also continues. Members of the community contribute a steady stream of new and updated enhancement packages; there have been great improvements in TeX's font handling and in its ability with multilingual texts; there is now a version of TeX that outputs directly to PDF format, as well as an extension that seamlessly uses current font formats; and much more.

Why TeX?

This section describes some of the major advantages of TeX systems.

Com­pared to word pro­ces­sors

Most peo­ple have used a word pro­ces­sor, so a com­par­i­son may be help­ful.

With a word processor your text is placed on the page while you type it, an approach referred to as “what you see is what you get.” In contrast, TeX is a formatter: it separates the steps of entering the material and placing it on the page.

To see the difference, consider how a typical user of each system might start a new section. In a word processor, a typical user might start that section by hitting <Enter> twice to get two lines of vertical space, typing “Section 1.2: New results”, clicking to highlight that text, clicking to select a larger type size, clicking to select a new type style, and finally entering two more lines of vertical space. A typical LaTeX user will type into a file the line “\section{New results}”. That is, a word processing user formats the text as they enter it, while the LaTeX user describes the meaning of the text and lets the system format it later.
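As a minimal sketch of the formatter approach, the file below marks up two sections by meaning only. The document class and the body sentences are illustrative assumptions and do not come from the text above.

```latex
% A minimal LaTeX file: the author names the structure,
% and LaTeX decides spacing, type size, and numbering.
\documentclass{article}
\begin{document}

\section{Introduction}
Some opening text.

\section{New results}
% No blank lines, font changes, or section numbers are typed by hand here.
The heading above is numbered and styled by the document class.

\end{document}
```

Because the headings carry no formatting of their own, switching to a different document class restyles every section heading consistently.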

Beginners like word processing but when they graduate to complex jobs the appeal fades. Word processing a twenty page technical article is hard; for instance, keeping the vertical space between sections uniform is error-prone, and so is making sure that all of the bibliographic entries follow the required format. In particular, very few people have both the knowledge and the eye to lay out equations correctly; people often say their equations “just don't look right.” That is, as a user becomes more experienced and knowledgeable, the approach of having the typesetting done by the program becomes the better choice. (Some word processors offer, as advanced features, TeX-like facilities for organizing input text, although few users take advantage of them.)

We'll give you ten good rea­sons ...

These are the reasons most often cited for using TeX, grouped into four areas: Output Quality, Superior Engineering, Freedom, and Popularity.

Out­put Qual­ity
You write doc­u­ments to be read. Your first con­cern should be: how good is the out­put?
1. TeX has the best output.

What you'll end up with is of the high­est qual­ity that a non-pro­fes­sional can pro­duce.

This especially holds for complex documents such as ones with mathematics; see this sample from Rogers's Recursive Functions. It also holds for documents that are complex in other ways: with many tables, or many cross references or hyperlinks, or just with many pages.

Even on simple documents TeX does a better job than a word processor. Compare these samples of plain text from Herrigel's Zen in the Art of Archery done in the word processor Word, and in TeX. These are short and the typographic differences are subtle, but even a non-expert will sense that the TeX page looks better. For instance, the word processor's page has some lines with wide gaps between words and some lines with too many words stuffed in; contrast the second paragraph's second line with its third. TeX's output is better.

2. TeX knows about typesetting.

As those plain text samples show, TeX has more sophisticated typographical algorithms, such as those for making paragraphs and for hyphenating.

TeX's expertise comes into its own in setting technical material. TeX lets the software handle this task as much as is possible. For instance, it automatically classifies each mathematical symbol as a variable, or a relation, etc., and sets them with appropriate amounts of surrounding space. It also sizes superscripts and subscripts, radicals, brackets, and many other things. The result is that, because your document follows the conventions of professional typesetting, your readers will know exactly what you mean. You almost never have to fret about the formulas. They just come out looking right.
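The automatic handling described above can be seen in a short sketch. The two formulas are illustrative choices and are not taken from the original page; the point is that the source contains no manual spacing or sizing commands.

```latex
\documentclass{article}
\begin{document}
% Variables, the relation "=", and the binary operators "\pm" and "-"
% each receive their conventional spacing without manual adjustment.
\[
  x = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a}
\]
% \left( and \right) grow to fit their contents; subscripts and
% superscripts are set in a smaller size automatically.
\[
  \left( \sum_{i=1}^{n} a_i^{2} \right)^{1/2} \le \sum_{i=1}^{n} |a_i|
\]
\end{document}
```

The radical in the first formula stretches over its argument, and the parentheses in the second grow to enclose the summation, with no size given by the author.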

The quality of output is the best reason to use TeX.

Su­pe­rior Engi­neer­ing
Everyone has been frustrated with software that is slow, fat, buggy, or that undergoes frequent incompatible version changes. TeX will not give you those troubles; from a Computer Science standpoint, TeX is very impressive.
3. TeX is fast.

On today's machines TeX is very fast. It is easy on memory and disk space, too.

4. TeX is stable.

It is in wide use, with a long his­tory. It has been tested by mil­lions of users, on de­mand­ing in­put. It will never eat your doc­u­ment. Never.

But there is more here than just that the program is reliable. TeX's designer has frozen the central engine, the actual tex program. Documents that run today will still run in ten years, or fifty. So “stable” means more than that it actually works. It means that it will continue to work, forever.

5. TeX is stable, but not rigid.

A system locked into 1978's technology would have gaps today. That's why TeX is extensible, so that innovations can be added on.

An example is the LaTeX macro package, which is the most popular way to use TeX today. It extends TeX by adding convenience features such as automatic cross references, sectioning, indexing, a table of contents, automatic numbering of chapters, sections, theorems, etc., in a variety of styles, and a straightforward but powerful way to make tables.

LaTeX also contains many features that encourage authors to structure documents by meaning rather than by appearance. For instance, a LaTeX author might indicate an acronym as “the \ac{Tree Based Hashing} method.” This approach has two advantages. First, since \ac is run by a computer and not hand-entered by a person, we can rely on type style, size, etc., being the same throughout the document; in this case the first time the acronym is used in the paper it will appear as “Tree Based Hashing (TBH)” while in later uses “TBH” will be all that is shown. Second, once information on the meaning is in the computer then we can do more with it, perhaps by producing an index of acronyms.
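One way to get this behavior is sketched below, assuming the acronym package. With that package the argument of \ac is the short key rather than the spelled-out phrase used in the passage above, so treat this as one possible realization and not necessarily the setup the passage has in mind.

```latex
\documentclass{article}
\usepackage{acronym}  % assumed package; provides \acrodef and \ac
\begin{document}
% Define the acronym once: short form first, then the long form.
\acrodef{TBH}{Tree Based Hashing}

% First use: the long form followed by the short form in parentheses.
We study the \ac{TBH} method.

% Later uses: the short form only.
Later, \ac{TBH} is compared with other methods.
\end{document}
```

Because the expansion is defined in one place, changing the long form or its styling changes every occurrence in the document.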

And LaTeX itself can be extended. There are thousands of “style files” that do everything from adapting the basics to the needs of the American Mathematical Society, to making cross-references into hyper-references, all the way to allowing you to add epigraphs, the short quotations that sometimes decorate the start or end of a chapter.
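The three kinds of extension mentioned above might be loaded as in the following sketch. The packages named (amsmath, hyperref, epigraph) are common choices assumed here for illustration, and the chapter text and label name are placeholders.

```latex
\documentclass{book}
\usepackage{amsmath}   % American Mathematical Society mathematics extensions
\usepackage{epigraph}  % provides \epigraph{text}{source}
\usepackage{hyperref}  % turns cross-references into hyperlinks; usually loaded last
\begin{document}
\chapter{Results}\label{ch:results}
\epigraph{A short quotation.}{Its author}

% \ref becomes a clickable link in the PDF because hyperref is loaded.
Chapter~\ref{ch:results} opens with an epigraph.

% align* comes from amsmath and lines up the equals signs.
\begin{align*}
  a &= b + c \\
    &= d
\end{align*}
\end{document}
```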

6. The in­put is plain text.

TeX's source files are portable to any computing platform. They are compact; for instance, all of the files for a 450 page textbook and 125 page answer supplement fit easily on one floppy disk. And, they integrate with other tools such as search programs.

Use of this type of input file stems from TeX's roots in the world of science and engineering where there is a tradition of close cooperation among colleagues. A binary input format, especially a proprietary one, is bad for cooperation: you probably have had to go through the trouble of upgrading a word processor version because coworkers upgraded and you could no longer read their files. With TeX systems that rarely happens: the last time that a LaTeX release lost even a small amount of compatibility was in 1995.

Another advantage of plain text is that the text may be automatically generated, for instance if it is drawn out of a database for a report. Getting a word processor into that work flow is a challenge. But TeX fits right in.
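A small sketch of that work flow follows. The file rows.tex stands in for output written by a hypothetical database-export script; it is created here with filecontents* only so that the example is self-contained, and the item names and counts are made-up placeholders.

```latex
% rows.tex stands in for a file written by a hypothetical export script.
\begin{filecontents*}{rows.tex}
Widget & 12 \\
Gadget & 7 \\
\end{filecontents*}
\documentclass{article}
\begin{document}
\begin{tabular}{lr}
  Item & Count \\
  \hline
  % The generated rows are pulled in as ordinary plain text.
  \input{rows.tex}
\end{tabular}
\end{document}
```

In a real report the export script would rewrite rows.tex before each run, and the main file would stay unchanged.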

There are even ways to run TeX directly from XML input, which many people think is the standard input format of the future. So, with the TeX engine in the middle the input may be adjusted to meet your needs, and changing times.

7. The out­put can be any­thing.

As with inputting, TeX's outputting step is separate from its typesetting. The engine's results can be converted to a printer language such as PostScript or to PDF or HTML, or, probably, to whatever will appear in the future. And, the typesetting (line breaks, etc.) will be the same no matter where your output appears. (Did you know that word processing output depends on the printer's fonts, so that if you email your work to someone with a different printer then for them the line and page breaks are likely to come out differently?)

Many people find that TeX's input language fits with how they think about their material. For instance, a scientist might describe a formula to a colleague over a telephone using TeX constructs.

Freedom
Most computer users have heard about Free and Open Source software and know that, as with the GNU programs, Linux, Apache, Perl, etc., this style of development can yield software that is first class. TeX systems fall into this category.
8. TeX is free.

The source of the main tex en­gine is open; the Free Soft­ware Foun­da­tion uses it for their doc­u­ments. All of the other main com­po­nents are open, also.

9. TeX runs anywhere.

Whatever platform meets your needs (Windows, Macintosh, a variety of Unix, or almost anything else), you can get TeX, either freely distributed or in a commercial version.

So although the core of TeX was written some time ago it fits well with today's trends.

Popularity
Using the same system as many other people has advantages. For instance, you can get answers to your questions. And, because of this large user base, your system is sure to be around for years.
10. TeX is the standard.

Most scientists, especially academic scientists, know TeX. Research preprints, drafts of textbooks, and conference proceedings, all are regularly produced with TeX. As a result, many publishers of technical material are set up to work with it.

Because it is the standard, TeX's support by other technical software is the best. For example, there are editing modes to make input convenient, such as AUCTeX for Emacs. Another example is that all major computer algebra systems, such as SAGE, Maxima, etc., will give output in TeX. And no doubt technical software developed in the future will support TeX.

In addition, TeX is used by many people outside of the sciences, for all of the reasons given in this document. For instance, there is a way to produce beautiful critical edition texts.

Hav­ing to use a bad sys­tem sim­ply be­cause it is pop­u­lar would be sad. But nonethe­less, the ex­is­tence of such a base is it­self one ar­gu­ment in fa­vor of a soft­ware pack­age.


In Sum­mary …

TeX is a typesetting system that produces publication-quality output, even for difficult material such as mathematics. It is freely available. Its design makes it shine in areas where the systems familiar to most beginning computer users, word processors, fall short. In brief, it was designed well.
