Chapter 4
Hypertext Markup Language (HTML)
HTML files contain the codes for formatting web pages (and email). HTML
is essentially ASCII text with formatting tags set off from the rest of the
text by angle brackets. Today, HTML files can contain code such as
JavaScript, which hackers can use to link to other websites and embed viruses,
spyware, adware, and other bad things, but plain HTML (without code) is
safe. Unfortunately, Word doesn't create a plain HTML file--it adds a lot of
stuff you probably won't want or need (see Figure 4-3).
Figure 4-3: Word puts all this junk at the top of every HTML file it creates, guaranteeing that
you won't know what the heck is going on inside. You can get rid of most of it and use the
standard HTML codes for defining a header and title.
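To see what "plain" means in practice, here is a minimal sketch in Python. It holds a bare-bones page that uses only the standard head and title codes, lists the tags found between the angle brackets, and runs a rough plainness check. The page text and the names PLAIN_PAGE, TagLister, list_tags, and is_plain are made up for illustration, and the check looks only for non-ASCII characters and script tags, so treat it as a quick sanity test rather than a security scanner.

```python
from html.parser import HTMLParser

# A hypothetical minimal page: plain ASCII text plus tags in angle brackets.
PLAIN_PAGE = """<html>
<head>
<title>My Plain Page</title>
</head>
<body>
<h1>My Plain Page</h1>
<p>Just text and tags, with no scripts.</p>
</body>
</html>
"""


class TagLister(HTMLParser):
    """Collects the name of every opening tag it encounters."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands over tag names already lowercased.
        self.tags.append(tag)


def list_tags(html_text):
    """Return the opening tag names found in html_text, in order."""
    parser = TagLister()
    parser.feed(html_text)
    parser.close()
    return parser.tags


def is_plain(html_text):
    """Rough check (an assumption, not a standard): ASCII only, no <script> tags."""
    return html_text.isascii() and "script" not in list_tags(html_text)


if __name__ == "__main__":
    print(list_tags(PLAIN_PAGE))
    print(is_plain(PLAIN_PAGE))
```

Saving PLAIN_PAGE to a file ending in .html gives you something any browser can open. A page that Word exports would list many more tags than the six here.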
Aliens Kidnapped My Characters!
If you've ever created an ASCII text file from a Word doc, or opened one in
Word or some other program, you probably ran into problems with Word's
special characters. What's so special about em dashes, single and double
ASCII ART
Ever see those "art" printouts of ASCII characters arranged so that they appear, at
a distance, to be a beautiful nude woman? Artists and amateurs have applied the
character set to works that range from interesting emoticons (beyond the smiley :-)
face) to extreme photo-realistic digitizations of portraits. Check out ASCII Artwork
(www.textfiles.com/art) for a gallery of ASCII art.
No Starch Press
© 2005 by Tony Bove