This is a primer for producing documents in HTML, the markup language used by the World Wide Web.
This primer assumes that you have:
HTML documents are in plain (also known as ASCII) text format and can be created using any text editor (e.g., Emacs or vi on UNIX machines). A couple of Web browsers (tkWWW for X Window System machines and CERN's Web browser for NeXT computers) include rudimentary HTML editors in a WYSIWYG environment. There are also some WYSIWYG editors available now (e.g., HotMetal for Sun Sparcstations, HTML Edit for Macintoshes). You may wish to try one of them first before delving into the details of HTML.
You can preview a document in progress with NCSA Mosaic (and some other Web browsers). Open it with the Open Local command under the File menu. After you edit the source HTML file, save the changes. Return to NCSA Mosaic and Reload the document. The changes are reflected in the on-screen display.
Here is a bare-bones example of HTML:
    <TITLE>The simplest HTML example</TITLE>
    <H1>This is a level-one heading</H1>
    Welcome to the world of HTML.
    This is one paragraph.<P>
    And this is a second.<P>
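HTML uses markup tags to tell the Web browser how to display the text. The above example uses the <TITLE> tag (and its ending tag </TITLE>), which specifies the title of the document; the <H1> heading tag (and its ending tag </H1>); and the <P> paragraph-separator tag.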
HTML tags consist of a left angle bracket (<) (a ``less than'' symbol to mathematicians), followed by the name of the tag and closed by a right angle bracket (>). Tags are usually paired, e.g., <H1> and </H1>. The ending tag looks just like the starting tag except a slash (/) precedes the text within the brackets. In the example, <H1> tells the Web browser to start formatting a level-one heading; </H1> tells the browser that the heading is complete.
The primary exception to the pairing rule is the <P> tag. There is no such thing as </P>.
NOTE: HTML is not case sensitive. <title> is equivalent to <TITLE> or <TiTlE>.
Not all tags are supported by all World Wide Web browsers. If a browser does not support a tag, it just ignores it.
Every HTML document should have a title. A title is generally displayed separately from the document and is used primarily for document identification in other contexts (e.g., a WAIS search). Choose about half a dozen words that describe the document's purpose.
In the X Window System and Microsoft Windows versions of NCSA Mosaic, the Document Title field is at the top of the screen just below the pulldown menus. In NCSA Mosaic for Macintosh, text tagged as <TITLE> appears as the window title.
HTML has six levels of headings, numbered 1 through 6, with 1 being the most prominent. Headings are displayed in larger and/or bolder fonts than normal body text. The first heading in each document should be tagged <H1>. The syntax of the heading tag is:
<Hy>Text of heading</Hy>
where y is a number between 1 and 6 specifying the level of the heading.
For example, the coding for the ``Headings'' section heading above is
<H3>Headings</H3>
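As a further illustration, here is a minimal sketch of three heading levels used together. The heading texts are invented for this example and do not come from any real document:

```html
<H1>User Manual</H1>
<H2>Installation</H2>
<H3>System Requirements</H3>
```

Each level is normally displayed less prominently than the one above it, although the exact fonts depend on the browser.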
In many documents, the first heading is identical to the title. For multipart documents, the text of the first heading should be suitable for a reader who is already browsing related information (e.g., a chapter title), while the title tag should identify the document in a wider context (e.g., include both the book title and the chapter title, although this can sometimes become overly long).
Unlike documents in most word processors, carriage returns in HTML files aren't significant. Word wrapping can occur at any point in your source file, and multiple spaces are collapsed into a single space. (There are a couple of exceptions; space following a <P> or <Hy> tag, for example, is ignored.) Notice that in the bare-bones example, the first paragraph is coded as
    Welcome to the world of HTML.
    This is one paragraph.<P>
In the source file, there is a line break between the sentences. A Web browser ignores this line break and starts a new paragraph only when it reaches a <P> tag.
Important: You must separate paragraphs with <P>. The browser ignores any indentations or blank lines in the source text. HTML relies almost entirely on the tags for formatting instructions, and without the <P> tags, the document becomes one large paragraph. (The exception is text tagged as ``preformatted,'' which is explained below.) For instance, the following would produce the same output as the first bare-bones HTML example:
    <TITLE>The simplest HTML example</TITLE><H1>This is a level-one heading</H1>Welcome to the world of HTML. This is one paragraph.<P>And this is a second.<P>
However, to preserve readability in HTML files, headings should be on separate lines, and paragraphs should be separated by blank lines (in addition to the <P> tags).
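One way the bare-bones example might be laid out following this advice is sketched below. The particular spacing is only a suggestion; a browser should display it the same way as the one-line version:

```html
<TITLE>The simplest HTML example</TITLE>

<H1>This is a level-one heading</H1>

Welcome to the world of HTML.
This is one paragraph.<P>

And this is a second.<P>
```

The blank lines are there only for the person editing the file; the <P> tags still do all the work of separating paragraphs.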
NCSA Mosaic handles <P> by ending the current paragraph and inserting a blank line.
In HTML+, a successor to HTML currently in development, <P> becomes a ``container'' of text, just as the text of a level-one heading is ``contained'' within <H1> ... </H1>:
<P> This is a paragraph in HTML+. </P>
The difference is that the </P> closing tag can always be omitted. (That is, if a browser sees a <P>, it knows that there must be an implied </P> to end the previous paragraph.) In other words, in HTML+, <P> is a beginning-of-paragraph marker.
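A rough sketch of two HTML+ paragraphs with the closing tags left out follows. HTML+ is still in development, so treat this as an illustration of the idea rather than settled syntax; the paragraph text is invented:

```html
<P> This is the first paragraph.
<P> This is the second paragraph. The first one ended implicitly
when the browser reached this tag.
```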
The advantage of this change is that you will be able to specify formatting options for a paragraph. For example, in HTML+, you will be able to center a paragraph by coding
<P ALIGN=CENTER> This is a centered paragraph. This is HTML+, so you can't do it yet.
This change won't affect any documents you write now, and they will continue to look just the same with HTML+ browsers.
The chief power of HTML comes from its ability to link regions of text (and also images) to another document. The browser highlights these regions (usually with color and/or underlines) to indicate that they are hypertext links (often shortened to hyperlinks or simply links).
HTML's single hypertext-related tag is <A>, which stands for anchor. To include an anchor in your document, start the anchor with <A, name the document being pointed to with the parameter HREF="filename" followed by a closing right angle bracket (>), enter the text that will serve as the hypertext link, and end with the </A> tag.
Here is a sample hypertext reference:
<A HREF="MaineStats.html">Maine</A>
This entry makes the word ``Maine'' the hyperlink to the document MaineStats.html, which is in the same directory as the first document. You can link to documents in other directories by specifying the relative path from the current document to the linked document. For example, a link to a file NJStats.html located in the subdirectory AtlanticStates would be:
<A HREF="AtlanticStates/NJStats.html">New Jersey</A>
These are called relative links. You can also use the absolute pathname of the file if you wish. Pathnames use the standard UNIX syntax.
In general, you should use relative links, because they remain valid when a group of related documents is moved together to another directory.
However, use absolute pathnames when linking to documents that are not directly related. For example, consider a group of documents that comprise a user manual. Links within this group should be relative links. Links to other documents (perhaps a reference to related software) should use full path names. This way, if you move the user manual to a different directory, none of the links would have to be updated.
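The user manual case could be sketched as follows. The file names and the directory /software/tools are hypothetical, chosen only to illustrate the two kinds of link:

```html
<!-- Inside the manual: a relative link, still valid if the whole manual moves -->
See <A HREF="Chapter2.html">Chapter 2</A> for installation steps.<P>

<!-- Outside the manual: a full path name, unaffected by where the manual lives -->
See also the <A HREF="/software/tools/Overview.html">related software</A>.<P>
```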
The World Wide Web uses Uniform Resource Locators (URLs) to specify the location of files on other servers. A URL includes the type of resource being accessed (e.g., gopher, WAIS), the address of the server, and the location of the file. The syntax is:
    scheme://host.domain[:port]/path/filename
where scheme names the type of resource being accessed (for example, http, gopher, or WAIS). The port number can generally be omitted. (That means unless someone tells you otherwise, leave it out.)
For example, to include a link to this primer in your document, you would use
    <A HREF="http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimer.html">NCSA's Beginner's Guide to HTML</A>
This would make the text ``NCSA's Beginner's Guide to HTML'' a hyperlink to this document.
Links to Specific Sections in Other Documents
Anchors can also be used to move to a particular section in a document. Suppose you wish to set a link from document A to a specific section in document B. (Call this file documentB.html.) First you need to set up a named anchor in document B. For example, to set up an anchor named ``Jabberwocky'' in document B, enter
    Here's <A NAME="Jabberwocky">some text</A>
Now when you create the link in document A, include not only the filename, but also the named anchor, separated by a hash mark (#).
    This is my <A HREF="documentB.html#Jabberwocky">link</A> to document B.
Now clicking on the word ``link'' in document A sends the reader directly to the words ``some text'' in document B.
Links to Specific Sections Within the Current Document
The technique is exactly the same except the filename is omitted. For example, to link to the Jabberwocky anchor from within the same file (Document B), use
    This is <A HREF="#Jabberwocky">Jabberwocky link</A> from within Document B.
Additional Markup Tags
The preceding is sufficient to produce simple HTML documents. For more complex documents, HTML has tags for several types of lists, preformatted sections, extended quotations, character formatting, and other items.
Lists
HTML supports unnumbered, numbered, and definition lists.
Unnumbered Lists
To make an unnumbered list, start with an opening <UL> tag, enter the <LI> tag followed by each individual item, and end with a closing </UL> tag. Below is an example two-item list:
    <UL>
    <LI> apples
    <LI> bananas
    </UL>
The output is:
    * apples
    * bananas
The <LI> items can contain multiple paragraphs. Just separate the paragraphs with the <P> paragraph tags.
Numbered Lists
A numbered list (also called an ordered list, from which the tag name derives) is identical to an unnumbered list, except it uses <OL> instead of <UL>. The items are tagged using the same <LI> tag. The following HTML code
    <OL>
    <LI> oranges
    <LI> peaches
    <LI> grapes
    </OL>
produces this formatted output:
    1. oranges
    2. peaches
    3. grapes
Definition Lists
A definition list usually consists of alternating a term (abbreviated as DT) and a definition (abbreviated as DD). Web browsers generally format the definition on a new line. The following is an example of a definition list:
    <DL>
    <DT> NCSA
    <DD> NCSA, the National Center for Supercomputing Applications, is located on the campus of the University of Illinois at Urbana-Champaign. NCSA is one of the participants in the National MetaCenter for Computational Science and Engineering.
    <DT> Cornell Theory Center
    <DD> CTC is located on the campus of Cornell University in Ithaca, New York. CTC is another participant in the National MetaCenter for Computational Science and Engineering.
    </DL>
The output looks like:
    NCSA
        NCSA, the National Center for Supercomputing Applications, is located on the campus of the University of Illinois at Urbana-Champaign. NCSA is one of the participants in the National MetaCenter for Computational Science and Engineering.
    Cornell Theory Center
        CTC is located on the campus of Cornell University in Ithaca, New York. CTC is another participant in the National MetaCenter for Computational Science and Engineering.
The <DT> and <DD> entries can contain multiple paragraphs (separated by <P> paragraph tags), lists, or other definition information.
Nested Lists
Lists can be arbitrarily nested, although in practice you probably should limit the nesting to three levels. You can also have a number of paragraphs, each containing a nested list, in a single list item. An example nested list:
    <UL>
    <LI> A few New England states:
        <UL>
        <LI> Vermont
        <LI> New Hampshire
        </UL>
    <LI> One Midwestern state:
        <UL>
        <LI> Michigan
        </UL>
    </UL>
The nested list is displayed as
    * A few New England states:
        * Vermont
        * New Hampshire
    * One Midwestern state:
        * Michigan
Preformatted Text
Use the <PRE> tag (which stands for ``preformatted'') to generate text in a fixed-width font and cause spaces, new lines, and tabs to be significant. (That is, multiple spaces are displayed as multiple spaces, and lines break in the same locations as in the source HTML file.) This is useful for program listings. For example, the following lines
    <PRE>
    #!/bin/csh
    cd $SCR
    cfs get mysrc.f:mycfsdir/mysrc.f
    cfs get myinfile:mycfsdir/myinfile
    fc -02 -o mya.out mysrc.f
    mya.out
    cfs save myoutfile:mycfsdir/myoutfile
    rm *
    </PRE>
display as
    #!/bin/csh
    cd $SCR
    cfs get mysrc.f:mycfsdir/mysrc.f
    cfs get myinfile:mycfsdir/myinfile
    fc -02 -o mya.out mysrc.f
    mya.out
    cfs save myoutfile:mycfsdir/myoutfile
    rm *
Hyperlinks can be used within <PRE> sections. You should avoid using other HTML tags within <PRE> sections, however.
Note that because <, >, and & have special meaning in HTML, you have to use their escape sequences (&lt;, &gt;, and &amp;, respectively) to enter these characters. See the section Special Characters for more information.
Extended Quotations
Use the <BLOCKQUOTE> tag to include quotations in a separate block on the screen. Most browsers indent the quotation to separate it from surrounding text. An example:
    <BLOCKQUOTE>
    I still have a dream. It is a dream deeply rooted in the American dream. <P>
    I have a dream that one day this nation will rise up and live out the true meaning of its creed. We hold these truths to be self-evident that all men are created equal. <P>
    </BLOCKQUOTE>
The result is:
    I still have a dream. It is a dream deeply rooted in the American dream.
    I have a dream that one day this nation will rise up and live out the true meaning of its creed. We hold these truths to be self-evident that all men are created equal.
Addresses
The <ADDRESS> tag is generally used to specify the author of a document and a means of contacting the author (e.g., an email address). This is usually the last item in a file. For example, the last line of the online version of this guide is
    <ADDRESS> A Beginner's Guide to HTML / NCSA / pubs@ncsa.uiuc.edu </ADDRESS>
The result is
    A Beginner's Guide to HTML / NCSA / pubs@ncsa.uiuc.edu
NOTE: <ADDRESS> is not used for postal addresses. See ``Forced Line Breaks'' below to see how to format postal addresses.
Character Formatting
You can code individual words or sentences with special styles. There are two types of styles: logical and physical. Logical styles tag text according to its meaning, while physical styles specify the specific appearance of a section. For example, in the preceding sentence, the words ``logical styles'' were tagged as a ``definition.'' The same effect (formatting those words in italics) could have been achieved via a different tag that specifies merely ``put these words in italics.''
Physical Versus Logical: Use Logical Styles When Possible
If physical and logical styles produce the same result on the screen, why are there both? We devolve, for a couple of paragraphs, into the philosophy of SGML, which can be summed up in a Zen-like mantra: ``Trust your browser.''
In the ideal SGML universe, content is divorced from presentation. Thus, SGML tags a level-one heading as a level-one heading, but does not specify that the level-one heading should be, for instance, 24-point bold Times centered on the top of a page. The advantage of this approach (it's similar in concept to style sheets in many word processors) is that if you decide to change level-one headings to be 20-point left-justified Helvetica, all you have to do is change the definition of the level-one heading in the presentation device (i.e., your World Wide Web browser).
The other advantage of logical tags is that they help enforce consistency in your documents. It's easier to tag something as <H1> than to remember that level-one headings are 24-point bold Times or whatever. The same is true for character styles. For example, consider the <STRONG> tag. Most browsers render it in bold text. However, it is possible that a reader would prefer that these sections be displayed in red instead. Logical styles offer this flexibility.
Logical Styles
Logical style tags used in this guide include <DFN> (for a word being defined; typically displayed in italics) and <STRONG> (for strong emphasis; typically displayed in bold).
Physical Styles
Physical style tags used in this guide include <B> (bold text) and <I> (italic text).
Using Character Tags
To apply a character style, enclose the text between the style's starting and ending tags, as in <B>bold</B>.
Special Characters
Escape Sequences
Four characters of the ASCII character set -- the left angle bracket (<), the right angle bracket (>), the ampersand (&) and the double quote (") -- have special meaning within HTML and therefore cannot be used ``as is'' in text. (The angle brackets are used to indicate the beginning and end of HTML tags, and the ampersand is used to indicate the beginning of an escape sequence.)
To use one of these characters in an HTML document, you must enter its escape sequence instead:
    &lt;     the escape sequence for <
    &gt;     the escape sequence for >
    &amp;    the escape sequence for &
    &quot;   the escape sequence for "
Additional escape sequences support accented characters. For example:
    &ouml;    the escape sequence for a lowercase o with an umlaut
    &ntilde;  the escape sequence for a lowercase n with a tilde
    &Egrave;  the escape sequence for an uppercase E with a grave accent
A full list of supported characters can be found at CERN.
NOTE: Unlike the rest of HTML, the escape sequences are case sensitive. You cannot, for instance, use &LT; instead of &lt;.
Forced Line Breaks
The <BR> tag forces a line break with no extra space between lines. (By contrast, most browsers format the <P> paragraph tag with an additional blank line to more clearly indicate the beginning of the new paragraph.)
One use of <BR> is in formatting addresses:
    National Center for Supercomputing Applications<BR>
    605 East Springfield Avenue<BR>
    Champaign, Illinois 61820-5518<BR>
Horizontal Rules
The <HR> tag produces a horizontal line the width of the browser window.
In-line Images
Most Web browsers can display in-line images (that is, images next to text) that are in X Bitmap (XBM) or GIF format. Each image takes time to process and slows down the initial display of the document, so generally you should not include too many or overly large images.
To include an in-line image, use
    <IMG SRC=image_URL>
where image_URL is the URL of the image file. The syntax for IMG SRC URLs is identical to that used in an anchor HREF. If the image file is a GIF file, then the filename part of image_URL must end with .gif. Filenames of X Bitmap images must end with .xbm.
By default the bottom of an image is aligned with the text.
Add the ALIGN=TOP option if you want the browser to align adjacent text with the top of the image. The full in-line image tag with the top alignment is:
    <IMG ALIGN=top SRC=image_URL>
ALIGN=MIDDLE aligns the text with the center of the image.
Alternate Text for Browsers That Can't Display Images
Some World Wide Web browsers, primarily those that run on VT100 terminals, cannot display images. The ALT option allows you to specify text to be displayed when an image cannot be. For example:
    <IMG SRC="UpArrow.gif" ALT="Up">
where UpArrow.gif is the picture of an upward pointing arrow. With NCSA Mosaic and other graphics-capable viewers, the user sees the up arrow graphic. With a VT100 browser, such as lynx, the user sees the word ``Up.''
External Images, Sounds, and Animations
You may want to have an image open as a separate document when a user activates a link on either a word or a smaller, in-line version of the image included in your document. This is considered an external image and is useful if you do not wish to slow down the loading of the main document with large in-line images.
To include a reference to an external image, use
    <A HREF=image_URL>link anchor</A>
Use the same syntax for links to external animations and sounds. The only difference is the file extension of the linked file. For example,
    <A HREF="QuickTimeMovie.mov">link anchor</A>
specifies a link to a QuickTime movie. File types and extensions mentioned in this guide include GIF images (.gif), X Bitmap images (.xbm), and QuickTime movies (.mov).
Make sure your intended audience has the necessary viewers. Most UNIX workstations, for instance, cannot view QuickTime movies.
Troubleshooting
Avoid Overlapping Tags
Consider this snippet of HTML:
    <B>This is an example of <DFN>overlapping</B> HTML tags.</DFN>
The word ``overlapping'' is contained within both the <B> and <DFN> tags. How does the browser format it? You won't know until you look, and different browsers will likely react differently. In general, avoid overlapping tags.
Embed Anchors and Character Tags, But Nothing Else
It is acceptable to embed anchors within another HTML element:
    <H1><A HREF="Destination.html">My heading</A></H1>
Do not embed a heading or another HTML element within an anchor:
    <A HREF="Destination.html">
    <H1>My heading</H1>
    </A>
Although most browsers currently handle this example, it is forbidden by the official HTML and HTML+ specifications, and will not work with future browsers.
Character tags modify the appearance of other tags:
    <UL><LI><B>A bold list item</B>
        <UL>
        <LI><I>An italic list item</I>
        </UL>
    </UL>
However, avoid embedding other types of HTML element tags. For example, it is tempting to embed a heading within a list, in order to make the font size larger:
    <UL><LI><H1>A large heading</H1>
        <UL>
        <LI><H2>Something slightly smaller</H2>
        </UL>
    </UL>
Although some browsers, such as NCSA Mosaic for the X Window System, format this construct quite nicely, it is unpredictable (because it is undefined) what other browsers will do. For compatibility with all browsers, avoid these kinds of constructs.
What's the difference between embedding a <B> within a <LI> tag as opposed to embedding a <H1> within a <LI>? This is again a question of SGML. The semantic meaning of <H1> is that it's the main heading of a document and that it should be followed by the content of the document. Thus it doesn't make sense to find a <H1> within a list.
Character formatting tags also are generally not additive. You might expect that
    <B><I>some text</I></B>
would produce bold-italic text. On some browsers it does; other browsers interpret only the innermost tag (here, the italics).
Check Your Links
When an <IMG> tag points at an image that does not exist, a dummy image is substituted. When this happens, make sure that the referenced image does in fact exist, that the hyperlink has the correct information in the URL, and that the file permission is set appropriately (world-readable).
A Longer Example
Here is a longer example of an HTML document:
    <HEAD>
    <TITLE>A Longer Example</TITLE>
    </HEAD>
    <BODY>
    <H1>A Longer Example</H1>
    This is a simple HTML document. This is the first paragraph. <P>
    This is the second paragraph, which shows special effects. This is a word in <I>italics</I>. This is a word in <B>bold</B>. Here is an in-lined GIF image: <IMG SRC="myimage.gif">. <P>
    This is the third paragraph, which demonstrates links. Here is a hypertext link from the word <A HREF="subdir/myfile.html">foo</A> to a document called "subdir/myfile.html". (If you try to follow this link, you will get an error screen.) <P>
    <H2>A second-level header</H2>
    Here is a section of text that should display as a fixed-width font: <P>
    <PRE>
        On the stiff twig up there
        Hunches a wet black rook
        Arranging and rearranging its feathers in the rain ...
    </PRE>
    This is an unordered list with two items: <P>
    <UL>
    <LI> cranberries
    <LI> blueberries
    </UL>
    This is the end of my example document. <P>
    <ADDRESS>Me (me@mycomputer.univ.edu)</ADDRESS>
    </BODY>
In addition to tags already discussed, this example also uses the <HEAD> ... </HEAD> and <BODY> ... </BODY> tags, which separate the document into introductory information about the document and the main text of the document. These tags don't change the appearance of the formatted document at all, but are useful for several purposes (NCSA Mosaic for Macintosh 2.0, for example, allows you to browse just the header portion of a document before deciding whether to download the rest), and it is recommended that you use these tags.
For More Information
This guide is only an introduction to HTML and not a comprehensive reference.
Fill-out Forms
One major feature not discussed here is fill-out forms, which allow users to return information to the World Wide Web server. For information on fill-out forms, see the Fill-out Forms Overview.
National Center for Supercomputing Applications / pubs@ncsa.uiuc.edu