Friday, August 15, 2008

Thing to Know Before You Code, Part Two: HTML

Now we dive into the part that scares the bejeebers out of most folks: the coding.

Open up Notepad (or slide on over to the Real-Time HTML Editor).

Copy the text below and paste it into Notepad. Save the file as "index.html".

<title>my first page</title>
<p>Well, look at me! I made a web page!</p>

Go to the directory on your computer where you saved "index.html" and double-click on it--it should now open up in your browser, with your sentence on the page and "my first page" in the browser's title bar (or tab).

Now that you've done the hard part and gotten started, let's go over what you did.

HTML stands for Hyper Text Markup Language. "Hyper" means it bounces, via links, from one page to another. "Text" is thrown in because this is all written in text, instead of like, ick, machine code (binary is *soooo* not my friend). "Markup" comes into play because you're using <tags> to "mark up" the stuff you're writing. "Language" just means that there is a method to the madness, a structure and a syntax. On the plus side, the vocabulary here is really limited, so there's not much you have to learn to get started "talking" in HTML.

So, on to the syntax!

Basic rules of HTML

  1. Use the vocabulary
    W3Schools is a popular reference, though despite the name it isn't affiliated with the W3C, the folks who actually write the web standards. I prefer the SitePoint HTML (beta) reference because the writing is clearer and there are comments for each article, which allow for peer review (and extra "how to use it" tips).
  2. Nearly every "markup" has an opening tag and a closing tag, each framed by less-than and greater-than signs.
    <tag>stuff the tag wraps around</tag>, or in actual practice:
    <p>Everything between the "p" tag at the front and the "/p" tag at the back is part of one paragraph. If you wanted to separate this into two paragraphs, you would need to close the first "p" tag (with a "/p") and open up a new one.</p>
  3. "Markup" should be properly "nested"
    RIGHT: <b><em>Bold and Emphasized Text</em></b>
    WRONG: <b><em>Bold and Emphasized Text</b></em>

That "should be" hits a key point in HTML. So long as you're using the right tags, most browsers--where your web pages are interpreted--will understand the WRONG example and show it the way the author intended, at least for text formatting markup. When you start nesting tags that tell the browser about the framework of your page, getting it wrong can screw up the page's layout.
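To see what that looks like with framework tags, here is a small sketch. The "div" and "blockquote" tags are my own picks for illustration; they aren't part of the example page above.

```html
<!-- RIGHT: the blockquote opens and closes inside the div -->
<div>
  <blockquote>
    <p>A quoted paragraph.</p>
  </blockquote>
</div>

<!-- WRONG: the div closes while the blockquote is still open -->
<div>
  <blockquote>
    <p>A quoted paragraph.</p>
</div>
  </blockquote>
```

A validator will flag the second version. How a browser patches it up varies, so whatever comes after it on the page may not land where you expected.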

There are also these things called validators that are rather like automated proofreaders for your HTML coding. They're great for helping you figure out where the problems are in your pages--and, hey, everyone typos at some point, right?

One last point to make before we go back to explanations: Cascading Style Sheets (CSS) are for beautifying the page; HTML is for telling it like it is. Headings are headings, paragraphs are paragraphs, bulleted lists and block quotes and inline quotes and ... well, you get the picture. There are still some presentational tags in HTML, but these are being weeded out. Using these tags--font, strike-through, underline, et al.--means setting yourself up to rewrite your entire site when the support for these tags finally gets dropped. I'm going to keep it simple and show you the stuff from the latest implemented HTML standard, version 4.01.
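Here's a rough sketch of the difference. The class name "notice" and the styling are made up for illustration, and the style rule would normally live in the head of the page or in a separate style sheet.

```html
<!-- Presentational markup: the look is baked into the HTML -->
<p><font color="red"><u>Sale ends Friday!</u></font></p>

<!-- The CSS rule that says how a "notice" looks (this block belongs in the head) -->
<style type="text/css">
  .notice { color: red; text-decoration: underline; }
</style>

<!-- Structural markup: the HTML only says this is a paragraph that is a notice -->
<p class="notice">Sale ends Friday!</p>
```

With the second approach, changing the look of every notice on the site means editing one CSS rule instead of hunting down every font tag.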

So, back to what you did.

Every page should start with a DOCTYPE declaration. A DOCTYPE tells the browser, "Here are the rules this page plays by." Without the DOCTYPE, most modern browsers are going to assume your page was written in a free-for-all style, which means it'll treat the page like it came out of the 1990s. It's called "Quirks Mode" because each browser developed then had its own quirky way of interpreting HTML. It was icky--not as bad as machine code, but still icky.

Let's take a closer look:

The first part, "!DOCTYPE HTML PUBLIC", tells the browser that this is the DOCTYPE declaration (it points to a Document Type Definition, a.k.a. DTD), that it's for an HTML page, and that the definition is a public definition--not something proprietary (that would be SYSTEM and the next part would be omitted). The second part, "-//W3C//DTD HTML 4.01//EN", tells it that the W3C (World Wide Web Consortium) drafted the definition, that it's a definition for the 4.01 version of HTML, and that it was written in English. The last part, the quoted URL at the end, points to where the machine-readable DTD specifications are hosted.
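Put back together, the declaration described above would look something like the sketch below. The URL is an assumption on my part: it is the address the W3C publishes for the HTML 4.01 Strict DTD, which is the one that goes with the "-//W3C//DTD HTML 4.01//EN" identifier.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<!-- First part:  !DOCTYPE HTML PUBLIC says this is a public definition for an HTML page -->
<!-- Second part: the quoted identifier names the W3C, HTML 4.01, and English -->
<!-- Last part:   the quoted URL is where the machine-readable DTD lives (assumed here) -->
```

The comments sit after the declaration on purpose: the DOCTYPE should be the very first thing in the file.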

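By the by, the DOCTYPE is a declaration rather than a true tag, so you never close it. (A handful of "empty" tags, like "br", "img", and "meta", don't get closing tags in HTML 4.01 either.)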

Next up are the <html> tags. The opening "html" tag goes just under the DOCTYPE declaration and the closing "/html" goes at the very end of the web page. No exceptions (especially if you're still learning =D). These tags tell the browser that everything in the page is web page stuff.

The "head" tags come next. Inside the head, you should always have at least a title for the page, contained in opening and closing "title" tags. You also want to drop your "meta" tags in here. Metadata is data about data. More will be made of these tags in a later article. Last but not least, links to accessory files, like your style sheet and any web scripts, get planted in the head. The "/head" tag closes before the "body" tag opens -- rather like the chin coming before the torso.
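As a sketch, a head with all of those pieces might look like this. The file names "style.css" and "script.js" and the description text are placeholders I made up, not files from this tutorial.

```html
<head>
  <!-- The title shows up in the browser's title bar -->
  <title>my first page</title>

  <!-- Meta tags: data about the page itself -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <meta name="description" content="A practice page for learning HTML.">

  <!-- Links to accessory files: a style sheet and a script -->
  <link rel="stylesheet" type="text/css" href="style.css">
  <script type="text/javascript" src="script.js"></script>
</head>
```

None of this appears in the browser window itself; only the title turns up, and that is in the title bar.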

All of your content, all the beautiful, wonderful, BUY-MY-BOOKS-NOW stuff goes inside your "body" tags. If you want the content to show up in the web browser, you put it between the body tags, which means your page structure is also going to go in here.
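Putting the pieces from this article together, the whole page would look something like this sketch. I'm assuming the HTML 4.01 Strict DOCTYPE here, and the comments are only there to label the parts.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <!-- The head holds information about the page -->
  <title>my first page</title>
</head>
<body>
  <!-- The body holds everything that shows up in the browser window -->
  <p>Well, look at me! I made a web page!</p>
</body>
</html>
```

Save it as "index.html" and open it the same way as before; it should look the same in the browser, but now it plays by the rules.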

And that means, time for the next article.
