Differences Between XML and HTML

XML | HTML
User-definable tags | Defined set of tags designed for web display
Content driven | Format driven
End tags required for well-formed documents | End tags not required
Quotes required around attribute values | Quotes not required
Slash required in empty tags | Slash not required
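
The syntax rules on the XML side of this comparison (end tags, quoted attribute values, a slash in empty tags) are what an XML parser enforces. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: HTML-style shortcuts that XML does not allow,
# each followed by an XML-conformant version.
samples = {
    "missing end tag": "<list><item>one</list>",
    "end tag present": "<list><item>one</item></list>",
    "unquoted attribute": "<img src=logo.png/>",
    "quoted attribute": '<img src="logo.png"/>',
    "empty tag without slash": "<p>line one<br>line two</p>",
    "empty tag with slash": "<p>line one<br/>line two</p>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

An HTML browser would typically tolerate all six samples; a conforming XML parser rejects the three that break the rules.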

Why should I use XML?

Here are a few reasons for using XML (in no particular order). Not all of these will apply to your own requirements, and you may have additional reasons not mentioned here (if so, please let the editor of the FAQ know!).
* XML can be used to describe and identify information accurately and unambiguously, in a way that computers can be programmed to ‘understand’ (well, at least manipulate as if they could understand).
* XML allows documents which are all the same type to be created consistently and without structural errors, because it provides a standardised way of describing, controlling, or allowing/disallowing particular types of document structure. [Note that this has absolutely nothing whatever to do with formatting, appearance, or the actual text content of your documents, only the structure of them.]
* XML provides a robust and durable format for information storage and transmission. Robust because it is based on a proven standard, and can thus be tested and verified; durable because it uses plain-text file formats which will outlast proprietary binary ones.
* XML provides a common syntax for messaging systems for the exchange of information between applications. Previously, each messaging system had its own format and all were different, which made inter-system messaging unnecessarily messy, complex, and expensive. If everyone uses the same syntax it makes writing these systems much faster and more reliable.
* XML is free. Not just free of charge (free as in beer) but free of legal encumbrances (free as in speech). It doesn’t belong to anyone, so it can’t be hijacked or pirated. And you don’t have to pay a fee to use it (you can of course choose to use commercial software to deal with it, for lots of good reasons, but you don’t pay for XML itself).
* XML information can be manipulated programmatically (under machine control), so XML documents can be pieced together from disparate sources, or taken apart and re-used in different ways. They can be converted into almost any other format with no loss of information.
* XML lets you separate form from content. Your XML file contains your document information (text, data) and identifies its structure: your formatting and other processing needs are identified separately in a stylesheet or processing system. The two are combined at output time to apply the required formatting to the text or data identified by its structure (location, position, rank, order, or whatever).
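
As a rough illustration of the point about programmatic manipulation, the sketch below pieces one document together from two separate fragments and then converts it to another format (JSON). It uses only Python's standard library; the book/catalog element names, the sample data, and the function names are assumptions invented for this example.

```python
import json
import xml.etree.ElementTree as ET

# Two hypothetical fragments, standing in for data from disparate sources.
SOURCE_A = '<book id="b1"><title>XML Basics</title><price>20</price></book>'
SOURCE_B = '<book id="b2"><title>SGML History</title><price>35</price></book>'


def merge_books(*fragments: str) -> ET.Element:
    """Piece several <book> fragments together under one <catalog> root."""
    catalog = ET.Element("catalog")
    for fragment in fragments:
        catalog.append(ET.fromstring(fragment))
    return catalog


def catalog_to_records(catalog: ET.Element) -> list:
    """Take the catalog apart again and re-use it as plain Python records."""
    return [
        {
            "id": book.get("id"),
            "title": book.findtext("title"),
            "price": float(book.findtext("price")),
        }
        for book in catalog.findall("book")
    ]


catalog = merge_books(SOURCE_A, SOURCE_B)
records = catalog_to_records(catalog)

print(ET.tostring(catalog, encoding="unicode"))  # the combined XML document
print(json.dumps(records, indent=2))             # the same data as JSON
```

Because the structure is explicit in the markup, the conversion code only needs to know the element names, not how the data will eventually be displayed.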

What is DOM and how does it relate to XML?

The Document Object Model (DOM) is an interface specification maintained by the W3C DOM Working Group that defines an application-independent mechanism to access, parse, or update XML data. In simple terms, it is a hierarchical (tree) model that allows developers to manipulate XML documents easily. Any developer who has worked extensively with XML should be able to discuss the concept and use of DOM objects freely. Additionally, it is not unreasonable to expect advanced candidates to thoroughly understand its internal workings and be able to explain how DOM differs from an event-based interface like SAX.
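
To make the DOM/SAX contrast concrete, here is a minimal sketch using Python's standard xml.dom.minidom and xml.sax modules. The order/item document and the names dom_update, ItemCounter, and sax_total are hypothetical, chosen only for this illustration.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document.
DOC = "<order><item qty='2'>pen</item><item qty='1'>ink</item></order>"


def dom_update(text: str):
    """DOM: load the whole document as a tree, then navigate and modify it."""
    dom = minidom.parseString(text)
    items = dom.getElementsByTagName("item")
    names = [item.firstChild.data for item in items]
    items[0].setAttribute("qty", "5")  # in-place update of the tree
    return names, dom.documentElement.toxml()


class ItemCounter(xml.sax.ContentHandler):
    """SAX: react to parse events as they stream past; no tree is built."""

    def __init__(self):
        super().__init__()
        self.total = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.total += int(attrs.getValue("qty"))


def sax_total(text: str) -> int:
    """Sum the qty attributes of all <item> elements using SAX events."""
    handler = ItemCounter()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.total


names, updated = dom_update(DOC)
print(names)           # item names read from the tree
print(updated)         # serialised tree after the update
print(sax_total(DOC))  # total quantity gathered from SAX events
```

The DOM version keeps the entire document in memory, which makes random access and updates straightforward; the SAX version holds only the state the handler chooses to keep, which suits large documents that are read once.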

Why is XML such an important development?

It removes two constraints which were holding back Web developments:
1. dependence on a single, inflexible document type (HTML) which was being much abused for tasks it was never designed for;
2. the complexity of full SGML, whose syntax allows many powerful but hard-to-program options.
XML allows the flexible development of user-defined document types. It provides a robust, non-proprietary, persistent, and verifiable file format for the storage and transmission of text and data both on and off the Web; and it removes the more complex options of SGML, making it easier to program for.

What is a markup language?

A markup language is a set of words and symbols for describing the identity of pieces of a document (for example ‘this is a paragraph’, ‘this is a heading’, ‘this is a list’, ‘this is the caption of this figure’, etc). Programs can use this with a style sheet to create output for screen, print, audio, video, Braille, etc.
Some markup languages (e.g. those used in word processors) only describe appearances (‘this is italics’, ‘this is bold’), but this method can only be used for display, and is not normally re-usable for anything else.
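
The difference can be sketched in a few lines of Python: the markup below says only what each piece is, and two separate style mappings decide how it appears. The doc/heading/para element names and the two "style sheets" (plain dictionaries here, not a real stylesheet language) are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Descriptive markup: identifies the parts, says nothing about appearance.
DOC = "<doc><heading>Markup</heading><para>Tags name the parts.</para></doc>"

# Two hypothetical style mappings: element name -> formatting rule.
PLAIN_STYLE = {
    "heading": lambda text: text.upper(),
    "para": lambda text: "  " + text,
}
HTML_STYLE = {
    "heading": lambda text: f"<h1>{text}</h1>",
    "para": lambda text: f"<p>{text}</p>",
}


def render(text: str, style: dict) -> str:
    """Apply a style mapping to each top-level element of the document."""
    root = ET.fromstring(text)
    return "\n".join(style[element.tag](element.text or "") for element in root)


print(render(DOC, PLAIN_STYLE))
print(render(DOC, HTML_STYLE))
```

The same document feeds both outputs unchanged. Markup that recorded only 'this is bold' would give a program nothing to go on when choosing a different rendering.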
