XML INTRODUCTION

 


  1. INTRODUCTION
  2. XML TREE
  3. XML SYNTAX RULES
  4. XML ELEMENTS
  5. XML ATTRIBUTES
  6. XML NAMESPACES
  7. XML DISPLAY
  8. XML PARSER

1. INTRODUCTION

XML stands for Extensible Markup Language. It is a text-based markup language derived from Standard Generalized Markup Language (SGML).
XML tags identify the data and are used to store and organize it. Unlike HTML tags, they do not specify how the data is displayed. XML is not going to replace HTML in the near future, but it introduces new possibilities by adopting many successful features of HTML.
There are three important characteristics of XML that make it useful in a variety of systems and solutions:
  1. XML is extensible: XML allows you to create your own self-descriptive tags, or your own language, to suit your application.
  2. XML carries data; it does not present it: XML allows you to store data irrespective of how it will be presented.
  3. XML is a public standard: XML was developed by an organization called the World Wide Web Consortium (W3C) and is available as an open standard.
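
As a minimal sketch of what "self-descriptive" means, the fragment below stores a short note using tag names invented for the purpose. The names note, to, from, heading, and body are illustrative choices, not predefined XML tags, and nothing in the markup says how the note should be displayed:

```xml
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
```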

2. XML TREE


[Figure: DOM node tree]

XML Tree Structure

  • XML documents are formed as element trees.
  • An XML tree starts at a root element and branches from the root to child elements.
  • All elements can have sub elements (child elements):

<root>
  <child>
    <subchild>.........</subchild>
  </child>
</root>

  • The terms parent, child, and sibling are used to describe the relationships between elements.
  • Parents have children. Children have parents. Siblings are children on the same level (brothers and sisters).
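
The relationships above can be sketched on a small tree. The element names are made up for illustration, and the comments label each relationship:

```xml
<family>                <!-- root element: parent of both <person> elements -->
  <person>              <!-- child of <family>, sibling of the other <person> -->
    <name>Ann</name>    <!-- child of <person>, sibling of <age> -->
    <age>34</age>
  </person>
  <person>
    <name>Bob</name>
    <age>36</age>
  </person>
</family>
```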


3. XML SYNTAX RULES

XML Documents Must Have a Root Element

XML documents must contain one root element that is the parent of all other elements:

<root>
  <child>
    <subchild>.........</subchild>
  </child>
</root>

In this example, <root> is the root element.

The XML Prolog

This line is called the XML prolog:

<?xml version="1.0" encoding="UTF-8"?>

  • The XML prolog is optional. If it exists, it must come first in the document.
  • XML documents can contain international characters, like Norwegian øæå or French êèé.
  • To avoid errors, you should specify the encoding used, or save your XML files as UTF-8.
  • UTF-8 is the default character encoding for XML documents.

All XML Elements Must Have a Closing Tag

In XML, it is illegal to omit the closing tag. Every element must either have a closing tag or, if it is empty, be written as a single self-closing tag such as <br />.
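
A short sketch of the closing-tag rule, using made-up content:

```xml
<note>
  <p>This is a paragraph.</p>  <!-- start tag with a matching end tag -->
  <br />                       <!-- empty element, closed in the same tag -->
</note>
```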

XML Tags are Case Sensitive

XML tags are case sensitive. The tag <Letter> is different from the tag <letter>.

Opening and closing tags must be written with the same case:

<p> closing tag</p>
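
For illustration, the first line below would be rejected by an XML parser because the case of the two tags differs, so it is kept inside a comment. The second line is accepted. The element name message is an arbitrary example:

```xml
<!-- Not well-formed: <Message>This is incorrect</message> -->
<message>This is correct</message>
```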

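XML Elements Must Be Properly Nested

In HTML, you might see improperly nested elements:

<b><i>This text is bold and italic</b></i>

In XML, all elements must be properly nested within each other: an element opened inside another element must also be closed inside it.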
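
The properly nested form of the same markup, as XML requires, closes the inner <i> element before the outer <b> element:

```xml
<b><i>This text is bold and italic</i></b>
```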

XML Attribute Values Must Always be Quoted

XML elements can have attributes in name/value pairs just like in HTML.

In XML, the attribute values must always be quoted.
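
A minimal sketch of the quoting rule follows. The note element and its date attribute are illustrative names. The unquoted form is shown inside a comment because a parser would reject it:

```xml
<!-- Not well-formed: <note date=12/11/2007> ... </note> -->
<note date="12/11/2007">
  <to>Tove</to>
  <from>Jani</from>
</note>
```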

4. XML ELEMENTS

What is an XML Element?

An XML element is everything from (including) the element's start tag to (including) the element's end tag.

An element can contain:

  • text
  • attributes
  • other elements
  • or a mix of the above

<bookstore>

  <book category="children">
    <title>Harry Potter</title>
    <author>J. K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="web">
    <title>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>

</bookstore>


XML Naming Rules

XML elements must follow these naming rules:

  • Element names are case-sensitive
  • Element names must start with a letter or underscore
  • Element names cannot start with the letters xml (or XML, or Xml, etc)
  • Element names can contain letters, digits, hyphens, underscores, and periods
  • Element names cannot contain spaces
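
The naming rules above can be sketched with a few examples. All names are invented for illustration. Names that break a rule are listed in the comment, since using them as real tags would make the fragment ill-formed:

```xml
<catalog>
  <item>valid: starts with a letter</item>
  <_internal>valid: starts with an underscore</_internal>
  <item-2.b>valid: digits, hyphens, and periods may follow the first character</item-2.b>
  <!-- Invalid names: "2nd_item" (starts with a digit),
       "first name" (contains a space),
       "-item" (starts with a hyphen) -->
</catalog>
```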

Best Naming Practices

Create descriptive names, like this: <person>, <firstname>, <lastname>.

Create short and simple names, like this: <book_title> not like this: <the_title_of_the_book>.

Avoid "-". If you name something "first-name", some software may think you want to subtract "name" from "first".

Avoid ".". If you name something "first.name", some software may think that "name" is a property of the object "first".

Avoid ":". Colons are reserved for namespaces (more later).

Non-English letters like éòá are perfectly legal in XML, but watch out for problems if your software doesn't support them!

5. XML ATTRIBUTES

XML elements can have attributes, just like HTML.

Attributes are designed to contain data related to a specific element.

XML Attributes Must be Quoted

Attribute values must always be quoted. Either single or double quotes can be used.

For a person's gender, the <person> element can be written like this:

<person gender="female">
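
Either quoting style works, as sketched below. If the value itself contains double quotes, single quotes can delimit it, or the inner quotes can be written as the predefined entity &quot;. The people wrapper element and the nickname attribute are made up for this example:

```xml
<people>
  <person gender="female" />
  <person gender='female' />
  <person nickname='Anna "Ace" Smith' />
  <person nickname="Anna &quot;Ace&quot; Smith" />
</people>
```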

Avoid XML Attributes?

Some things to consider when using attributes are:

  • attributes cannot contain multiple values (elements can)
  • attributes cannot contain tree structures (elements can)
  • attributes are not easily expandable (for future changes)

Avoid markup like the following, where all the data is stored in attributes:

<note day="10" month="01" year="2008"
to="Tove" from="Jani" heading="Reminder"
body="Don't forget me this weekend!">
</note>
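
One possible alternative, sketched here, moves the same information into child elements, which can later be extended or nested further. The date wrapper element is an assumption added for illustration:

```xml
<note>
  <date>
    <day>10</day>
    <month>01</month>
    <year>2008</year>
  </date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
```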


XML Attributes for Metadata

Sometimes ID references are assigned to elements. These IDs can be used to identify XML elements in much the same way as the id attribute in HTML. In the example below, the id attribute identifies each note; it is metadata about the note, not part of the note's content:

<messages>

  <note id="501">
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
  </note>
  <note id="502">
    <to>Jani</to>
    <from>Tove</from>
    <heading>Re: Reminder</heading>
    <body>I will not</body>
  </note>

</messages>


6. XML NAMESPACES

XML Namespaces provide a method to avoid element name conflicts.

Name Conflicts

In XML, element names are defined by the developer. This often results in a conflict when trying to mix XML documents from different XML applications.

This XML carries information about a table (a piece of furniture):

<table>

  <name>African Coffee Table</name>
  <width>80</width>
  <length>120</length>

</table>
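
To sketch how such a clash arises and how namespaces resolve it: two fragments that both use a <table> element, one for an HTML-style table and one for a piece of furniture, cannot be mixed without ambiguity. Giving each vocabulary a prefix bound to a namespace URI through an xmlns attribute keeps the two apart. The prefixes h and f, the root wrapper element, and the furniture URI below are placeholders chosen for illustration. A namespace URI serves only as a unique name and is not fetched by the parser:

```xml
<root xmlns:h="http://www.w3.org/1999/xhtml"
      xmlns:f="https://example.com/furniture">

  <h:table>
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>

  <f:table>
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>

</root>
```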


7. XML DISPLAY

Raw XML files can be viewed in all major browsers.

Don't expect XML files to be displayed as HTML pages.


Most browsers will display an XML document with color-coded elements.

Often a plus (+) or minus sign (-) to the left of the elements can be clicked to expand or collapse the element structure.

To view raw XML source, try to select "View Page Source" or "View Source" from the browser menu.

Other XML Examples

Viewing some XML documents will help you get the XML feeling:

An XML breakfast menu
This is a breakfast food menu from a restaurant, stored as XML.

An XML CD catalog
This is a CD collection, stored as XML.

An XML plant catalog
This is a plant catalog from a plant shop, stored as XML.

Note: In Safari 5 (and earlier), only the element text will be displayed. To view the raw XML, you must right click the page and select "View Source".

Why Does XML Display Like This?

XML documents do not carry information about how to display the data.

Since XML tags are "invented" by the author of the XML document, browsers do not know if a tag like <table> describes an HTML table or a dining table.

Without any information about how to display the data, the browsers can just display the XML document as it is.

8. XML PARSER


All major browsers have a built-in XML parser to access and manipulate XML.

XML Parser

The XML DOM (Document Object Model) defines the properties and methods for accessing and editing XML.

However, before an XML document can be accessed, it must be loaded into an XML DOM object.

All modern browsers have a built-in XML parser that can convert text into an XML DOM object.

Parsing a Text String

This example parses a text string into an XML DOM object, and extracts the info from it with JavaScript:

<!DOCTYPE html>
<html>
<body>

<p id="demo"></p>

<script>
var text, parser, xmlDoc;

// The XML source, held as a plain text string
text = "<bookstore><book>" +
"<title>Everyday Italian</title>" +
"<author>Giada De Laurentiis</author>" +
"<year>2005</year>" +
"</book></bookstore>";

// Parse the string into an XML DOM object
parser = new DOMParser();
xmlDoc = parser.parseFromString(text,"text/xml");

// Show the text of the first title element in the page
document.getElementById("demo").innerHTML =
xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue;
</script>

</body>
</html>


Example Explained

A text string is defined:

text = "<bookstore><book>" +
"<title>Everyday Italian</title>" +
"<author>Giada De Laurentiis</author>" +
"<year>2005</year>" +
"</book></bookstore>";

A parser is created with new DOMParser(), and the parser then creates a new XML DOM object from the text string:

xmlDoc = parser.parseFromString(text,"text/xml");

The XMLHttpRequest Object

The XMLHttpRequest object has a built-in XML parser.

The responseText property returns the response as a string.

The responseXML property returns the response as an XML DOM object. Use it when you want to work with the response as XML rather than as plain text:

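// "cd_catalog.xml" is a placeholder file name, assumed for illustration.
var xmlhttp = new XMLHttpRequest();
xmlhttp.onload = function () {
    // responseXML holds the response already parsed into an XML DOM object
    var xmlDoc = xmlhttp.responseXML;
    var x = xmlDoc.getElementsByTagName("ARTIST");
    var txt = "";
    for (var i = 0; i < x.length; i++) {
        txt += x[i].childNodes[0].nodeValue + "<br>";
    }
    document.getElementById("demo").innerHTML = txt;
};
xmlhttp.open("GET", "cd_catalog.xml");
xmlhttp.send();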