notes:xml_cheat_sheet

Differences

This shows you the differences between two versions of the page.

notes:xml_cheat_sheet [2026/06/07 02:43]
114.119.158.251 old revision restored (2026/06/05 06:39)
notes:xml_cheat_sheet [2026/06/07 07:31] (current)
8.209.74.18 old revision restored (2007/10/05 18:53)
Line 1: Line 1:
 ===== XML Cheat Sheet ===== ===== XML Cheat Sheet =====
  
-I stumble across XML documents intermittently and every time I need to review the basics again. This is a cheat sheet so that I can review it whenever I need to. Information here is a summarized form of the [[http://www.w3schools.com/xml/default.asp | XML Tutorial]]+I stumble across XML documents intermittently and every time I need to review the basics again. This is a cheat sheet so that I can review it whenever I need to. This is a summarized form of the [[http://www.w3schools.com/xml/default.asp | XML Tutorial]]
 + 
 +Also see the following related cheat sheets : 
 +  * [[DTD Cheat Sheet]] 
 +  * [[XML Schema Cheat Sheet]] 
 + 
 +For reference this is the [[http://www.w3.org/TR/REC-xml/  | XML Specification]] and the [[http://www.xml.com/axml/testaxml.htm | version annotated by Tim Bray]].
  
 ==== What is XML ==== ==== What is XML ====
  
-E**X**tensible **M**arkup **L**anguage (XML) is a markup language designed to describe data. It has no predefined tags and uses a Document Type Definition (DTD) or an XML Schema to describe the data. An XML document together with its DTD or XML Schema is self-descriptive.+  * E**X**tensible **M**arkup **L**anguage (XML) is a markup language designed to describe data. It has no predefined tags.
 + 
 +  * XML uses a Document Type Definition (DTD) or an XML Schema to describe the data. An XML document together with its DTD or XML Schema is self-descriptive.
 + 
 +  * XML Schema is the successor to DTD because it is richer and more extensible. 
 + 
 +  * XML uses text files to store data and can be used to create new languages e.g. WAP, WML, XHTML, RSS, SOAP etc. 
 + 
 +  * Because XML documents may contain Unicode characters, they should be saved as Unicode text files. The encoding named in the XML declaration should match the encoding the text file is actually saved in. 
 + 
 +  * XML files are completely platform-independent and portable (EBCDIC platforms ?).
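
As a rough illustration of the encoding point above, here is a small Python sketch. Python and its standard xml.etree.ElementTree module are a choice made for these examples, not something the original notes use, and the sample document is made up. It parses the same text saved in two different encodings while the declaration stays the same.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the declaration claims ISO-8859-1.
DOCUMENT = '<?xml version="1.0" encoding="ISO-8859-1"?><note>café</note>'

# Saved in the declared encoding: the parser recovers the original text.
matching_bytes = DOCUMENT.encode("iso-8859-1")
matching_text = ET.fromstring(matching_bytes).text

# Saved as UTF-8 while still declaring ISO-8859-1: each UTF-8 byte is
# read as a separate Latin-1 character, so the text comes out garbled.
mismatched_bytes = DOCUMENT.encode("utf-8")
mismatched_text = ET.fromstring(mismatched_bytes).text

print(repr(matching_text))
print(repr(mismatched_text))
```

The parser does not complain in the second case because the bytes are still valid ISO-8859-1; the damage is silent, which is why the declaration and the saved encoding should agree.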
  
-XML uses text files to store data and can be used to create new languages e.g. WAP, WML, XHTML etc. 
  
 ==== XML Syntax ==== ==== XML Syntax ====
  
-A simple XML document :+  * A simple XML document :
  
 <code> <code>
 <?xml version="1.0" encoding="ISO-8859-1"?> <?xml version="1.0" encoding="ISO-8859-1"?>
 +<!DOCTYPE note SYSTEM "InternalNote.dtd">
 <note date="12/11/2002"> <note date="12/11/2002">
-<to>Tove</to> +  <to>Alice</to> 
-<from>Jani</from>+  <from>Bob</from>
 +  <par>Hi.</par> 
 +  <par>Bye.</par>
 </note> </note>
 </code> </code>
Line 23: Line 41:
   * The first line is an XML declaration which defines the XML version and the character encoding used in the document.   * The first line is an XML declaration which defines the XML version and the character encoding used in the document.
  
-  * XML tags are case-sensitive and must have a corresponding closing tag. Tags must be properly nested.+  * XML tags are case-sensitive and must have a corresponding closing tag. Empty elements can combine the start and closing tag e.g. <br />. 
 + 
 +  * XML tags must be properly nested.
  
   * An XML document must have a root element (note in the above). All elements may have child elements.   * An XML document must have a root element (note in the above). All elements may have child elements.
Line 32: Line 52:
  
   * <!-- This is a comment -->   * <!-- This is a comment -->
 +
 +==== XML Validation ====
 +
 +  * An XML document which is syntactically correct is described as well-formed.
 +
 +  * An XML parser or application must not try to interpret an XML document that is not well-formed. It must fail if the syntax is incorrect.
 +
 +  * An XML document which is well-formed and conforms to the rules of a DTD or XML Schema is described as valid.
 +
 +  * A DTD or XML Schema defines the document structure with a list of legal elements and attributes.
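
The well-formedness rules can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree as an assumed tool; the helper name is_well_formed is invented for the example. It only tests well-formedness. Checking validity against a DTD or XML Schema needs a validating parser, which is not shown here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser refuses to interpret documents with syntax errors.
        return False


good = "<note><to>Alice</to><from>Bob</from></note>"
bad_nesting = "<note><to>Alice</note></to>"    # tags overlap
bad_case = "<Note><to>Alice</to></note>"       # tag names are case-sensitive
bad_unclosed = "<note><to>Alice</to>"          # root element never closed

for sample in (good, bad_nesting, bad_case, bad_unclosed):
    print(is_well_formed(sample), sample)
```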
 +
  
 ==== XML Elements ==== ==== XML Elements ====
  
-XML Elements can have attributes which must be either single or double quoted e.g. date in the above.+  * An XML element is everything from the start tag to the end tag. 
 + 
 +  * Elements can have attributes (in their start tag) which must be either single or double quoted e.g. date in the above.
 + 
 +  * Elements can have either 
 +    * Empty content 
 +    * Simple content (text only) 
 +    * Element content (child elements) 
 +    * Mixed content (child elements and text) 
 + 
 +  * Elements can be parents, children or siblings of other elements. 
 + 
 +  * Elements must be closed properly and be properly nested. 
 + 
 +  * XML element names can contain letters, digits, hyphens, underscores and periods but no spaces. They must start with a letter or underscore and can't start with xml (in any case). Names should not include . : or - . 
 + 
 +  * < and & are illegal in XML element content. Avoiding ' " and > is recommended. These should be replaced by the predefined entities i.e. &lt; &gt; &apos; &quot; &amp; 
 + 
 +  * A CDATA section starts with "<![CDATA[" and ends with "]]>". 
 + 
 +  * Everything inside a CDATA section is treated as plain text and not parsed as markup. The only sequence it cannot contain is ]]>. 
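
A short sketch of the two ways to carry special characters in element content, assuming Python's standard xml.sax.saxutils.escape and xml.etree.ElementTree (the par element and the sample text are made up). Both routes should give back the same text after parsing.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_text = 'if a < b & c > "d" then it\'s fine'

# escape() handles & < > by default; quotes are added via the extra mapping.
escaped = escape(raw_text, {'"': "&quot;", "'": "&apos;"})

# Route 1: entity references in ordinary element content.
from_entities = ET.fromstring("<par>" + escaped + "</par>").text

# Route 2: the raw text wrapped in a CDATA section (it must not contain "]]>").
from_cdata = ET.fromstring("<par><![CDATA[" + raw_text + "]]></par>").text

print(escaped)
print(from_entities == raw_text, from_cdata == raw_text)
```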
 + 
 +==== XML Attributes ==== 
 + 
 +  * XML elements can have attributes in their start tag. 
 + 
 +  * A singly quoted attribute value cannot contain single-quotes. A doubly quoted attribute value cannot contain double-quotes. 
 + 
 +  * Although data can be stored either in child elements or attributes, attributes should really be used for metadata i.e. data about the data which is not part of the data itself. For example an element id is best stored in an attribute. 
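
The quoting rules for attributes can be tried out with a parser. This is a minimal sketch assuming Python's standard xml.etree.ElementTree; the id and title attributes and the parses helper are invented for the example.

```python
import xml.etree.ElementTree as ET

# Single and double quotes can be mixed freely between attributes, and a
# double-quoted value may contain a single quote.
note = ET.fromstring("""<note id='n1' date="12/11/2002" title="Bob's note"/>""")
attributes = note.attrib


def parses(text: str) -> bool:
    """Return True if the text is accepted by the parser."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Unquoted values are rejected.
unquoted_ok = parses("<note date=12/11/2002/>")
# A single-quoted value cannot itself contain a single quote.
nested_quote_ok = parses("<note title='Bob's note'/>")

print(attributes)
print(unquoted_ok, nested_quote_ok)
```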
 + 
 + 
 +==== XML Namespaces ==== 
 + 
 +  * XML Namespaces are covered in a [[http://www.w3.org/TR/REC-xml-names/ | separate XML recommendation]]. 
 + 
 +  * XML namespaces stop element names from different XML documents from conflicting when they share a name but mean something different. 
 + 
 +  * A simple example of using a namespace 
 +<code> 
 +<h:table xmlns:h="http://www.w3.org/TR/html4/"> 
 +   <h:tr> 
 +     <h:td>Apples</h:td> 
 +     <h:td>Bananas</h:td> 
 +   </h:tr> 
 +</h:table> 
 +</code> 
 + 
 +  * The namespace URI given in xmlns is not looked up by the parser. It only serves as a unique name, but it often points to an informational web page. 
 + 
 +  * If the form xmlns="namespaceURI" is used instead then the element and all its unprefixed child elements are automatically in that default namespace. 
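
To see that the prefixed form and the default-namespace form name the same elements, the sketch below parses both with Python's standard xml.etree.ElementTree (an assumed tool, as is the HTML_NS constant name). ElementTree reports each tag as {namespaceURI}localname, so the prefix itself disappears.

```python
import xml.etree.ElementTree as ET

HTML_NS = "http://www.w3.org/TR/html4/"

# Prefixed form, as in the example above.
prefixed = (
    '<h:table xmlns:h="http://www.w3.org/TR/html4/">'
    "<h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr>"
    "</h:table>"
)

# Default-namespace form: no prefixes, children inherit the namespace.
default = (
    '<table xmlns="http://www.w3.org/TR/html4/">'
    "<tr><td>Apples</td><td>Bananas</td></tr>"
    "</table>"
)

root_prefixed = ET.fromstring(prefixed)
root_default = ET.fromstring(default)

# Both roots carry the same namespace-qualified tag.
print(root_prefixed.tag)
print(root_default.tag)

# Child elements are found by their qualified name in either document.
qualified_td = "{" + HTML_NS + "}td"
cells_prefixed = [td.text for td in root_prefixed.iter(qualified_td)]
cells_default = [td.text for td in root_default.iter(qualified_td)]
print(cells_prefixed, cells_default)
```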
 + 
 +==== XML Stylesheets ==== 
 + 
 +  * **C**ascading **S**tyle **S**heets (CSS) are used to display XML by associating styles with element types. 
 + 
 +  * An XML document is associated with a stylesheet using 
 +<code> 
 +<?xml-stylesheet type="text/css" href="simple.css"?> 
 +</code> 
 + 
 +  * CSS itself is not deprecated, but formatting XML with CSS is not the usual approach. XSL is generally preferred for XML. 
 + 
 +  * e**X**tensible **S**tylesheet **L**anguage (XSL) is the preferred formatting language for XML. For XML it is a more sophisticated and powerful alternative to CSS. 
 + 
 +  * An XSL stylesheet can be associated with an XML document using 
 +<code> 
 +<?xml-stylesheet type="text/xsl" href="simple.xsl"?> 
 +</code> 
 + 
 +==== Assorted ==== 
 + 
 +  *  An XML data island is XML data embedded into an HTML page.
  
-XML elements can have either +  * There are also XML parsers and loaders in all modern web browsers.
-  * Empty content +
-  * Simple content(text ony) +
-  * Element content (child elements) +
-  * Mixed content (child elements and text)+
  
-XML element names can contain any character except for a space but must start with a letter and can't start with xml (in any case).+  *  Web browsers can manipulate the XML document using the Document Object Model (DOM) which treats the XML document as a tree data object. The syntax varies slightly from browser to browser.
  
 +  * A **U**niform **R**esource **I**dentifier (URI) is a string of characters which identifies an Internet resource. The most common URI is the **U**niform **R**esource **L**ocator (URL) which identifies an Internet domain address. Another, not so common type of URI is the **U**niform **R**esource **N**ame (URN). 
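
The DOM's tree view of a document, mentioned above for web browsers, can also be tried outside a browser. This sketch assumes Python's standard xml.dom.minidom, which follows the same W3C DOM model, and reuses the note document from the syntax section without its DOCTYPE line.

```python
from xml.dom.minidom import parseString

doc = parseString(
    '<note date="12/11/2002">'
    "<to>Alice</to><from>Bob</from><par>Hi.</par><par>Bye.</par>"
    "</note>"
)

# The root element is the top of the tree.
root = doc.documentElement

# Child elements in document order.
child_names = [
    node.tagName for node in root.childNodes
    if node.nodeType == node.ELEMENT_NODE
]

# Text lives in text nodes below each element.
paragraphs = [par.firstChild.data for par in root.getElementsByTagName("par")]

# Attributes are read from the element that carries them.
date = root.getAttribute("date")

print(root.tagName, child_names, paragraphs, date)
```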
  
notes/xml_cheat_sheet.txt · Last modified: 2026/06/07 07:31 by 8.209.74.18