XML Schema

  • XML Schema is an XML-based alternative to DTDs. It is the successor to DTDs because it is richer and more extensible. It describes the structure of an XML document.
  • The XML Schema language is also referred to as XML Schema Definition (XSD).
  • An XML Schema defines which elements can appear in an XML document, their order and relationships, and how many of each may occur. It also defines data types and fixed and default values for elements and attributes.
  • A simple XML Schema :
    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://a.com" xmlns="http://a.com"
    elementFormDefault="qualified">
    
      <xs:element name="note">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="to" type="xs:string"/>
            <xs:element name="from" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    
    </xs:schema> 
  • A reference to this schema in a note XML document would look like
    <note xmlns="http://a.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://a.com note.xsd"> 
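  • As a rough sketch, a complete instance document for the schema above might look like the following. It assumes the schema is saved as note.xsd next to the document; the values Tove and Jani are illustrative only. The default namespace has to match the schema's targetNamespace, and because elementFormDefault is "qualified" the child elements are in that namespace too.
    <?xml version="1.0"?>
    <!-- default namespace matches targetNamespace="http://a.com" -->
    <note xmlns="http://a.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://a.com note.xsd">
      <!-- children appear in the order given by xs:sequence -->
      <to>Tove</to>
      <from>Jani</from>
    </note>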

Simple Types

  • A simple element contains only text, no other elements or attributes. But the text may be any of the XSD types or a custom type and may have restrictions on it. e.g.
     <xs:element name="start_date" type="xs:date"/>
  • Simple XSD types include xs:string, xs:decimal, xs:integer, xs:boolean, xs:date, xs:time.
  • An attribute is always a simple type (even though a simple element can't have any attributes) e.g.
    <xs:attribute name="start_date" type="xs:date"/>
  • A simple element or an attribute can also have a default or a fixed value, e.g. default="red" or fixed="red". The two cannot be combined on the same declaration.
  • By default an attribute is optional. To make it required, add use="required".
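  • The three options can be sketched as follows. The attribute names colour, shade and lang are hypothetical and chosen only for illustration.
    <!-- optional; a validator supplies "red" when the attribute is absent -->
    <xs:attribute name="colour" type="xs:string" default="red"/>
    <!-- optional; if present, the value must be exactly "red" -->
    <xs:attribute name="shade" type="xs:string" fixed="red"/>
    <!-- must always be present -->
    <xs:attribute name="lang" type="xs:string" use="required"/>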

Facets / Restrictions

  • XML Facets are restrictions on the acceptable values for elements or attributes.
  • A simple example which can be used to restrict age to be between 0 and 120 inclusive. This example inlines the type :
    <xs:element name="age">
      <xs:simpleType>
        <xs:restriction base="xs:integer">
          <xs:minInclusive value="0"/>
          <xs:maxInclusive value="120"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element> 
  • Another example to restrict the model of cars to be one from a list. This example creates the type with a name so it can be used by multiple elements.
    <xs:element name="car" type="carType"/>
    
    <xs:simpleType name="carType">
      <xs:restriction base="xs:string">
        <xs:enumeration value="Audi"/>
        <xs:enumeration value="Golf"/>
        <xs:enumeration value="BMW"/>
      </xs:restriction>
    </xs:simpleType>
  • Restriction of a string to a specific regexp pattern (one or more lowercase letters):
      <xs:restriction base="xs:string">
        <xs:pattern value="([a-z])+"/>
      </xs:restriction>
  • Whitespace restrictions:
      <xs:restriction base="xs:string">
        <xs:whiteSpace value="preserve"/>
      </xs:restriction>
    • preserve means to leave whitespace alone
    • replace means to replace each tab, line feed and carriage return with a space
    • collapse means to do the same as replace, then collapse runs of spaces to a single space and remove leading and trailing spaces
  • A list of all possible restrictions:
    Restriction      Description
    enumeration      Defines a list of acceptable values
    fractionDigits   Specifies the maximum number of decimal places allowed. Must be equal to or greater than zero
    length           Specifies the exact number of characters or list items allowed. Must be equal to or greater than zero
    maxExclusive     Specifies the upper bound for numeric values (the value must be less than this value)
    maxInclusive     Specifies the upper bound for numeric values (the value must be less than or equal to this value)
    maxLength        Specifies the maximum number of characters or list items allowed. Must be equal to or greater than zero
    minExclusive     Specifies the lower bound for numeric values (the value must be greater than this value)
    minInclusive     Specifies the lower bound for numeric values (the value must be greater than or equal to this value)
    minLength        Specifies the minimum number of characters or list items allowed. Must be equal to or greater than zero
    pattern          Defines the regular expression that acceptable values must match
    totalDigits      Specifies the maximum number of digits allowed. Must be greater than zero
    whiteSpace       Specifies how white space (line feeds, tabs, spaces, and carriage returns) is handled
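  • Several facets can be combined in one restriction. A minimal sketch, using the hypothetical type names priceType and passwordType:
    <!-- non-negative decimal, at most 6 digits in total and 2 after the point -->
    <xs:simpleType name="priceType">
      <xs:restriction base="xs:decimal">
        <xs:totalDigits value="6"/>
        <xs:fractionDigits value="2"/>
        <xs:minInclusive value="0"/>
      </xs:restriction>
    </xs:simpleType>

    <!-- string of 5 to 8 characters -->
    <xs:simpleType name="passwordType">
      <xs:restriction base="xs:string">
        <xs:minLength value="5"/>
        <xs:maxLength value="8"/>
      </xs:restriction>
    </xs:simpleType>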

Complex Elements

  • Complex elements are elements that are not simple. There are 4 types :
    • empty elements
    • elements that contain only other elements
    • elements that contain only text
    • elements that contain both other elements and text
  • An example of a named complex type that contains only other elements :
    <xs:element name="employee" type="personinfo"/>
    <xs:element name="student" type="personinfo"/>
    
    <xs:complexType name="personinfo">
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>

    Complex types can also be inlined as with simple types.

  • Complex types can also extend other complex types, e.g.
    <xs:complexType name="fullpersoninfo">
      <xs:complexContent>
        <xs:extension base="personinfo">
          <xs:sequence>
            <xs:element name="address" type="xs:string"/>
            <xs:element name="city" type="xs:string"/>
            <xs:element name="country" type="xs:string"/>
          </xs:sequence>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType> 
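  • In an instance, the elements of the base type come first, followed by the elements added by the extension. A sketch, assuming a hypothetical element named contact is declared with the extended type (all values are illustrative):
    <xs:element name="contact" type="fullpersoninfo"/>

    <contact>
      <!-- inherited from personinfo -->
      <firstname>John</firstname>
      <lastname>Smith</lastname>
      <!-- added by the extension -->
      <address>1 Main Street</address>
      <city>Oslo</city>
      <country>Norway</country>
    </contact>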
  • Example of an empty element (attributes only) :
    <xs:complexType name="prodtype">
      <xs:attribute name="prodid" type="xs:positiveInteger"/>
    </xs:complexType>
  • Complex text-only elements contain only simple content (text and attributes), so we add a simpleContent element around the content. When using simple content, you must define an extension OR a restriction within the simpleContent element. e.g.
    <xs:complexType name="shoetype">
      <xs:simpleContent>
        <xs:extension base="xs:integer">
          <xs:attribute name="country" type="xs:string" />
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
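  • As an illustration, an element declared with this type carries an integer as its text and an optional country attribute. The element name shoesize and the values are assumptions made for the example.
    <xs:element name="shoesize" type="shoetype"/>

    <!-- text content must be an xs:integer; country is a plain string -->
    <shoesize country="france">35</shoesize>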
  • In a mixed-content complex-type, character data can appear between the child-elements. This is done by setting mixed to true e.g.
    <xs:complexType name="lettertype" mixed="true">
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="orderid" type="xs:positiveInteger"/>
      </xs:sequence>
    </xs:complexType>
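  • A sketch of what mixed content looks like in an instance, assuming a hypothetical element named letter is declared with this type. The child elements must still follow the sequence; only the text between them is free.
    <xs:element name="letter" type="lettertype"/>

    <letter>
      Dear Mr. <name>John Smith</name>.
      Your order <orderid>1032</orderid> will be shipped soon.
    </letter>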

Element Indicators

  • Indicators control how elements are to be used within a complex element. There are three types of indicators: order indicators, occurrence indicators and group indicators.
  • Order Indicators include :
    • All - child elements can appear in any order and each can occur at most once (exactly once unless minOccurs="0" is set)
    • Choice - exactly one of the listed elements can occur
    • Sequence - elements must occur in order
  • Occurrence Indicators include :
    • maxOccurs - maximum number of times an element can occur, or "unbounded" for no limit (default is 1)
    • minOccurs - minimum number of times an element can occur (default is 1)
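  • Order and occurrence indicators are usually combined. A minimal sketch with hypothetical element names:
    <xs:element name="person">
      <xs:complexType>
        <xs:sequence>
          <!-- exactly once (minOccurs and maxOccurs both default to 1) -->
          <xs:element name="full_name" type="xs:string"/>
          <!-- zero to ten times -->
          <xs:element name="child_name" type="xs:string" minOccurs="0" maxOccurs="10"/>
          <!-- exactly one of employee or member -->
          <xs:choice>
            <xs:element name="employee" type="xs:string"/>
            <xs:element name="member" type="xs:string"/>
          </xs:choice>
        </xs:sequence>
      </xs:complexType>
    </xs:element>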
  • There are two types of groups: element groups and attribute groups. Groups define related sets of elements or attributes. Once a group is created it can be referenced elsewhere.
  • Example of an element group. Note that an order indicator must appear within a group element.
    <xs:group name="persongroup">
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
      </xs:sequence>
    </xs:group>

    A reference to it within a sequence would look like

    <xs:group ref="persongroup"/>
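  • In context, the group reference sits alongside ordinary element declarations. A sketch, using the hypothetical type name personwithcountry:
    <xs:complexType name="personwithcountry">
      <xs:sequence>
        <!-- expands to firstname followed by lastname -->
        <xs:group ref="persongroup"/>
        <xs:element name="country" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>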

  • An attribute group is similar, for example:
    <xs:attributeGroup name="personattrgroup">
      <xs:attribute name="firstname" type="xs:string"/>
      <xs:attribute name="lastname" type="xs:string"/>
    </xs:attributeGroup>

    A reference to the attribute group in a complex type would look like

    <xs:attributeGroup ref="personattrgroup"/>
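  • Within a complex type, attribute declarations and attribute group references come after the content model. A sketch, assuming a hypothetical element named person with one child element:
    <xs:element name="person">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="country" type="xs:string"/>
        </xs:sequence>
        <!-- attributes follow the sequence -->
        <xs:attributeGroup ref="personattrgroup"/>
      </xs:complexType>
    </xs:element>

    <person firstname="John" lastname="Smith">
      <country>Norway</country>
    </person>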
  • Using <xs:any /> as an element allows elements not specified in the schema to occur.
  • Using <xs:anyAttribute/> in a complex type allows use of attributes not specified by the schema.
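  • A sketch of both wildcards in one hypothetical person type. processContents="lax" asks the validator to check the extra content only where it can find a declaration for it.
    <xs:element name="person">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="firstname" type="xs:string"/>
          <xs:element name="lastname" type="xs:string"/>
          <!-- any number of additional elements after lastname -->
          <xs:any minOccurs="0" maxOccurs="unbounded" processContents="lax"/>
        </xs:sequence>
        <!-- any additional attributes -->
        <xs:anyAttribute processContents="lax"/>
      </xs:complexType>
    </xs:element>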
  • A substitution group lets other elements stand in for a head element. The head element and its substitutes must be global elements (direct children of the schema element) e.g.
    <xs:element name="name" type="xs:string"/>
    <xs:element name="navn" substitutionGroup="name"/>

    Substitution can be blocked with

    <xs:element name="name" type="xs:string" block="substitution"/>
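  • Substitution takes effect wherever the head element is referenced with ref. A sketch, assuming the unblocked declarations of name and navn and a hypothetical customer element:
    <xs:element name="customer">
      <xs:complexType>
        <xs:sequence>
          <!-- a reference to the head element; navn may appear in its place -->
          <xs:element ref="name"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

    <!-- both instances are valid when substitution is not blocked -->
    <customer><name>John Smith</name></customer>
    <customer><navn>John Smith</navn></customer>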

Data Types

String Data Types

  • Apart from xs:string there are two other string types:
    • xs:normalizedString - contains no CR, LF or TAB characters (the processor replaces them with spaces)
    • xs:token - contains no CR, LF or TAB characters, no leading or trailing spaces, and no sequences of more than one space
  • There are many other types derived from string e.g. NMTOKEN, NCName, ID, IDREF. The following restrictions can be used with string types
    • enumeration
    • length
    • maxLength
    • minLength
    • pattern (NMTOKENS, IDREFS, and ENTITIES cannot use this constraint)
    • whiteSpace

Date Data Types

  • The date data type specifies a date in the format “YYYY-MM-DD” where all components are required.
  • A time data type must be specified in the format “hh:mm:ss” where all components are required.
  • A dateTime data type must be specified in the format “YYYY-MM-DDThh:mm:ss” e.g. 2002-05-30T09:00:00
  • A timezone can be added to a date/time/dateTime by appending a Z (for UTC) or a signed offset e.g. 2002-09-24Z , 2002-09-24+06:00
  • A duration data type must be specified in the format “[-]PnYnMnDTnHnMnS”. P is required, T is required if any time component is used, and the other parts are optional e.g. P5Y, P5Y2M10DT15H, -P1Y
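  • A short sketch of declarations using these types, with matching instance values. The element names start and period are hypothetical.
    <xs:element name="start" type="xs:dateTime"/>
    <xs:element name="period" type="xs:duration"/>

    <!-- a dateTime in UTC -->
    <start>2002-05-30T09:00:00Z</start>
    <!-- 1 year, 2 months, 10 days and 15 hours -->
    <period>P1Y2M10DT15H</period>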
  • This is a list of all date types:
    Type         Description
    date         Defines a date value (“YYYY-MM-DD”)
    dateTime     Defines a date and time value (“YYYY-MM-DDThh:mm:ss”)
    duration     Defines a time interval ([-]PnYnMnDTnHnMnS)
    gDay         Defines a part of a date - the day (---DD)
    gMonth       Defines a part of a date - the month (--MM)
    gMonthDay    Defines a part of a date - the month and day (--MM-DD)
    gYear        Defines a part of a date - the year (YYYY)
    gYearMonth   Defines a part of a date - the year and month (YYYY-MM)
    time         Defines a time value (“hh:mm:ss”)
  • The following restrictions can be used with Date data types:
    • enumeration
    • maxExclusive
    • maxInclusive
    • minExclusive
    • minInclusive
    • pattern
    • whiteSpace
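  • For example, a date can be limited to a range with the inclusive bounds. A sketch, using the hypothetical type name date2002:
    <!-- any date in the year 2002 -->
    <xs:simpleType name="date2002">
      <xs:restriction base="xs:date">
        <xs:minInclusive value="2002-01-01"/>
        <xs:maxInclusive value="2002-12-31"/>
      </xs:restriction>
    </xs:simpleType>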

Numeric Data Types

Other Data Types

notes/xml_schema_cheat_sheet.txt · Last modified: 2026/06/07 10:00 by 43.110.34.223