Infoloom
Semantic Integration Technologies
How to read the XTM Document Type Definition?
What is an XTM document?

An XTM document is an XML document that describes a topic map and its properties.

What is a Document Type Definition (DTD)?

The XTM DTD specifies which components (elements) are allowed within an XTM document, what they are made of (content model), how they may be nested (containment), and their internal properties (attributes).

The XTM DTD declares 19 element types: topicMap, topic, instanceOf, subjectIdentity, topicRef, subjectIndicatorRef, baseName, baseNameString, variant, variantName, parameters, occurrence, resourceRef, resourceData, association, member, roleSpec, scope and mergeMap.

Elements

Elements are the building blocks of an XML document. The types to which they belong are declared in the DTD as "element declarations", and they appear in documents as tags, written between angle brackets. For example, here is an element declaration:

    <!ELEMENT baseNameString (#PCDATA)>

The keyword #PCDATA ("Parsed Character Data") is the conventional way to indicate that the content of this element is a string of characters. Thus an XTM document may contain tags such as:

    <baseNameString>New York</baseNameString>

Content Model

The content model of an element declaration describes what the element is made of. In the XTM DTD, there are three possible cases:

1. The element contains character data.
2. The element is empty.
3. The element contains other elements.
Cases 1 and 2 are indicated by reserved keywords. In case 1, (#PCDATA), which must always be typed in upper case and must occur between parentheses, indicates that the element contains characters (possibly none). In case 2, EMPTY, which must always be typed in upper case, indicates that the element does not contain anything; the useful information in such an element must be provided by its attribute values (see below). Case 3 is expressed with a grammar for containment rules (see below).

Containment rules

The group of element types contained in an element appears within parentheses. Groups can be nested within a group. If an element of a given type is allowed to contain more than one element type, these types form a list whose members are separated by "sequence indicators". There are two kinds of sequence indicators used in the XTM DTD:

- the comma (,), which means that the elements must appear in the order listed;
- the vertical bar (|), which means that exactly one of the alternatives listed appears.
For example, the following element type declaration:

    <!ELEMENT instanceOf (topicRef | subjectIndicatorRef)>

indicates that an instanceOf element can contain either a topicRef element or a subjectIndicatorRef element.

The DTD also provides ways to indicate that an element or group of elements can occur exactly once, one or more times, optionally, or any number of times including zero. The notation used to describe these properties is as follows:

- no indicator: exactly once;
- question mark (?): optional (zero or one time);
- plus sign (+): one or more times;
- star (*): any number of times, including zero.
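As a rough illustration of how sequence and occurrence indicators combine, a content model can be translated into a regular expression over the names of an element's children. The sketch below is written in Python and is not part of the XTM specification: the function names content_model_to_regex and children_match are hypothetical, and it handles only element content (case 3), not (#PCDATA) or EMPTY. The two content models used are those of instanceOf and baseName in the XTM DTD.

```python
import re
from typing import List


def content_model_to_regex(model: str) -> "re.Pattern":
    """Translate a DTD element-content model into a regex over child names.

    Each child element is encoded as its name followed by one space, so the
    child sequence [scope, baseNameString] becomes "scope baseNameString ".
    """
    # Remove white space; names are then delimited only by ( ) , | ? * +
    compact = re.sub(r"\s+", "", model)
    # Wrap every element name in a non-capturing group ending with a space,
    # so that ?, * and + apply to the whole name and not to its last letter.
    pattern = re.sub(
        r"[A-Za-z_][\w.\-]*",
        lambda m: "(?:" + re.escape(m.group(0)) + " )",
        compact,
    )
    # The comma means "followed by", which in a regex is plain concatenation.
    # Parentheses, |, ?, * and + carry the same meaning in both notations.
    return re.compile(pattern.replace(",", ""))


def children_match(model: str, children: List[str]) -> bool:
    """Return True if the sequence of child element names satisfies the model."""
    encoded = "".join(name + " " for name in children)
    return content_model_to_regex(model).fullmatch(encoded) is not None


INSTANCE_OF = "(topicRef | subjectIndicatorRef)"
BASE_NAME = "(scope?, baseNameString, variant*)"

print(children_match(INSTANCE_OF, ["topicRef"]))                         # True
print(children_match(INSTANCE_OF, ["topicRef", "subjectIndicatorRef"]))  # False
print(children_match(BASE_NAME, ["baseNameString", "variant", "variant"]))  # True
print(children_match(BASE_NAME, ["variant", "baseNameString"]))          # False
```

A validating XML parser performs this kind of check for every element in a document; the regular expression is only a convenient way to see what each indicator allows and forbids.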
For example, the following element declaration:

    <!ELEMENT baseName
       (scope?, baseNameString, variant*)>

indicates that a baseName element may contain a scope element (optional because of the question mark), must contain exactly one baseNameString element (because no indicator follows it), and may contain any number of variant elements, including none (because of the star).

In the following example, the content model is expressed as an "OR" group to which an occurrence indicator is applied:

    <!ELEMENT scope
       (topicRef |
        resourceRef |
        subjectIndicatorRef)+
    >

This element declaration for scope reads: a scope element contains one or more elements, each of which can be of any of the three types listed (topicRef, resourceRef and subjectIndicatorRef), repeated and in any order.

Attributes

Attributes are properties that serve to further differentiate elements. Attributes are declared within an attribute list introduced by the expression <!ATTLIST. Each attribute list refers to a given element and, in the XTM DTD, is declared immediately after the corresponding element declaration (this is not a requirement, and some DTDs declare their attribute lists in a different location).

An ATTLIST declaration is made of declarations of individual attributes. Each individual attribute declaration consists of three fields: the attribute name, a keyword which expresses the attribute type, and a default declaration.

The attribute names used in the XTM DTD are: xmlns, xmlns:xlink, xml:base, id, xlink:type, and xlink:href.

Three attribute types are used in the XTM DTD:

- ID: a name that identifies the element uniquely within the document;
- CDATA: a string of characters;
- NMTOKEN: a single name token, that is, a word made of name characters with no white space.
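The three-field structure of an attribute declaration can be made visible by splitting an ATTLIST declaration into (name, type, default) triples. The following Python sketch is illustrative only: the ATTLIST text is written in the style of the XTM DTD, using attribute names and types listed above, and should be checked against the DTD itself; parse_attlist and AttributeDecl are hypothetical names, and the pattern recognizes only the three attribute types and the default keywords (#REQUIRED, #IMPLIED, #FIXED) that this guide discusses.

```python
import re
from typing import List, NamedTuple, Tuple


class AttributeDecl(NamedTuple):
    name: str       # attribute name, e.g. xlink:href
    attr_type: str  # ID, CDATA or NMTOKEN
    default: str    # default declaration, e.g. #REQUIRED


# Illustrative declaration in the style of the XTM DTD (an assumption, not a quotation).
ATTLIST = """<!ATTLIST topicRef
   id          ID       #IMPLIED
   xlink:type  NMTOKEN  #FIXED 'simple'
   xlink:href  CDATA    #REQUIRED
>"""

# One attribute declaration: name, type keyword, default declaration.
FIELD = re.compile(
    r"""(\S+)\s+                     # field 1: attribute name
        (ID|CDATA|NMTOKEN)\s+        # field 2: attribute type
        (\#REQUIRED|\#IMPLIED|\#FIXED\s+(?:'[^']*'|"[^"]*"))  # field 3: default
    """,
    re.VERBOSE,
)


def parse_attlist(declaration: str) -> Tuple[str, List[AttributeDecl]]:
    """Return the element name and the attribute declarations of one ATTLIST."""
    match = re.fullmatch(r"\s*<!ATTLIST\s+(\S+)\s+(.*?)\s*>\s*", declaration, re.DOTALL)
    if match is None:
        raise ValueError("not an ATTLIST declaration")
    element, body = match.group(1), match.group(2)
    return element, [AttributeDecl(*m.groups()) for m in FIELD.finditer(body)]


element, attributes = parse_attlist(ATTLIST)
print(element)
for attribute in attributes:
    print(attribute.name, attribute.attr_type, attribute.default)
```

Reading the output line by line gives, for each attribute, the same three fields that appear in the DTD: the name, the type keyword and the default declaration.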
The attribute default declaration indicates whether the attribute is required or optional, and whether its value is fixed. The keyword #REQUIRED indicates that an attribute value must be provided in the document. The keyword #FIXED followed by a value indicates that this attribute value is predefined for all documents and may not be changed in a document. The keyword #IMPLIED means that the attribute is optional: a value may be provided in a document, but the application knows how to handle the case where none is provided.

XML Features
© 2005, Michel Biezunski