
Bruce Sharpe
 Chief Technology Officer
SoftQuad Software Inc.
 108-10070 King George Highway Surrey British Columbia  Canada (V3T W4)
Email: bsharpe@softquad.com Web site: http://www.softquad.com
 Biography
Bruce Sharpe, Chief Technology Officer of SoftQuad Software Inc., holds a Ph.D. in the area of mathematical physics (quantum field theory). He is grateful that the advent of the IBM PC in the early 1980's provided an opportunity to leave this absurdly difficult field. He immediately embarked on a series of leadership roles in software development that has been going on for the last 16 years. His areas of interest have included text processing, database programming, algorithms for digital image processing, music applications, the Internet and document authoring. He has been at SoftQuad since 1996, where he has led the teams that created the most recent versions of HoTMetaL, the HoTMetaL Application Server and XMetaL.
 

XML For Web Site Production

 XML has a number of roles to play on the Web. In this paper, we focus on the use of XML to streamline Web-based content delivery.
 Many Web site creators are realizing that, as challenging as the initial creation of the Web site might be, it is even more difficult to maintain the site and keep it current as content changes. An increasingly common approach for managing this situation is to separate the architecture of the site (how the information is organized and how users navigate through it) from the presentation (what kind of graphics and layout are used) and the content.
XML is an ideal technology for this kind of Web site production. The general idea is to write the content in XML and store it in some form of content management system. To actually create the HTML, templates are used. The site is generated by pouring the XML content into these templates, which have "hooks" in them indicating where the various bits of XML should go. The result is typically ordinary HTML which can then be delivered to standard browsers. The site generation can be done off-line or on demand, the latter creating personalized Web pages in response to user requests.
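The template mechanism described above can be illustrated with a minimal sketch. It is not the method of any particular content management system: the element names (`article`, `title`, `author`, `body`), the `$name` hook syntax and the function `render_page` are all assumptions made for the example, and only the Python standard library is used.

```python
import html
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical XML content, as an author might produce it.
ARTICLE_XML = """<article>
  <title>Maintaining a Site</title>
  <author>A. Writer</author>
  <body>Content changes; the template does not.</body>
</article>"""

# Hypothetical HTML template; $title, $author and $body are the "hooks".
PAGE_TEMPLATE = Template("""<html>
<head><title>$title</title></head>
<body>
<h1>$title</h1>
<p class="byline">$author</p>
<p>$body</p>
</body>
</html>""")


def render_page(xml_text, template):
    """Pour the text of each top-level XML element into the matching hook."""
    root = ET.fromstring(xml_text)
    # Map each child element's tag to its HTML-escaped text content.
    fields = {child.tag: html.escape(child.text or "") for child in root}
    return template.substitute(fields)


page = render_page(ARTICLE_XML, PAGE_TEMPLATE)
print(page)
```

The same function could be run once over a whole repository for off-line generation, or called per request for on-demand pages; the content and the template stay independent either way.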
 XML offers major benefits in this scenario. Content contributors (the authors) don't need access to the actual Web pages of the site. Nor do they need to understand HTML. XML tools that can be customized for their particular type of information can make the authoring task painless while ensuring that their content has valid structure so that it will fit smoothly into the downstream processing. The webmasters benefit because they can define a site structure once and not have to tinker with it when new content is added or modified. Updating a site becomes a much more automated process, one that does not require resources as expensive as those required to create it in the first place.
 

XMetaL

XMetaL is a comprehensive XML and SGML content creation tool from SoftQuad Software, designed so that it can be used effectively even by non-experts. Without knowing anything about XML or SGML, authors and contributing content creators are able to create valid XML documents. XMetaL is designed to be easy to learn, use, customize and deploy. It offers rich customization capabilities (via a COM-based interface) that include an extensible interface, customizable dialogs, programming scripts and more.
 Since its introduction in June of 1999, XMetaL has been deployed in a wide variety of applications.
 
 

Editing Out of the Box

 XMetaL can create documents that conform to arbitrary DTDs, both plain text and compiled. It has three editing views, as shown in the figure below. In Normal view, authors are presented with a familiar word processor-like interface. In Plain Text view, experienced users work in a detailed, text-oriented view, complete with in-line tags and attributes. In Tags On view, authors have an intermediate view: a word processor-like view with collapsible tags for immediate access to all elements.
 
 Three Editing Views
 The context-sensitive Attribute Inspector and Element List (see figure below) are available in all views to show valid markup options at the current point in the document. All attribute values can be viewed and set in the Attribute Inspector. In Tags On view, tag tips display all attribute values when the mouse hovers over a tag.
 The Element List lets authors insert or change elements and is fully context-sensitive.
 
 Attribute Inspector and Element List
 XMetaL uses Cascading Style Sheets (CSS) for styling the display, and includes a comprehensive CSS editor to create and modify formatting options.
 XMetaL creates documents that conform to any XML or SGML DTD and provides real-time rules checking during the authoring process. When an error in the markup is found, XMetaL automatically takes the user to the source of the error for correction.
 

Asset Manager

XMetaL's Asset Manager is an extensible, drag-and-drop object management system that allows users to easily manage boilerplate text, images, document fragments, logos, macros and more, whether they're on a local disk, a network or the Internet.
 
 Asset Manager
 

Customizing XMetaL

 

The Need for Customization

 XML is a very flexible standard and the names and meaning of the tags being used will vary widely within and across businesses. XMetaL is designed to address that flexibility. Authors who are knowledgeable about XML can use the out-of-the-box editing features described above, but typically there are many content contributors in your organization who do not know much about XML, and don't need to know. You can customize XMetaL for them to create an interface that is precisely matched to the nature of the content being created.
 

Customization Philosophy

 Because the users of the customization cannot be assumed to be familiar with XML, they should not have to work directly with tags, attributes, etc. Instead, they should be able to work exclusively in Normal View. Writers should not need to use the Element List or Attribute Inspector.
 Fortunately, XMetaL offers a rich selection of customization capabilities that are easy to use and enable highly productive interfaces. XMetaL also has several built-in heuristics and editing behaviors that give the customizer a head start.
 A good customization is one which provides good visual feedback to the writer as to where he is in the document structure, guides him as to what to do next and does not present him with options that are not valid in the current context.
 

Customization Capabilities

 This section presents a brief summary of the categories of customization capabilities in XMetaL and how they are used.
 
  •  Normal View
     Normal View provides an editing interface much like a word processor. The writer can put the cursor just about anywhere, start typing and something reasonable will happen.
  •  The Style Element List and the Enter Key
     In a word processor you finish one block (paragraph-like) style element and start a new one by pressing the Enter key. The style that follows the current style can be customized. To change from one style to another, the writer selects from a drop-down list on the toolbar.
     XMetaL behaves in a very similar fashion. What is called a "style" in a word processor corresponds to a block XML element. In XMetaL, when a user presses the Enter key, a new block element is started. The customizer has control over which element that is, and a judicious choice can help guide the user as to what to do next.
     If the user wants to change from the element supplied by default, he can choose an alternative from a drop-down list (the Style Element list) on the toolbar. The choices presented to the user at that point are only those which are allowed by the DTD. They can be further filtered by the customizer so that a smaller, potentially less intimidating, set of choices is presented.
     The Enter key has one other important role in XMetaL that builds on another characteristic of a word processor. In a word processor, if you want to finish a list, you press Enter twice. The first Enter creates an empty list item, the second Enter changes that list item into a paragraph. XMetaL extends this concept to solve the general problem: how does the user move "up" the tree of nested XML elements?
     The answer is, "Press Enter". For example, if the DTD enables a structure like this:
     
     Section
       SubSection1
         SubSection2
           Paragraph
     and the user is in the Paragraph, by successively pressing Enter, the user will move up the tree, going to SubSection2, SubSection1, and finally Section.
     These then are the two basic mechanisms for a user to work with elements at different levels: the Style Element list is used to select a child element, the Enter key is used to move up to a parent element.
  •  Styles
    XMetaL supports the use of CSS to give the customizer control over the display of the document. (The word "style" in the CSS sense should not be confused with "style" as used in the discussion of word processors in the previous section.) Properly chosen styles can be very helpful to indicate the current context to the writer.
     Remember that styles are there for the presentation of the document during an editing session. They do not necessarily need to correspond to any particular kind of output format.
     Styles have an influence over editing behavior too, in one important respect: block and in-line element types. When XMetaL encounters a DTD for the first time, it makes a guess at which elements should be block and which should be in-line. These choices can be changed at any time by the customizer.
     Block elements participate in more editing behaviors than do in-line elements. Some examples: only block elements will appear in the Style Element list; only block elements can have a "followed by" element; the notion of using Enter to move up the tree only applies to block elements.
     CSS can also be used to generate text which can provide valuable feedback to the user.
  •  Templates
     Templates are documents that provide a starting point for the user. They can be used to give an outline of the structure desired in a document and hence provide valuable guidance on how to create it.
     Replaceable text can be used to provide more detailed guidance. Sections of the template can be designated as read-only to prevent the user from changing them.
  •  Treat As
     XMetaL has a lot of sophisticated editing behavior that is predefined for elements that are like lists and images. To tap into this, it is just a matter of identifying which elements are lists and which are images.
     You can also tell XMetaL to treat certain elements as paragraphs. This is an indication to XMetaL that this element is a good place for the user to type text. You can even indicate the order of importance of paragraph-like elements. This information is used when XMetaL has to decide what element to put in next to create an editable area.
  •  Events
     Several editing events are exposed so that you can handle various user actions. One of the most important of these is the event that an element is being inserted. You have the opportunity then to insert a "mini-template" of markup or run a script. This is a powerful way to insert default content, prompt the user for information or set up special attributes (like IDs).
     Other events result in specially named macros being called. On_Document_Open_Complete is one example. This gives you the opportunity to do any special setup required for that document.
     One other important event is that the user has changed the editing context. This results in the On_Document_UI macro being called. This is the time to disable any toolbar buttons, menu items and macros that are not currently valid. This is a very important thing to do, because it prevents the user from being frustrated by taking apparently valid actions that can only result in error messages.
  •  Replaceable Text
     Replaceable Text refers to sections of text that appear with a gray background in the document. When the user moves their selection there, the entire section of text is highlighted and is replaced by the user's text as soon as the first key is pressed. The text that is displayed in the replaceable text area is up to the customizer, providing another way to give the user guidance.
  •  Read-only Elements
     Any element in a document can be marked read-only through scripting. This prevents the user from changing parts of a document that are intended to be fixed.
  •  Toolbars
     All the toolbars in XMetaL are customizable. New toolbars can be created and any macro can be associated with a toolbar button.
     Just as importantly, any toolbar button can be removed, so if there is some functionality you don't want your users exposed to, you can prevent it from being displayed.
  •  Menus
     Menus have customization properties similar to toolbars. They can be created or removed and any macro can be associated with a menu item.
  •  Disable Plain Text View
     Some important customizations are provided to prevent users from getting at certain options. A good example of this is plain text view. In that view, ongoing validation is disabled and it is possible for an unsophisticated user to inadvertently create invalid content. If you are concerned about such a situation arising, it is possible to disable plain text view altogether.
  •  Scripts
     XMetaL exposes over 300 COM interfaces, all available through scripting. XMetaL is a Windows Scripting Host, so you have your choice of languages. Out of the box, JavaScript and VBScript are supported, but script engines for Perl, Python, Tcl and others are also available.
  •  Dialogs
     You can create dialogs to provide more sophisticated and specialized user interfaces for your DTD. This is especially useful for form-like portions of the DTD. Dialogs are created in any external programming language that can target COM, such as C++, Visual Basic, and Java.
  •  Other DLLs
     Any COM DLL that you create can be invoked from XMetaL. This can be useful for interfacing to databases, etc.
  •  Context-Sensitive Help
     XMetaL provides the ability to invoke help topics from a script. And since you can determine (through scripting) the current context of the user, it is possible to provide precisely targeted help.
  •  Asset Manager
     The Asset Manager is a fully customizable way of letting users select and execute scripts in a drag-and-drop fashion. At its simplest, it provides a way to collect together clipart and boilerplate text. At the other end of the spectrum, it can host ActiveX controls, for example to provide a browsing interface into a document management system.
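
The two navigation mechanisms described in the list above (the Style Element list for choosing a child element, and the Enter key for moving up to a parent) can be modelled in a few lines. This is only a conceptual sketch in Python, not XMetaL's scripting interface: the content model in `ALLOWED_CHILDREN`, the `Note` element, the `HIDDEN` filter and all function names are hypothetical, and the sketch ignores details such as Enter first creating a new block element.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical content model: the block elements each element may contain.
ALLOWED_CHILDREN = {
    "Section": ["SubSection1", "Paragraph"],
    "SubSection1": ["SubSection2", "Paragraph"],
    "SubSection2": ["Paragraph", "Note"],
    "Paragraph": [],
    "Note": [],
}

# Hypothetical customizer filter: elements kept out of the drop-down list.
HIDDEN = {"Note"}


@dataclass
class Node:
    name: str
    parent: Optional["Node"] = None


def style_element_choices(current: Node) -> List[str]:
    """Children the content model allows here, minus those the customizer hides."""
    return [n for n in ALLOWED_CHILDREN[current.name] if n not in HIDDEN]


def press_enter(current: Node) -> Node:
    """Move up to the parent element; stay put at the root."""
    return current.parent if current.parent is not None else current


def enter_path(start: Node) -> List[str]:
    """Names of the elements visited by pressing Enter until the root is reached."""
    path = []
    node = start
    while node.parent is not None:
        node = press_enter(node)
        path.append(node.name)
    return path


# Build the nesting used in the example: Section > SubSection1 > SubSection2 > Paragraph.
section = Node("Section")
sub1 = Node("SubSection1", section)
sub2 = Node("SubSection2", sub1)
paragraph = Node("Paragraph", sub2)

print(enter_path(paragraph))
print(style_element_choices(sub2))
```

Filtering the choices before they are shown, rather than rejecting an invalid choice afterwards, reflects the philosophy stated earlier: the writer is never offered an option that is not valid in the current context.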

Integration with Document Management Systems

 XMetaL offers complete control over file events, which enables integration with databases and document and content management systems. One way to do this is to add a repository browser to the Asset Manager (see figure below).
 
 Content Management System Integration
The figure above shows how the POET CMS 2.0 content repository might be viewed from the Asset Manager. From this view, documents can be checked in or out, dropped into XMetaL for editing, and composed from components in the repository.
XMetaL has been successfully integrated in a similar fashion with Object Design's eXcelon, Chrystal's Astoria, POET CMS 2.0, Vignette StoryServer and Documentum (including the new 4i document management system). Development with other integration partners is ongoing.
 

Usage Scenarios

 In the following, brief descriptions are provided for two real-world deployments of XMetaL for creating XML content for Web delivery.

E-zine Publishing: Communicate.com's XML Auto Publishing Engine

 Creating and managing an e-zine does not have to be the labor-intensive nightmare it used to be. The Auto Publishing Engine (APE), a content management tool recently developed by Communicate.com, combines a smart file management system with XML that can be converted to HTML for Web site construction. Used together, these tools create a structure for formatting and organizing documents, creating links to other sites and printing or emailing documents to potential customers.
 Because of the similarities between e-zine-type sites, Communicate.com's APE can be easily adapted to a number of diverse Web sites. In the following, we will describe the features of this application as they are used by an existing client and show how these specific features can be adapted to other Web sites that wish to provide an e-zine section for their customers.
 

How It Works

 Two separate tools are used together in the Auto Publishing Engine: SoftQuad's XMetaL, which is used to write a document in XML, tagging document components for later manipulation and style definition, and Communicate.com's Perl-based Content Management System, which parses the XML and translates it into HTML for the Web site.
 Here are the steps in the process:
 
  1.  Type text directly into the XMetaL template, or format raw text imported from another application (Microsoft Word, for example)
     
     XMetaL template shows where to place the basic components of an article
  •  The template formats the articles in the e-zine so they have a consistent appearance. The template shows how to organize the article - where headings go, how many headings are allowed and where to place the standard components of an article, such as title, introduction and summary. Once the components of an article are tagged, attributes or presentation style (such as colors, fonts and sizes) can be changed globally.
  •  Articles are divided into separate files to allow search engine optimization, but are linked together for presentation on the Web site by Communicate.com's APE.
  •  The actual Web site is still HTML. Ad banners exist outside of the articles as straight HTML components.
  •  XMetaL lets you label chunks of text in each article - either as regular text, or as structural elements, like headings, introductions and summaries - in ways that are compatible with the way Internet search engines will work in the future.
  •  XMetaL allows emphasis on certain parts of the article (such as tips, warnings or quotations). These may have a different appearance in the article, and are also tagged so that you can conduct a specific search for them later. This is also compatible with the Internet search engine optimization described above.
  •  Search engines will find these articles more efficiently because they can look for specific tags inserted into the document, instead of wading through body copy to search for key words.
  •  XMetaL creates links to external sites or anywhere on your Web site.
  •  XMetaL creates links to graphics files that can appear as thumbnail pictures, and zoom to full size with a mouse click. These graphic files also contain captions.
  •  XMetaL provides internal linking in the article to the author's name and bio sheet, which is created when the writer formats the article. Document details, along with copyright information, are also included with each XMetaL article, allowing for easy editorial manipulation. Author links also allow access to lists of other articles in the library by the same author.
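
To illustrate why tagged components are easier to search than untagged body copy, here is a small sketch. The element names (`article`, `section`, `heading`, `para`, `tip`, `warning`) and the function `find_tagged` are assumptions made for the example; they are not taken from Communicate.com's DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical article markup; the element names are illustrative only.
ARTICLE = """<article>
  <title>Backing Up Your Files</title>
  <section>
    <heading>Getting Started</heading>
    <para>Pick a backup schedule.</para>
    <tip>Weekly backups suit most home users.</tip>
    <warning>Never store the only copy on the same disk.</warning>
  </section>
  <section>
    <heading>Restoring</heading>
    <tip>Test a restore before you need one.</tip>
  </section>
</article>"""


def find_tagged(xml_text, tag):
    """Return the text of every element with the given tag, in document order."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(tag) if el.text]


tips = find_tagged(ARTICLE, "tip")
warnings = find_tagged(ARTICLE, "warning")
print(tips)
print(warnings)
```

Because the search keys on the element name rather than on key words in the text, a tip is found even if the word "tip" never appears in it.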

    The XML Parsing and Publishing Engine

     Most e-zines, regardless of their content, have two characteristics in common. They are a combination of raw content and organizing principles. They contain documents (which may also contain pictures, graphics, or, on a Web site, even video and sound) and they are organized according to a few basic categories. These are often subject, author and title, but any of these categories could be replaced with others, such as date, product, price range or even color.
     Communicate.com's XML Parsing and Publishing Engine takes advantage of the structure and component labelling embedded in the XMetaL document, and manages the appearance and presentation of the magazine in a consistent way. The basis for this is an intelligent file management system.
     At the heart of the XML Parsing and Publishing Engine is a file-naming system that allows the tool to manipulate files according to their designated function and to archive them in a meaningful way. Before any of the XMetaL files that make up an article are published in the Web magazine, they are first named in Windows Explorer according to a strictly defined convention. The name of a file tells the XML Publishing Engine how the file is to be used by the magazine. For example, graphics files are associated with the article in which they appear and are designated as thumbnail or full sized. The filenames for articles tell the program in what order to link the component parts, and link them to associated files, such as sidebars. To perform these functions, the XML Publishing Engine contains three component tools - the Article Manager, Feature Manager and Setup Option.
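
The paper does not spell out the naming convention itself, so the following sketch invents one purely for illustration: `<article>_<role>_<order>.<ext>`, where the role is one of `part`, `sidebar`, `thumb` or `full`. The pattern, the role names and the functions `classify` and `assemble` are all hypothetical; the point is only to show how a strict convention lets a program decide how each file is used and in what order parts are linked.

```python
import re
from collections import defaultdict

# Hypothetical convention: <article>_<role>_<order>.<ext>
NAME_PATTERN = re.compile(
    r"^(?P<article>[a-z0-9]+)_(?P<role>part|sidebar|thumb|full)_(?P<order>\d+)\.(?P<ext>\w+)$"
)


def classify(filename):
    """Return the article, role, order and extension encoded in a file name, or None."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    info = match.groupdict()
    info["order"] = int(info["order"])
    return info


def assemble(filenames):
    """Group files by article and role, with each group sorted by its order number."""
    grouped = defaultdict(lambda: defaultdict(list))
    for name in filenames:
        info = classify(name)
        if info is None:
            continue  # files that break the convention are ignored
        grouped[info["article"]][info["role"]].append((info["order"], name))
    return {
        article: {role: [n for _, n in sorted(items)] for role, items in roles.items()}
        for article, roles in grouped.items()
    }


files = [
    "backup101_part_02.xml",
    "backup101_part_01.xml",
    "backup101_sidebar_01.xml",
    "backup101_thumb_01.gif",
    "backup101_full_01.gif",
    "notes.txt",
]
layout = assemble(files)
print(layout)
```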

    Import From MS Word

     Some customizations were developed for Communicate.com's system that allow content to be imported into XML from Word. Most of the functionality is available from the toolbar shown here:
     
     Once a Word document is selected, it is brought into an XML template as basically flat text, with the different Word sections imported as paragraphs. The tagging process is completed by a point and click method. This is used to identify summary and conclusion sections, and generally to add the section and subsection structure to the content.
     The requirements for this application were to discard any existing Word style information, because it was considered too unreliable and any attempt to perform an automatic conversion would require an inordinate amount of manual undoing.
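
The import approach described above (flat text first, structure added afterwards by hand) can be sketched as follows. This is an assumption-laden illustration rather than the actual customization: it starts from plain text with blank lines between sections instead of a real Word file, and the element names `Article`, `Para` and `Summary` and the functions `import_flat_text` and `retag` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def import_flat_text(text, root_tag="Article", para_tag="Para"):
    """Wrap each blank-line-separated block of text in a paragraph element."""
    root = ET.Element(root_tag)
    for block in re.split(r"\n\s*\n", text):
        block = " ".join(block.split())  # collapse line wraps and extra spaces
        if block:
            ET.SubElement(root, para_tag).text = block
    return root


def retag(root, index, new_tag):
    """Stand-in for the point-and-click step: give one paragraph a structural tag."""
    root[index].tag = new_tag


raw = "Intro text.\n\nBody  text\nwrapped.\n\n\nThe end."
article = import_flat_text(raw)
retag(article, 2, "Summary")
print(ET.tostring(article, encoding="unicode"))
```

Discarding all formatting on the way in keeps the import predictable; every structural decision is then made explicitly by the person doing the tagging.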

    E-mail Newsletters: TipWorld.com

     TipWorld publishes advice, news and learning by e-mail. Each business day, TipWorld delivers over 3.5 million newsletters to subscribers in 170 countries around the globe. TipWorld is developed and produced by the Online Services Group of PC World Communications, Inc.
     When TipWorld wanted to streamline their production process and enable it to scale up, they turned to XML. Content is produced in XMetaL and stored in Vignette's StoryServer, from which it is published via e-mail in either plain text or HTML format, depending on user preference.
     It was an important requirement that the XML content could be worked on by editors at remote locations. To accommodate this, XMetaL's HTTP capabilities were employed. Editors browse to a page in an ordinary Web browser and select the tip to work on. By clicking on a link, XMetaL is invoked and the article, along with any updates to the DTD and customization files, is downloaded via HTTP. When the editing work is complete, an XMetaL command is invoked to submit the article back to the Vignette server using an HTTP POST command.
     
     Remote Editing and Configuration
     This mechanism achieves two important goals:
     
    1.  The DTD and related files can be centrally managed for a distributed group. Updates to the publishing process become immediately available.
    2.  Remote authors and editors can create valid XML with as much sophistication as required and enter their work into a content management system via HTTP with a one-button submit process.
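
The submit step can be pictured as an ordinary HTTP POST carrying the XML document as its body. The sketch below uses Python's standard `urllib.request` module rather than XMetaL's own HTTP commands, and the URL, the content type and the function name `build_submit_request` are assumptions; the request is only constructed here, not sent.

```python
import urllib.request

# Hypothetical endpoint; the real server address is not given in the paper.
SUBMIT_URL = "http://example.com/tips/submit"


def build_submit_request(url, xml_text):
    """Prepare an HTTP POST request whose body is the edited XML document."""
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "application/xml; charset=utf-8"},
        method="POST",
    )


request = build_submit_request(SUBMIT_URL, "<tip><title>Shortcut keys</title></tip>")
print(request.get_method(), request.full_url)
# Sending it would be a single call: urllib.request.urlopen(request)
```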
     

    Summary

     XML is rapidly gaining ground as the "right" way to manage Web content. There is broad support among tool vendors and the standards are maturing. And now there is an emerging body of real world experience that supports the reality of the promise of XML.
