The WISDOM of working on the Web   Table of contents   Indexes   Information Management - Who gets the benefits

 
 

XML is not just another name for SGML. XML is the vehicle to deploy structured data systems throughout an organization


 
Michael Maziarka
Director, Product Marketing, Information Management Products
Xyvision, Inc.
30 New Crossing Road
Reading, Massachusetts 01867 USA
Phone: +1 781 756-4400 x5138
Fax: +1 781 756-4330
Email: maz@xyvision.com
Web: http://www.xyvision.com
 
Biographical notice:
 
Michael Maziarka
 
Michael Maziarka is the Director of Product Marketing for information management products at Xyvision, Inc. In this role, Mike is responsible for product planning, pricing, packaging, and promotion for Xyvision's Parlance Document Manager and WebPorter products. In addition to this role, he holds the Chair position on the Board of Directors of OASIS, the Organization for the Advancement of Structured Information Standards. He has also been active in the development of the CALS electronic publishing standards, and has participated in J2008 and ANSI committee work.
 
Prior to joining Xyvision, Mike held several corporate and product marketing positions at the Datalogics Division of Frame Technology. Mike received an MBA from DePaul University and a BS in computer engineering from Iowa State University.
 
ABSTRACT:
 
Is XML merely a new packaging of SGML, with a new name to make it more palatable for the masses? Or does XML provide a means to implement a structured environment without the cost and training required to deploy traditional SGML systems? After all, a DTD is not required with XML unless, of course, your application requires one. This statement typically leads to the conclusion that use of XML falls into two arenas: valid (an SGML application) and well-formed (basic structured document support). This presentation proposes that there will not be two sets of XML applications, tools, and implementations, but rather one heterogeneous environment where entire corporations share data that is a mix of valid and well-formed XML.
 
Valid XML requires that document structures be predefined, whereas well-formed XML provides an ad hoc approach to creating and managing structured documents. The advantages of predefining document structures are that data components can be collected and built into larger documents based upon defined rules, and that application software can be programmed to anticipate how to process data based upon its classification or type. The disadvantages of predefining structures are that initial investments must be made in defining and agreeing upon the structures, and that data creators and editors must be trained to work within those constraints.
 
The advantages of the ad hoc approach provided by well-formed XML data are that a significant part of the cost associated with implementing structure-based systems is eliminated, and that data authors have more freedom to create structures to fit their information. The disadvantages of ad hoc structures are that post-processing of the data becomes more difficult and that the data becomes less reusable within a larger context.
 
Clearly there are cases for both types of usage within organizations. As an example, a manufacturer needs to create an assortment of information in support of a product: documentation, training materials, marketing materials, proposals, support documents, etc. From this list, you can quickly start to sort the information into valid and well-formed applications. However, within this organization, wouldn't the data creators want to share data components between all of these different types of information? If so, doesn't that require that all applications use predefined structures to enable reusability?
 
To share information completely, valid XML is most likely required. However, requiring all applications to use valid XML is not practical, for many of the same reasons that SGML has not been implemented for all applications today. Instead, XML systems of the future will manage an assortment of valid and well-formed data components. The systems will track and understand the uses of the data, and assist users in sharing components.
 
The XML system of the future will permit users to create data components using a series of templates. Templates will exist for all types of corporate information (just as style sheets, templates, FOSIs, and EDDs do today). Depending upon the final use of that information, the XML system will track whether the data components meet a predefined structure necessary for some uses, or are simply well-formed. The system will launch the appropriate tools for editing the data components based upon the context in which the information will be used. When information created as valid must later be shared in a well-formed context, or the reverse, the system will either prohibit the user from sharing the data without making modifications or automatically create duplicate copies. In either case, the system will track all data component relationships without users needing to specify which type of information they are creating. Users will only need to specify the use of the information. In addition, the XML systems will track style information and links between the data components.
 
 

The Success of SGML

 
SGML has been successfully deployed as an integral part of publishing solutions for nearly fifteen years. It isn't difficult to find testimonials in the aerospace, automotive, computing technology, defense, government, manufacturing, publishing, and telecommunications industries which boast of increased productivity, reduced costs, and improved quality of products. However, when looking at how SGML has been used within those industries, it has primarily been for publishing applications. To refine that list even further, frequently it has been used only for large-volume or time-critical publishing applications.
 
Does SGML only have applicability for high-volume and complex applications? If you look at the benefits derived from SGML systems, you can quickly draw the conclusion that corporations as a whole strive towards those same goals. However, SGML is often ruled out as a possible solution due to the high costs associated with deploying the tools and systems. When looking at those costs, they primarily fall into three areas:
  • SGML products
  • Analysis and implementation costs
  • Training and on-going support
 
Although none of these costs can be dismissed, typically it is the training and implementation costs that are the biggest obstacles to implementing SGML. Product costs are expected to follow the standard economic model of supply and demand. Today SGML products are more expensive than typical desktop tools. It would be reasonable to expect that if demand grew, more products would be introduced into the market, thus lowering costs.
 
It is not only the cost of implementation that has hindered the acceptance of SGML. In addition, many organizations are still departmentally driven today. Each function within an organization creates and disseminates its own information. To that end, each department selects the tools that will best meet its needs, within an independent budget. Typically, the focus is on the end deliverable, rather than the data or information that is being created. As a result, corporations own a range of tools and systems which are largely incompatible.
 
For organizations which have disparate systems, sharing information is a painful exercise. To find information, employees pass messages and communicate informally. Even in environments where search engines and document management systems have been deployed, sharing or reusing information often takes the form of cut and paste or re-keying data. These mechanisms for reuse are used because data is still being created in document form, rather than as information components. Once pasted or re-keyed, the information sharing ends at that point. If the original data changes, the change is not reflected in any subsequently used contexts.
 
 

The Promise of XML

 
Corporations are being driven to increase their competitive positions in today's global markets. Bringing products to market faster, in all markets, and with all the supporting documents and materials is becoming a global goal. In addition to product differentiation and time to market as a means to compete, corporations are focussing more on their customers. Better support and information tuned for individual consumers are becoming an absolute requirement to staying ahead of the competition.
 
While enhancing product offerings, companies are also looking for ways to reduce costs. Improving efficiency is a consistent goal. Terms like knowledge management are becoming commonplace in corporations as they look to better leverage their corporate information assets. To achieve that goal, information must be accessible and in a form that can be easily shared and reused.
 
These corporate goals are very similar to the goals and objectives of publishing organizations that have implemented SGML. Therein lies the promise of XML. Conceptually, XML is a lighter-weight version of SGML that promises to be globally accepted and adopted. New tools will be introduced into the market that make structured information commonplace in corporations. The neutral encoding will make the data transferable between applications, and the structure will enable the information to be accessed and maintained as information components rather than documents. As is the case with SGML, documents can be built by collecting and then publishing the components.
 
As previously stated, one of the primary reasons for the lack of adoption of SGML has been the implementation, training, and support costs associated with the products and systems supporting it. If that is the case, then how will XML reduce those costs?
 
Besides eliminating some of the more complicated SGML syntax and features, XML also introduces the notion of a well-formed document. A well-formed document is one that has properly nested structures and need not reference or follow a Document Type Definition (DTD). Instances which parse and conform to a DTD, as is the case with an SGML instance, are called valid XML documents.
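The distinction between well-formed and valid can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a real validator: the parser in Python's standard library checks well-formedness only and does not validate against a DTD, so a small hand-written content model (the hypothetical `REQUIREMENT_MODEL`) stands in for one. The element names and the helper functions `is_well_formed`, `conforms`, and `classify` are likewise invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a DTD: each element name maps to the exact
# sequence of child element names it must contain. A real DTD content
# model is richer (optional and repeatable children, attributes, etc.).
REQUIREMENT_MODEL = {
    "requirement": ["title", "description"],
    "title": [],
    "description": [],
}


def is_well_formed(text: str) -> bool:
    """Return True if the text parses as XML (tags properly nested and closed)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def conforms(element: ET.Element, model: dict) -> bool:
    """Return True if the element and all its descendants follow the model."""
    expected = model.get(element.tag)
    if expected is None:
        return False  # element not declared in the model
    if [child.tag for child in element] != expected:
        return False  # children differ from the declared sequence
    return all(conforms(child, model) for child in element)


def classify(text: str, model: dict) -> str:
    """Classify a document as 'valid', 'well-formed', or 'not XML'."""
    if not is_well_formed(text):
        return "not XML"
    return "valid" if conforms(ET.fromstring(text), model) else "well-formed"


valid_doc = ("<requirement><title>Undo</title>"
             "<description>Multi-level undo.</description></requirement>")
adhoc_doc = "<requirement><title>Undo</title><note>Ask sales.</note></requirement>"
broken_doc = "<requirement><title>Undo</requirement>"

for doc in (valid_doc, adhoc_doc, broken_doc):
    print(classify(doc, REQUIREMENT_MODEL))
```

The second document is still usable as XML because its tags nest properly, but it introduces a `note` element the model never declared, which is exactly the freedom, and the cost, of the well-formed approach.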
 
It is the concept of a well-formed instance that will enable XML to become widely deployed. Well-formed information will not require the advance definition required for SGML or valid XML documents. Nor will well-formed XML applications require the training or support associated with SGML systems.
 
The spread and adoption of well-formed XML applications will not replace the need for DTDs. Applications which require publishing collections of information, making decisions based upon the value or presence of certain data, or transformations of the data for other purposes will still need to have a DTD. As XML becomes more prominent, many corporations will determine that many applications could benefit from pre-defining the elements and data contained within their information. Applications which are now thought of as being too small for SGML will logically begin to use DTDs as the way to define their information.
 
As XML is adopted for both types of applications, systems will expand to include both valid and well-formed XML. There will not be a distinct difference between the two from a user's perspective. The systems will provide the correct tools based upon the data component that is being edited, and will track whether the data is maintained following the defined structure. Templates will be used for many of the applications to simplify the creation and updating of the data.
 
 

XML Example: Software Manufacturer

 
To illustrate how XML will be used across a corporation, an example of a software manufacturer will be used. For this example, the software developer must create a range of information, from product requirements and design information to product documentation, marketing descriptions, pricing, sales proposals, customer profiles, and support documents such as commonly asked questions and answers. The example will show how data components are created and shared between the various applications.
 
Our example starts out with a product marketing manager writing product requirements. In many organizations today, that information is written into a document often called the Marketing Requirements Document. As the name implies, it is a document that contains a range of requirements for the next version of the product. The document is typically not reviewed until it is nearly complete, and then it is reviewed in its entirety. With XML, templates can be used to capture individual requirements. As the requirements are written, they can each be stored independently as components. Each component can be routed to subject experts for review. Requirement components can also be grouped into logical units and circulated as a collection as appropriate.
 
As the requirements are being written and agreed upon, engineering can begin designing the product. As the design is written, it can also be stored as components in the system. The engineers writing the components can establish direct links between the design components and the requirements from which they were generated. Establishing the link between the requirements and the design will enable product marketing to more quickly assess how the design meets the criteria, and engineering to more quickly determine necessary design changes as the requirements change. In addition, design notes and other information can also be stored within the system as complementary data components for future reference.
 
As the design is being completed, the documentation group can begin to use both the requirements and the design to write the documentation. The documentation group will use a DTD to create their information. The components they create will be collected to form documents, and may be directly reused in other applications where a defined structure is important, such as the training materials. As with the design documents, links established to the requirements and the design allow the system to notify the documentation group, through capabilities such as triggers, when the source information has been updated.
 
The marketing department will also access these components to create product descriptions and data sheets. Although the source information may be used directly, the marketing department might need to add additional descriptive information to make it more understandable to an outside audience. When writing that additional information, the data may deviate from the original DTD. The XML system will detect that the component no longer parses against the DTD, and will store the component back in the repository as a linked, duplicate copy. Although this may seem to be the equivalent of today's cut and paste metaphor, the link that was established between the source information and the modified, well-formed version enables the system to signal the marketing department when the source component changes.
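The check-in behavior described above can be sketched in Python. This is a minimal illustration under stated assumptions, not a description of any actual product: the `Repository` and `Component` classes, the `feature` element names, and the validity rule `is_valid_feature` (a stand-in for parsing against a DTD) are all hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Component:
    component_id: str
    xml_text: str
    kind: str                        # "valid" or "well-formed"
    owner: str
    source_id: Optional[str] = None  # link back to the source, if a copy


class Repository:
    """Hypothetical component store that tracks linked duplicate copies."""

    def __init__(self, is_valid: Callable[[ET.Element], bool]) -> None:
        self._is_valid = is_valid
        self.components: Dict[str, Component] = {}
        self.notifications: List[Tuple[str, str]] = []  # (owner, message)

    def add_source(self, component_id: str, xml_text: str, owner: str) -> Component:
        root = ET.fromstring(xml_text)  # raises ET.ParseError if not well-formed
        kind = "valid" if self._is_valid(root) else "well-formed"
        component = Component(component_id, xml_text, kind, owner)
        self.components[component_id] = component
        return component

    def check_in(self, source_id: str, xml_text: str, editor: str) -> Component:
        root = ET.fromstring(xml_text)  # raises ET.ParseError if not well-formed
        source = self.components[source_id]
        if self._is_valid(root):
            # Still follows the defined structure: update the source in place
            # and signal the owners of every linked copy.
            source.xml_text = xml_text
            for copy in self._copies_of(source_id):
                self.notifications.append(
                    (copy.owner, f"source {source_id} changed; review {copy.component_id}")
                )
            return source
        # Deviates from the structure: store a linked, well-formed duplicate.
        copy_id = f"{source_id}-copy-{len(self._copies_of(source_id)) + 1}"
        copy = Component(copy_id, xml_text, "well-formed", editor, source_id)
        self.components[copy_id] = copy
        return copy

    def _copies_of(self, source_id: str) -> List[Component]:
        return [c for c in self.components.values() if c.source_id == source_id]


def is_valid_feature(root: ET.Element) -> bool:
    """Assumed rule standing in for a DTD: <feature> holds <name>, then <summary>."""
    return root.tag == "feature" and [child.tag for child in root] == ["name", "summary"]
```

In use, the documentation group adds a valid component, marketing checks in a version with an extra `tagline` element and receives a linked copy, and a later valid update to the source queues a notification for marketing:

```python
repo = Repository(is_valid_feature)
repo.add_source("feat-undo",
                "<feature><name>Undo</name><summary>Reverts edits.</summary></feature>",
                "docs")
copy = repo.check_in("feat-undo",
                     "<feature><name>Undo</name><summary>Reverts edits.</summary>"
                     "<tagline>Never lose work.</tagline></feature>",
                     "marketing")
repo.check_in("feat-undo",
              "<feature><name>Undo</name><summary>Reverts up to 100 edits.</summary></feature>",
              "docs")
print(copy.kind, copy.source_id, repo.notifications)
```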
 
This model of accessing the source components and using them for other purposes can also be seen in other functions, such as the sales department for proposal creation and the support department for frequently asked questions. These departments will also create customer profile components. Those profiles can be accessed when a customer calls for support to determine not only the customer's configuration, but also descriptions of how the system is being used.
 
In this example organization, document management technology is used to manage the information components. The system stores the data as either valid or well-formed objects. The system understands how the components can be collected to form documents or collections of related information. For example, all components that relate to a particular feature may be held together in a collection: requirement, design, documented use of feature, description, and frequently asked questions. Workflow and triggers are also used to route the information and to send notification when source data changes.
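The feature-based collection mentioned above might be modeled as follows. This is a small Python sketch with invented names: the `CatalogEntry` record, its fields, and the sample identifiers are assumptions made for illustration, and a real document management system would hold this metadata in its own repository.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class CatalogEntry:
    component_id: str
    feature: str  # the product feature the component relates to
    role: str     # e.g. "requirement", "design", "documentation", "faq"
    kind: str     # "valid" or "well-formed"


def collect_by_feature(entries: Iterable[CatalogEntry]) -> Dict[str, List[CatalogEntry]]:
    """Group components into one collection per feature, preserving input order."""
    collections: Dict[str, List[CatalogEntry]] = defaultdict(list)
    for entry in entries:
        collections[entry.feature].append(entry)
    return dict(collections)


entries = [
    CatalogEntry("req-12", "undo", "requirement", "well-formed"),
    CatalogEntry("des-40", "undo", "design", "well-formed"),
    CatalogEntry("doc-07", "undo", "documentation", "valid"),
    CatalogEntry("req-13", "search", "requirement", "well-formed"),
    CatalogEntry("faq-03", "undo", "faq", "well-formed"),
]

undo_collection = collect_by_feature(entries)["undo"]
print([entry.role for entry in undo_collection])
```

Note that a single collection freely mixes valid and well-formed components; the grouping depends only on what the information is about, not on how strictly it is structured.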
 
 

Summary

 
XML-based tools and systems will be deployed across all functions and applications within corporations. XML will provide the means for organizations to begin accessing and sharing information between departments and functions. The neutral encoding of XML will enable the data to be shared between different tools, while the structure added to the information will permit the data to be created, updated, and maintained as components rather than documents.
 
To enable the paradigm shift from documents to information components, organizations will facilitate sharing by providing templates and tools to help guide the users in creating and updating the data. Other mechanisms such as links and triggers will keep users notified of changing source information.
 
As XML tools become widely deployed, organizations will begin to experience many of the same benefits seen today through SGML implementations. Although not every application will witness the same payback that publishers see with SGML, corporations as a whole will find employees accessing and using information they previously did not find until too late in the process. That improvement will help organizations become more efficient and help them meet many of their goals to improve product release schedules and to become more customer-oriented.
