
 

XML and Information Modelling

 Corkern, Carla 
 Dallas 
ISOGEN INTERNATIONAL CORP
 Texas 
 USA 
 
Carla  Corkern
President,  ISOGEN INTERNATIONAL CORP 
 2200 N. Lamar #230
Dallas  (Texas)  (USA)

Biographical notice

Carla K. Corkern is the President and co-founder of ISOGEN International Corp. Ms. Corkern's other responsibilities and activities within ISOGEN include technical systems analysis and design, industry marketing presence, sales support, process analysis and design, and technical training. Ms. Corkern was the co-chair of the 1998 MetaStructures Conference with Dr. Steven Newcomb. Ms. Corkern is the President of the OASIS Board of Directors and is a newly appointed member of the GCA Board of Directors. Ms. Corkern is an eight-year member of the Telecommunications Industry Forum Information Products Interchange (TCIF/IPI). Ms. Corkern was recognized for her work in the Telecommunications Industry in 1992.

Prior to beginning ISOGEN-related business activities, Ms. Corkern had a focused career in the design and implementation of technical communications, documentation systems development and support, and process re-engineering analysis and support:

 
  • Support, design, and implementation of processes and systems, and system administration for a technical communications department of a global telecommunication firm, Ericsson Network System, from 1990 to 1992.
  • Technical analysis and information engineering on multiple projects for Intergraph Corp. from 1989 to 1990.
  • Technical analysis, product evaluation, and information engineering for Tandy Corp. in 1988.
  • Ms. Corkern possesses a Bachelor's Degree in Technical Communications from Louisiana Technical University (1989). She undertook post-graduate work in the Literature and Sociology programs at Southern Methodist University from 1993 to 1997.

     DTD design  
    Information Modelling
    Information engineering
    XML application
    smart tagging
     

    Introduction

     Is there more to XML than just peppering your HTML DTD with new tags? Maybe or maybe not, depending on your particular information needs and what you plan to do with your data. Information engineering may be required if the XML application is mission critical and the data is highly structured. But it may be overkill if you are trying to add smart tagging to a news feed. And if you are publishing from an existing database, you might just find that a good part of your DTD design has already been done for you. This paper will focus on XML and Information Modelling Strategies.
     Data Modelling 
    business rules
     

    The Need to Model

     Information Modelling, or Information Engineering, is a discipline for designing information systems across an enterprise. Companies use the Information Engineering concept of Data Modelling to model the critical data that is used throughout the enterprise. Information Engineering as a discipline does not define a storage medium, but the methodology is most often encountered in relational database design. Because of the limitations of various storage models, it is important to capture the business rules of the organization in an abstract form before you begin actual coding of the software system. These models are used to explain the system, normalize various systems across organizations, and establish buy-in from non-technical decision makers.
     Data Modelling describes the entities of an enterprise and classifies their attributes and their relationships. The usage of the terms entities and attributes is a bit confusing for the SGML-savvy because they are not strictly analogous. Entities in Data Modelling terminology are usually described in SGML as elements, but information engineering attributes are often also complex enough to be described as SGML elements.
     The relationships defined by Information Engineering are also more inclusive than the limited set (sibling, descendant, parent) described by SGML and XML. The relationships described by Data Modelling can be anything that makes sense in the business enterprise. The following example shows a basic chip design software product:
     The basic building block of a product is a set of software features. The software features are identified by the design method they support and the stage of the workflow in which they are active. The software features are used to program a device; the device is identified by name, architecture, family, architecture device, and part number. The design theory describes the optimal usage of the software features. Software features are implemented in a user interface. The user interface is identified as either GUI or Command Line.
     If this were to be described in XML, some of the information would be important only from a human knowledge standpoint. Some of the information is best captured as an XML fragment. For example, you might describe the device:
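     As an illustration only, the entities and relationships in the description above could be sketched in Python as follows. The class and field names are assumptions chosen for this sketch, not part of any ISOGEN model; the design theory entity and the "architecture device" identifier are left out for brevity.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class UserInterface(Enum):
    """The two interface kinds named in the example."""
    GUI = "GUI"
    COMMAND_LINE = "Command Line"


@dataclass(frozen=True)
class Device:
    """Entity: a device, identified by the attributes listed in the text."""
    name: str
    architecture: str
    family: str
    part_number: str


@dataclass
class SoftwareFeature:
    """Entity: a feature, identified by design method and workflow stage."""
    design_method: str
    workflow_stage: str
    interface: UserInterface                               # "implemented in"
    programs: List[Device] = field(default_factory=list)   # "used to program"


@dataclass
class Product:
    """Entity: a product built from a set of software features."""
    name: str
    features: List[SoftwareFeature] = field(default_factory=list)

    def devices(self) -> Set[Device]:
        """Follow product -> feature -> device relationships."""
        return {d for f in self.features for d in f.programs}


# Hypothetical sample data
chip = Device("My Chip", "My Architecture", "My Family", "j1234")
feature = SoftwareFeature("schematic entry", "design", UserInterface.GUI, [chip])
product = Product("Chip Designer", [feature])
```

     Note that relationships such as "is used to program" are named by the business, not by the parent, child, or sibling position of one record inside another.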
     <device><name>My Chip</name><architecture>My Architecture</architecture><partnumber pn="j1234"/></device>
     The description of the device is only part of the overall information model, but its relationships to the overall model cannot be described using XML alone. The data model of the entire system might describe how bits of documents are organized in a particular product help build or how a particular set of features is reused across a series of products. XML allows you to richly describe the data in your information nuggets, but by looking at the overall structure of the information, you can determine how a search engine might be programmed to provide only the features that support a particular device or to show only devices that are grouped in a particular family. While XML can give you the lingua franca for delineating this information, it cannot always give you the relationship description required for your information system.
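     A minimal sketch of that division of labour, assuming Python's standard xml.etree.ElementTree module: the XML fragment carries the description of the device, while the family and feature relationships live in separate lookup tables. The table contents, family names, and function names are hypothetical and stand in for whatever the real data model would supply.

```python
import xml.etree.ElementTree as ET

# The device fragment from the example above.
DEVICE_XML = (
    "<device><name>My Chip</name>"
    "<architecture>My Architecture</architecture>"
    '<partnumber pn="j1234"/></device>'
)


def parse_device(xml_text):
    """Read the descriptive data that the XML fragment itself carries."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "architecture": root.findtext("architecture"),
        "part_number": root.find("partnumber").get("pn"),
    }


# Hypothetical relationship tables taken from the wider data model;
# none of this information is present in the XML fragment.
FAMILY_OF = {"j1234": "Family A", "j5678": "Family B"}
SUPPORTS = {
    "timing analysis": {"j1234"},
    "floor planning": {"j1234", "j5678"},
}


def features_for(part_number):
    """Features that support a given device."""
    return sorted(f for f, parts in SUPPORTS.items() if part_number in parts)


def parts_in_family(family):
    """Part numbers of the devices grouped in a given family."""
    return sorted(pn for pn, fam in FAMILY_OF.items() if fam == family)


device = parse_device(DEVICE_XML)
```

     A search engine built on this sketch would call features_for(device["part_number"]) to restrict its results to one device, which is exactly the kind of query the fragment alone cannot answer.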
     

    Document Analysis

     How does the traditional SGML practice of document analysis relate to data modelling? Data modelling starts with fewer knowns. SGML document analysis starts with what is known (a set of documents or HTML pages) and looks for what information needs to be "singled out" with special tagging. Data modelling starts as a conceptual exercise: What is my business? What information do I want to convey? How can I best describe what is important about my business through my data? Document Analysis and Data Modelling share many characteristics of process:
     
  • Every good information analysis exercise requires that you get the right people together. Stakeholders in your organization need to get together to determine the best methods for capturing, structuring, and presenting your information.
  • You must examine your organization's reasons for moving to XML - is there a corporate mandate? Will your web site be re-engineered for "smart searching" - what is the blue sky vision? All members of the team must be on the same page before you can redesign your information.
  • If you are providing information to a trading partner, is there already a defined data model that you must comply with, like the Common Business Language (CBL) or the KONA Architecture for Healthcare? If there are already models published for your application or industry, you want to be sure you are aware of them and use them in your own modelling exercise.
  • Determine if your group can unilaterally design information models or if you must comply with a corporate data catalog. Many organizations that have multiple active web sites querying the same databases prescribe a series of "named fields" that must be used across all XML models. What if you don't want to be limited by these models? See my fairy tale paper from last year's conference at http://www.isogen.com/papers
  • Establish a process for your analysis and a form for documenting your work. It is also critical to establish a deadline. Many companies hire consultants to work with them through the initial information modelling and analysis phase. Once the design is set and all team members agree, implementation groups can be splintered off to work on various phases of the design from the same data model.
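     The corporate data catalog point in the list above can be made concrete with a small check. The sketch below, again assuming Python's standard xml.etree.ElementTree module, reports element names in a document that are not among the catalog's "named fields". The catalog contents and the speedgrade element are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical corporate catalog of permitted "named fields".
CATALOG = {"device", "name", "architecture", "partnumber"}


def unknown_fields(xml_text, catalog):
    """Return element names used in the document but absent from the catalog."""
    root = ET.fromstring(xml_text)
    used = {element.tag for element in root.iter()}  # includes the root element
    return sorted(used - set(catalog))


sample = "<device><name>My Chip</name><speedgrade>3</speedgrade></device>"
problems = unknown_fields(sample, CATALOG)
```

     Here problems lists speedgrade, the one element the catalog does not define; a group that is free to design its own models would simply skip this check.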

    Conclusion

     The science of information engineering is largely untapped by XML today, but it serves as a useful tool when designing information systems that are mission critical for the future. Experts estimate that much of the "XML" currently available is either generated from databases or is HTML++. However, as XML becomes more prevalent, we should look for systems to be designed from the ground up to use this powerful descriptive language. As XML is called upon to do the critical work required in the 21st century, the methods and tools for creating applications that use it must also evolve.
