Converting Flat File Content into XML and Vice Versa   Table of contents   Indexes   Using XML and Relational Databases for Internet Applications

Norbert Hannes Mikula
 Chief Technology Officer
DataChannel
  600 108th Ave NE Ste. 900 Bellevue (Washington)  USA (98004)
Email: norbert@datachannel.com Web site: http://www.datachannel.com
 Biography
 Mikula is responsible for directing DataChannel's efforts in the XML standards bodies as well as further advancing DataChannel's leadership in the arena of XML and Enterprise Information Portals. He developed one of the first XML parsers (NXP) and has been engaged in XML-related efforts since the early days of this standard. Currently, Mikula also serves as Chief Technical Officer on the board of directors of OASIS - the Organization for the Advancement of Structured Information Standards. He is the author of numerous whitepapers and articles and has been a speaker and/or track-chair at a variety of national and international conferences and industry events. Mikula holds a "Diplom Ingenieur" degree in Applied Computer Science from the University of Klagenfurt in Austria.
 

Enterprise Information Portals

 

Enterprise Wide Table of Contents

 Nobody needs to be educated anymore about the role of information as a resource of equal importance to money, human resources and natural resources. The competitiveness of an organization will largely be determined by how efficiently it harvests and uses the wealth of information captured in its thousands of internal legacy systems, in the minds of its employees, customers and partners, and on the Internet (the largest database in the world) in general.
 The first obstacle to overcome is the distributed nature of a typical medium to large-scale organization. The web-systems, databases, file-servers and legacy systems are placed in different departments, different groups, and different company offices, even different continents. Knowledge workers need to get access to information as part of their daily routine; they do not care where data resides or what protocol (ftp, http, etc.) one has to use to access it.
 The first step is categorizing information based on semantic context rather than on location and system characteristics. We call these categories "virtual folders" or "channels". A channel is a container for pieces of information that share a context. For instance, the "Sales Channel" may contain a Word document with the quarterly sales statistics, a database-generated report from an RDBMS located in Europe (served up via the European web server), an HTML page featuring the salesperson of the month, and the ticker information from some external financial web server.
 An Enterprise Information Portal (EIP for short) allows you to manage such a set of channels. It becomes the table of contents for company-wide information resources as well as external information. Channels are hierarchical - just like your file system - in order to provide you with the right level of granularity. Users don't care what protocol is used to access information, nor do they care what media type it is encoded in. What is important is finding information that is relevant to a certain subject. Virtual folders accommodate that requirement.
 
 Figure: Content aggregation
 
 The metadata of channels and the relationships between channels are, of course, described in XML.
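 As a minimal sketch of what such channel metadata might look like, the following Python snippet parses a hypothetical channel hierarchy and lists the items it contains. The element and attribute names (`channel`, `item`, `name`, `href`, `type`) and all URLs are illustrative assumptions, not DataChannel's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical channel metadata; element and attribute names are assumptions.
CHANNELS_XML = """
<channel name="Sales">
  <item href="file://fileserver/sales/q3-statistics.doc" type="application/msword"/>
  <item href="http://eu.example.com/reports/sales" type="text/html"/>
  <channel name="Europe">
    <item href="ftp://ftp.example.com/sales/europe.csv" type="text/csv"/>
  </channel>
</channel>
"""

def list_items(channel, path=()):
    """Walk the channel hierarchy and return (channel path, href) pairs."""
    path = path + (channel.get("name"),)
    found = [("/".join(path), item.get("href")) for item in channel.findall("item")]
    for sub in channel.findall("channel"):
        found.extend(list_items(sub, path))
    return found

root = ET.fromstring(CHANNELS_XML)
items = list_items(root)
for channel_path, href in items:
    print(channel_path, "->", href)
```

 Note that the items sit behind different protocols (file, HTTP, FTP) and media types, yet the reader of the channel only sees one hierarchy.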
 

Personalization

 Imagine a set of 100 to 200 channels. Suddenly you are back to the problem of not being able to find the information you need. A personalization component - which should be part of any good EIP - allows you to pick and choose channels and put together your own customized table of contents - your individual "MyCompany.com". Needless to say, you expect a good EIP to proactively notify you (by e-mail, for instance) if new content is put into one of your registered channels or if existing content has changed. To simplify management, personalization should be possible per user as well as per group of users.
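 The per-user and per-group personalization described above could be modelled roughly as follows. This is only a sketch under assumed data structures; the user names, group names and dictionaries are hypothetical and do not reflect how any particular portal server stores subscriptions.

```python
# Hypothetical subscription data: channels registered per user and per group.
user_channels = {"norbert": {"Sales", "Trip reports"}, "anna": {"Engineering"}}
group_channels = {"sales-team": {"Sales/Europe"}}
group_members = {"sales-team": {"anna", "peter"}}

def channels_for(user):
    """Personal table of contents: own channels plus those inherited from groups."""
    channels = set(user_channels.get(user, set()))
    for group, members in group_members.items():
        if user in members:
            channels |= group_channels.get(group, set())
    return channels

def users_to_notify(changed_channel):
    """Everyone whose personalized view contains the changed channel."""
    everyone = set(user_channels) | set().union(*group_members.values())
    return sorted(u for u in everyone if changed_channel in channels_for(u))

print(channels_for("anna"))
print(users_to_notify("Sales/Europe"))
```

 Here a change to the "Sales/Europe" channel reaches users through their group membership, without each of them having registered individually.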
 

Backend Integration

 Now we have established a central access point to all corporate information. All? Wait a second. What about my dusty IMS data and my huge RDBMS that is not connected to the web? Sooner or later you will hit upon a system that is not accessible, yet you want to provide a link to it from your cool new "MyCompany.com". XML plays an important part in backend data integration. Instead of writing numerous connectors, each of which converts the original data into some other proprietary or not-fit-for-reuse data structure, let's use a data format that captures semantics and context and allows the same dataset to be reused any number of times. You guessed right - XML.
 From XML we re-purpose the captured data into various output formats for delivery through our portal. The web (HTML) is one of them. It's only a style sheet - after all.
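 To illustrate the idea of re-purposing one XML dataset into several output formats, here is a small Python sketch. In the approach described above this step would be carried out by style sheets; the two Python functions merely stand in for them, and the `report` and `region` vocabulary is an assumption made for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML produced by a backend connector; the vocabulary is an assumption.
REPORT_XML = """
<report title="Quarterly Sales">
  <region name="Europe" revenue="1200"/>
  <region name="Americas" revenue="3400"/>
</report>
"""

def to_html(report):
    """Render the report as an HTML fragment for delivery through the web."""
    rows = "".join(
        f"<tr><td>{escape(r.get('name'))}</td><td>{escape(r.get('revenue'))}</td></tr>"
        for r in report.findall("region")
    )
    return f"<h1>{escape(report.get('title'))}</h1><table>{rows}</table>"

def to_text(report):
    """Render the same report as plain text, e.g. for an e-mail notification."""
    lines = [report.get("title")]
    lines += [f"{r.get('name')}: {r.get('revenue')}" for r in report.findall("region")]
    return "\n".join(lines)

report = ET.fromstring(REPORT_XML)
html_page = to_html(report)
text_page = to_text(report)
print(text_page)
```

 The source data is captured once; each additional delivery format only needs another rendering function (or, in the paper's terms, another style sheet).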
 
 Figure: Multiple Delivery
 
 

Publishing

 Now we have the table of contents and the backend integration taken care of. Suppose I am writing up the report on my last trip to New York and I want to publish it to the "Norbert's trip reports" channel.
 Scenario a.) I send the report to the sysadmin by e-mail and ask her to publish it on the (web) system and then make sure it gets a link from the portal. Given that we are going to "Internet World", she is busy like crazy and the file can't be published until a week from today. By that time nobody cares about my latest news and gossip from the industry; in Internet time, the companies I wrote about have probably merged with somebody or gone out of business. (Best case: 15 minutes; worst case: 7 days.)
 Scenario b.) I "drag & drop" my report onto the "Norbert's trip reports" channel. The system takes the document and publishes it to the portal by itself. It also extracts meta-content and puts it into XML. That meta-content is then sent to the portal server as well. The system can now turn around and notify the users who are registered with my channel about the new document. (Best case: 1 second; worst case: 2 seconds.)
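 The flow of scenario b.) can be sketched in a few lines of Python. Everything here is a simplified assumption: the metadata fields (`title`, `words`), the function names and the subscriber list are hypothetical, and a real system would send the record to the portal server and deliver the notifications instead of returning them.

```python
import xml.etree.ElementTree as ET

def extract_metadata(filename, author, text):
    """Build a small XML metadata record for a dropped document."""
    meta = ET.Element("document", {"name": filename, "author": author})
    # Assumption: the first line of the document serves as its title.
    ET.SubElement(meta, "title").text = text.splitlines()[0].strip()
    ET.SubElement(meta, "words").text = str(len(text.split()))
    return meta

def publish(channel, subscribers, filename, author, text):
    """Return the XML record for the portal and the notifications to send."""
    meta = extract_metadata(filename, author, text)
    meta.set("channel", channel)
    record = ET.tostring(meta, encoding="unicode")
    notices = [f"To {user}: new document '{filename}' in '{channel}'" for user in subscribers]
    return record, notices

record, notices = publish(
    "Norbert's trip reports", ["anna", "peter"],
    "new-york.txt", "norbert", "Trip to New York\nMet several vendors.",
)
print(record)
```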
 

It is not a shrink-wrap world

 "Brave new world", right? Wrong! Developing a corporate portal, including the necessary integration of backend legacy systems, is not easy. No one's system is the same as anyone else's. A minimal portal can probably be built with an Enterprise Information Portal Server in a few hours, especially if your data sources are already web-enabled and ready to roll into your portal for distribution. In general, however, a large-scale enterprise-wide system will take a serious effort (which takes time and costs money).
 "An Enterprise Information Portal is not a product; it is a carefully architected solution."
 OK, now relax again after the cold shower. Vendors can offer you pieces such as a portal server and some backend connectivity out of the box. And most certainly, if you work with a solutions vendor (rather than a product-only vendor), they will have a professional services arm that can get you through the bumpy ride.
 

The XML Backbone

 

Industry landscape

 More and more vendors have started to develop (or already ship) systems that can read XML data, write it, or both: database vendors, editor vendors, document management system vendors and so on.
 The effect of this is that we have all started building what is known as the "XML Backbone". XML becomes the abstract data interchange syntax for the intra and inter-corporate data-bus.
 

Platform and protocol independence

 The XML backbone is not bound to a particular operating system or platform (well, it is XML-based after all). Nor is it bound to a particular transport mechanism: the XML backbone can be built using FTP, HTTP, e-mail, message queuing or object request brokers, for instance. It does not matter. What does matter is that we use XML for describing the data and metadata.
 

DOM

 The DOM is the API to data of the future. The DOM is the Document Object Model (W3C), and it provides programmers with an API to work with XML data (XML is only a file or data stream after all).
 Now the DOM - especially in its current form - may not be best suited for that in all cases, but a.) it will continue to evolve and b.) it becomes the de-facto API to data because everybody treats it as such.
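 As a brief illustration of working with XML through the DOM, the following sketch uses Python's standard `xml.dom.minidom` module, one of many DOM implementations. The sample document and its element names are assumptions made for the example.

```python
from xml.dom.minidom import parseString

doc = parseString('<channel name="Sales"><item href="q3.doc"/></channel>')

# Read: navigate the tree with W3C DOM methods.
channel = doc.documentElement
hrefs = [item.getAttribute("href") for item in channel.getElementsByTagName("item")]

# Write: create a new node and attach it to the tree.
new_item = doc.createElement("item")
new_item.setAttribute("href", "q4.doc")
channel.appendChild(new_item)

# Serialize the modified tree back into an XML string.
serialized = channel.toxml()
print(serialized)
```

 The same method names (`getElementsByTagName`, `createElement`, `appendChild`) are defined by the W3C specification, so code written against them carries over to DOM implementations in other languages with little change.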
 

Cost savings

 Developing the XML backbone, and using software that plugs right into it by way of the DOM and XML, will over time lead to a significant cost reduction.
 
 Figure: XML Backbone
 
 

Standards, standards, standards

 

The role of standards

 XML and Enterprise Information Portals will be an important part of your computing infrastructure for years to come. You will not want (I hope) to be locked into a proprietary data cemetery. I bet you want a system that is built on open standards that allow easy integration with other standards-based components and systems. A list of core standards to look out for:
 
  •  XML (for data)
  •  XSL (for rendering and transformations)
  •  HTTP (for transport)
  •  WebDAV (for collaborative work)
 

WebDAV

 WebDAV (Web Distributed Authoring and Versioning) is a suite of specifications that transforms the web from a read-only medium into a read-write one, where people can work on documents collaboratively.
 WebDAV is an extension to HTTP and thus ties in very nicely with existing web infrastructure. XML is used as the message syntax for the exchange of information between the client and the server.
 For an Enterprise Information Portal, this means that it ties natively into WebDAV-enabled clients such as Microsoft Office 2000 and WebFolders.
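 To make the role of XML as WebDAV's message syntax concrete, the sketch below builds the XML body of a PROPFIND request, the WebDAV method for retrieving properties of a resource. The helper name `propfind_body` is hypothetical; the `DAV:` namespace and the `displayname` and `getlastmodified` properties come from the WebDAV specification. Actually sending the request is left out.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"  # the WebDAV XML namespace
ET.register_namespace("D", DAV)

def propfind_body(*property_names):
    """Build the XML body asking a WebDAV server for the named properties."""
    propfind = ET.Element(f"{{{DAV}}}propfind")
    prop = ET.SubElement(propfind, f"{{{DAV}}}prop")
    for name in property_names:
        ET.SubElement(prop, f"{{{DAV}}}{name}")
    return ET.tostring(propfind, encoding="unicode")

body = propfind_body("displayname", "getlastmodified")
# A client would send this as the body of an HTTP request with method PROPFIND
# and a Depth header; sending the request is omitted from this sketch.
print(body)
```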
 

XSL

 The "Extensible Stylesheet Language" is crucial for an XML-based portal in two ways.
 First, XSL is used to translate the XML-described portal metadata into HTML to create a highly dynamic and flexible user interface. HTML is only one of the possible output formats. One and the same XML dataset can be used to create numerous other output formats such as WML, VoxML etc.
 Secondly, XSL is important for the content portion of a portal if XML is being used. Let's imagine XML data streams coming from various backend systems (see also our section about backend integration). We can use XSL to take these input streams and create higher-order business objects by merging them. After this we can still repeat the process described in the previous paragraph.
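 The merging of several XML input streams into a higher-order business object can be sketched as follows. Note that this example performs the merge in plain Python rather than in XSL, purely to keep it self-contained; the `customers`, `orders` and `accounts` vocabularies are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML streams from two different backend systems.
CUSTOMERS = '<customers><customer id="c1" name="Acme"/><customer id="c2" name="Globex"/></customers>'
ORDERS = ('<orders><order customer="c1" total="250"/><order customer="c1" total="100"/>'
          '<order customer="c2" total="75"/></orders>')

def merge(customers_xml, orders_xml):
    """Combine both streams into one <accounts> business object."""
    customers = ET.fromstring(customers_xml)
    orders = ET.fromstring(orders_xml)
    accounts = ET.Element("accounts")
    for customer in customers.findall("customer"):
        account = ET.SubElement(accounts, "account", {"name": customer.get("name")})
        # Attach every order that refers to this customer's id.
        for order in orders.findall("order"):
            if order.get("customer") == customer.get("id"):
                ET.SubElement(account, "order", {"total": order.get("total")})
    return accounts

accounts = merge(CUSTOMERS, ORDERS)
merged_xml = ET.tostring(accounts, encoding="unicode")
print(merged_xml)
```

 The merged result is itself XML, so it can be fed into the same rendering step as any other portal content.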
 

General Motors Case Study

 General Motors is no different from any other large-scale company, except that everything is extra-large: it has tons of legacy systems of numerous different kinds.
 

The challenge

 
  •  Provide means to access a number of database systems and one proprietary CAD system
  •  Personalize the access to this information by means of a portal
 

The solution

 We used our own set of tools, which XML-enable RDBMS systems via JDBC, so that these systems could return XML. We used XSL to combine the incoming data streams into a set of business objects (which had been defined together with the customer).
 Finally, that data was distributed and personalized by means of our Enterprise Portal Server.
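 The idea of XML-enabling a relational database can be illustrated with a short sketch. It does not show the tools used in the project: Python's `sqlite3` module with an in-memory database stands in for the JDBC connection, and the function `query_to_xml`, the `parts` table and the element names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

def query_to_xml(connection, sql, root_name="rows", row_name="row"):
    """Run a query and wrap each result row in XML, one element per column."""
    cursor = connection.execute(sql)
    # Assumption: column names are valid XML element names.
    columns = [description[0] for description in cursor.description]
    root = ET.Element(root_name)
    for record in cursor:
        row = ET.SubElement(root, row_name)
        for column, value in zip(columns, record):
            ET.SubElement(row, column).text = "" if value is None else str(value)
    return root

# In-memory sample database standing in for a real RDBMS.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE parts (id INTEGER, name TEXT)")
connection.executemany("INSERT INTO parts VALUES (?, ?)", [(1, "axle"), (2, "gearbox")])

parts = query_to_xml(connection, "SELECT id, name FROM parts ORDER BY id", "parts", "part")
print(ET.tostring(parts, encoding="unicode"))
```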
 

Summary

 To provide an effective means of information distribution, look very closely at the power of XML for data and metadata markup. Combine this with the power of an open, standards-based Enterprise Information Portal for personalization, notification, distribution to heterogeneous devices and collaborative web-native work. And remember, "It is not a shrink-wrap world".
 

About DataChannel

 DataChannel simplifies the critical process of delivering "the right information to the right people at the right time." Privately held and founded in 1996, DataChannel, Inc. is an Enterprise Information Portal (EIP) Solutions company that facilitates the way companies share information with employees, partners, and customers across Intranets, Extranets, and the Internet. DataChannel's EIP Solutions Framework combines XML-based products, professional services, and xPertPartners to build EIPs that meet customer needs. DataChannel Server 4.0, a portal server, is the core component of the EIP solution, and the foundation for building an effective EIP. DataChannel's EIP Solutions Framework is supported by its XML Framework, which includes XML tools, technology, training, and DataChannel's xDev Program.
