| Chelsom John |
Defining Reusable, Distributable Information Objects Using SGML or, How SGML can do for Databases what JAVA has done for User Interfaces. |
Abstract: |
| Just as JAVA has brought an open, distributable way to enable users to interact with data by transferring applications in real time from server to client, so SGML can enable them to interact with persistent database objects by transferring, in real time, the database schema for those objects. This talk explores the potential of SGML as a universal database definition language for reusable, distributable information objects and shows how existing technology is already turning that potential into reality. |
Introduction |
| The rapid growth of Web and Intranet technology led initially to an explosion in the amount of information created and stored as HTML. Organisations that are serious about using the Internet or an Intranet as a means of disseminating high quality information have realised the importance of managing that information in a controlled, structured environment, rather than relying on the limited capabilities of HTML. For these organisations HTML has been relegated to a delivery format, whilst databases have regained their position as the proper place to manage valuable information assets. |
| Database vendors have quickly embraced the new opportunities and now offer integration between their database servers and Web servers, often as part of an integrated Web or Intranet management package. The gateway between the database and Web servers converts incoming HTTP requests from the Web clients into database queries. These queries are submitted to the database server, and the results are returned to the Web client as HTML pages. |
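The gateway described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `produce` table, its rows, the `kind` request parameter and the function names are all hypothetical, and an in-memory SQLite database stands in for the database server.

```python
import html
import sqlite3
from urllib.parse import parse_qs


def make_demo_db():
    """Create an in-memory database with a hypothetical 'produce' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE produce (name TEXT, kind TEXT)")
    conn.executemany(
        "INSERT INTO produce VALUES (?, ?)",
        [("apple", "fruit"), ("pear", "fruit"), ("leek", "vegetable")],
    )
    return conn


def handle_request(conn, query_string):
    """Turn an HTTP query string into a database query and an HTML page."""
    params = parse_qs(query_string)
    kind = params.get("kind", [""])[0]
    # Parameterised query: the request value is never pasted into the SQL text.
    rows = conn.execute(
        "SELECT name FROM produce WHERE kind = ? ORDER BY name", (kind,)
    ).fetchall()
    # Escape each value before placing it in the HTML delivery format.
    items = "".join(f"<li>{html.escape(name)}</li>" for (name,) in rows)
    return f"<html><body><ul>{items}</ul></body></html>"


conn = make_demo_db()
page = handle_request(conn, "kind=fruit")
print(page)
```

Note that the client sees only the finished HTML page; nothing in the response tells it which tables or fields the database holds.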
| This works fine so long as users know what to look for in the database - the problem is knowing the right questions to ask! To find out, client-side applications need to know the database schema. Enter SGML. When the database is defined by an object schema using an SGML DTD, the database is not only easy to configure, but its schema (the DTD) can also be transmitted to remote clients and used to construct queries and view results. |
| Just as JAVA has brought an open, distributable way to enable users to interact with data by transferring applications in real time from server to client, so SGML can enable them to interact with persistent database objects by transferring, in real time, the database schema for those objects. |
| This talk explores the potential of SGML as a universal database definition language for reusable, distributable information objects and shows how existing technology is ready to turn that potential into reality. |
How Web Technology Has Revolutionised Client-Server Computing |
| Client-server technology brought about a revolution in the multi-user computer environment. In the old order the mainframe was king, holding all the data and running all the processes, with users connected to the central processor through terminals which relayed information, but which had no processing power. With client-server computing, the application processes became distributed around the client machines, using data that were held on the central server. The arrival of Web technology on the Intranet has brought a new revolution. Web technology has taken away the application processing power of the so-called fat clients in the client-server environment and returned that processing to the server side. |
What JAVA Has Done for User Interfaces |
| As with many revolutions, the initial euphoria of overthrowing the old order swung things rather too far to the opposite extreme. The first Web browsers hadn't just become thin clients, they were dangerously anorexic. Even the simplest interaction with the user required instructions to be sent back to the server which formulated a revised user view and transmitted it back to the client side. Web browsers offered a way to view information in a platform-independent way, but they completely decoupled the user from the server-side applications. |
| The Web client has been fattened up and restored to health by the introduction of JAVA applets (or ActiveX controls, for lovers of Microsoft). These allow users to interact with information downloaded from the server without further communication between client and server. In the same way that the first browsers allowed information to be located, downloaded and viewed in a platform-independent way, so JAVA-enabled browsers allow applications to be stored on the server, downloaded and executed on any client platform that supports the Java Virtual Machine. |
How Objects Have Transformed Databases |
| The Web isn't the only revolution to hit computer technology in recent years. The object revolution which began first with user interfaces and then conquered programming languages has finally, in the 1990s, caught up with databases. Whereas databases once stored only information that could be represented by simple data types and relationships, now object databases store multimedia data types with complex, hierarchical relationships. |
What SGML Has Done For Documents |
| SGML has brought the object revolution to the realm of documents. Before SGML, electronic documents contained unstructured information. Or to be more precise, they held information that was structured in the minds of the authors and readers, but which couldn't be held in a structured way by computers. |
How Documents Have Become Databases |
| Databases store multimedia objects with hierarchical relationships; SGML delivers hierarchically structured documents with multimedia entities. The line dividing databases and documents has become blurred. Databases become documents; documents become databases. Although many SGML databases are still implemented successfully using relational databases which generate SGML, store SGML chunks or glue together micro documents, the technology now exists to build systems which provide a seamless transition between SGML documents and database objects. |
How Web Users Can Access Databases |
| For some time, Web users have been able to access server-side databases through a Common Gateway Interface (CGI). On the client side, the user formulates some sort of information request by interacting with an HTML form. The request is passed from the Web client to the Web server, which invokes the CGI script specified in the HTML form. This script can be written in any programming or scripting language, provided it adheres to a few simple rules on its inputs and outputs. The input is the information request from the Web client; the role of the script is to convert that request into a database query, submit it to the database server and return the results to the Web client as an HTML page. |
| Another method is now offered through JAVA. The user locates a JAVA applet which is downloaded from the server to the client browser. This applet displays a form or other interface object to the user who creates the information request. The applet then communicates directly with the database through a socket connection, bypassing the Web server. Results of the database query are passed back along the socket for display to the user in the applet. |
How SGML can do for Databases what JAVA has done for User Interfaces |
| There are several problems with the two methods of database interaction described above. The first is that the user is offered a fixed view of the database, either through an HTML form and CGI script or through a JAVA applet. This means that the information creators also have to construct the views that will be presented to the user (i.e. write the applet or the CGI script). The fixed view also conflicts with the culture of discovery that pervades the Web; ideally users would browse and query databases in the same way that they browse documents and formulate text queries. However, formulating a sensible query to a structured database requires knowledge of the database structure (or schema). It's no good searching for names and addresses in a database that only knows about fruit and vegetables. |
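As a rough illustration of how a transmitted schema could help a client ask sensible questions, the sketch below reads element declarations from a DTD and reports which query fields the database does not know about. The DTD fragment, the element names and both function names are hypothetical, and the regular expression handles only simple, non-nested content models; a real SGML parser would be needed for anything more.

```python
import re

# A hypothetical DTD fragment for a produce database (an assumption for illustration).
DTD = """
<!ELEMENT produce (item+)>
<!ELEMENT item (name, kind, colour?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT kind (#PCDATA)>
<!ELEMENT colour (#PCDATA)>
"""

# Matches declarations of the form <!ELEMENT name (content model)>.
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([\w.-]+)\s+\(([^)]*)\)[*+?]?\s*>")


def parse_schema(dtd_text):
    """Map each declared element to the element names in its content model."""
    schema = {}
    for name, model in ELEMENT_DECL.findall(dtd_text):
        children = [
            token
            for token in re.findall(r"#?[\w.-]+", model)
            if not token.startswith("#")  # skip keywords such as #PCDATA
        ]
        schema[name] = children
    return schema


def unknown_fields(schema, fields):
    """Return the query fields that the schema does not declare."""
    return [field for field in fields if field not in schema]


schema = parse_schema(DTD)
print(schema["item"])
print(unknown_fields(schema, ["name", "address"]))
```

With the schema in hand, the client can reject a query on `address` before it ever reaches the server, or offer the user only the fields that actually exist.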
| The second problem relates to the split between server-side and client-side processing of information. Suppose a user searches a database and receives thousands of objects in their client-side browser, when they were really looking for a single search hit. Ideally they should then refine the search using the same interface as for the original query. However, this second search operates on the information set that was returned as a result of the first search; information which now resides on the client, not the database server. |
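The refinement scenario can be sketched as follows, assuming for illustration that objects are plain dictionaries and that a query is a set of field and value pairs; the record contents and function names are hypothetical. The point is that one `search` function serves both the original query against the server's data and the refinement against the result set now held on the client.

```python
def matches(record, criteria):
    """True when the record has every field/value pair in the criteria."""
    return all(record.get(field) == value for field, value in criteria.items())


def search(records, criteria):
    """Apply the same query interface to any collection of records."""
    return [record for record in records if matches(record, criteria)]


# Hypothetical objects held by the database server.
server_records = [
    {"name": "apple", "kind": "fruit", "colour": "green"},
    {"name": "cherry", "kind": "fruit", "colour": "red"},
    {"name": "leek", "kind": "vegetable", "colour": "green"},
]

# First search: conceptually runs on the server, results are sent to the client.
first_hits = search(server_records, {"kind": "fruit"})

# Refinement: runs on the client against the results it already holds.
refined = search(first_hits, {"colour": "green"})
print([record["name"] for record in refined])
```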
How to Define Reusable, Distributable Information Objects Using SGML |
| What technology is required if SGML is to be used to define reusable, distributable information objects along the lines outlined above? |
| Does this technology exist today? Not quite, but almost. There are object databases available which meet the first requirement, although they don't use a standard query language. DSSSL could, in theory, provide this language and also meet the requirement for standard semantic descriptions of SGML elements. That leaves the applets, which are well within the capabilities of JAVA. The challenge is simply to build them! |