DBMS, DataBase Management Systems Data Interchange Databases XML | Informix & XML |
| in, out, and shakin’ all about |
| Brown, Paul |
| Paul Brown |
| Chief Plumber |
California INFORMIX Software Inc. Menlo Park USA | INFORMIX Software Inc.,
4100 Bohannon Drive Menlo Park California 94025 USA Phone: 650 926 6300 Fax: 510 628 3951 email: paul.brown@informix.com web site: www.informix.com |
| Biography |
| Abstract |
Database Systems Internet | Introduction |
DBMS, DataBase Management Systems | So technology vendors like INFORMIX are changing. We are making XML an integral part of our products and solutions. Traditionally, INFORMIX has been a leading provider of DBMS software. With more people sharing more information, there is greater demand than ever for our scalable, transactional data management products. Also, one of the basic requirements for web software is flexibility: web sites evolve rapidly, changing their look-and-feel, their content, and the kinds of services they provide. This makes declarative, query-centric interfaces, where a web application can ask and answer ad hoc questions, very useful. |
| But how do we view the combination of XML and DBMSs? And how do we think it ought to be done? |
Java Object-Relational Databases XML | Extensible or object-relational DBMSs |
| The point of this figure is to illustrate how sophisticated modern data management systems can be. It also hints at the necessity of XML in this application. Where do the values in these tables come from? Given the variety of data islands involved (each wholesale supplier and trucking company probably has its own existing management information system, with its own formats and structures for storing data), how can all of this be unified? The answer is XML. |
| The good news is that DBMS extensibility also means much of the plumbing necessary to make XML a reality can now be embedded directly into the DBMS. (This does not mean, of course, either that XML is the only way to talk to an ORDBMS, or that an ORDBMS is the only use for XML!) Over the next couple of pages, we will see how this can be done. |
Parsers XML | Getting XML in |
| The problem with building this kind of system is the number and variety of islands of data involved. XML excels at overcoming this problem. Independently of the DBMS, our B2B site developers can create a set of DTD specifications to describe how information can be communicated. In our example application, such a message sample may look like this: |
|
| The trick, of course, is bridging the gap between the kind of data you see in the message sample, which might come from a variety of sources, and the kind of structure you see in the database tables, where end users answer their questions. |
| One of XML’s strengths lies in the way it employs plain text. Although accessing data within an XML document requires that you first process it, XML’s simple structure makes writing a parser for it a relatively simple programming assignment. Consequently, a variety of commercial-quality parsers are available for free from various sources on the web. Many of these parsers are written in Java. |
| Extensible DBMSs can take Java code and run it natively within the DBMS. Consequently, we are able to embed several free Java XML parsers directly into the framework of our server. The figure below illustrates the general architecture. |
| For large documents and systems with high volumes of information exchange, such an approach has a performance advantage because it avoids the overhead of moving queries and data between an external program and the DBMS. It is also attractive from an ongoing administration and maintenance perspective because the embedded code is not linked into the ORDBMS as it might be with more conventional programs. Extensible DBMSs employ dynamic linking and invocation techniques that make replacing such a module as easy as dropping an empty database table. |
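The parse-and-load step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the INFORMIX implementation: the message format, the `lots` table, and the `load_shipment` function are all hypothetical, Python's standard `xml.etree.ElementTree` parser stands in for an embedded Java parser, and an in-memory SQLite database stands in for the ORDBMS. Unlike the architecture in the paper, this code runs outside the database server.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical message; element and attribute names are invented for illustration.
MESSAGE = """<shipment supplier="Acme Produce">
  <lot id="L-1" product="lettuce" quantity="40" unit="crate"/>
  <lot id="L-2" product="tomato" quantity="25" unit="crate"/>
</shipment>"""


def load_shipment(conn, xml_text):
    """Parse one shipment message and insert one row per <lot> element."""
    root = ET.fromstring(xml_text)
    supplier = root.get("supplier")
    rows = [
        (
            lot.get("id"),
            supplier,
            lot.get("product"),
            int(lot.get("quantity")),
            lot.get("unit"),
        )
        for lot in root.findall("lot")
    ]
    with conn:  # commit on success, roll back if any insert fails
        conn.executemany(
            "INSERT INTO lots (lot_id, supplier, product, quantity, unit) "
            "VALUES (?, ?, ?, ?, ?)",
            rows,
        )
    return len(rows)


# SQLite is only a stand-in for the DBMS in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lots (lot_id TEXT PRIMARY KEY, supplier TEXT, "
    "product TEXT, quantity INTEGER, unit TEXT)"
)
inserted = load_shipment(conn, MESSAGE)
```

Wrapping the inserts in a single transaction means a malformed message leaves the table unchanged, which matters when messages arrive from many independent sources.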
|
| This process is made easier when the overall bundle also includes: |
| SGML Markup | Getting XML out |
| XML is a derivative of SGML. So is HTML. For some time, DBMS vendors have been providing tools that can take the results of a SQL query and return them marked up with HTML tags. It is a fairly straightforward engineering assignment to rework these tools to handle XML too. |
| Web development tools like these rely on the fact that query results consist of a set of named columns. Data in an ORDBMS’s columns can be of a compound form (a single column containing multiple elements) or a COLLECTION (a single row/column data object consisting of a set of values). Fortunately, both of these novelties can be easily married to the XML data model. |
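One way the mapping from named columns to XML might work is sketched below. The function names, the `result`/`row`/`item` element names, and the sample row are assumptions made for illustration; a Python `dict` stands in for a compound column value and a `list` for a COLLECTION.

```python
import xml.etree.ElementTree as ET


def append_value(parent, name, value):
    """Append `value` under `parent` as an element called `name`."""
    elem = ET.SubElement(parent, name)
    if isinstance(value, dict):
        # Compound value: one child element per named field.
        for field, field_value in value.items():
            append_value(elem, field, field_value)
    elif isinstance(value, (list, tuple)):
        # Collection: one <item> child per member, in order.
        for item in value:
            append_value(elem, "item", item)
    elif value is not None:
        elem.text = str(value)
    return elem


def rows_to_xml(columns, rows, root_tag="result", row_tag="row"):
    """Mark up a query result (column names plus row tuples) as XML text."""
    root = ET.Element(root_tag)
    for row in rows:
        row_elem = ET.SubElement(root, row_tag)
        for name, value in zip(columns, row):
            append_value(row_elem, name, value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical query result with a compound column and a collection column.
columns = ["product", "location", "carriers"]
rows = [("lettuce", {"city": "Salinas", "state": "CA"}, ["TruckCo", "HaulFast"])]
xml_out = rows_to_xml(columns, rows)
```

Because both compound values and collections nest naturally as child elements, the same recursive function handles them without any special cases in the row loop.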
| In terms of our islands of data, getting XML out of the database makes it possible to encapsulate the functionality of a central server like the one we use in our examples. Other systems wishing to exchange information with it can send in their contributions in XML form, and receive responses in XML. An overall architecture that adopted this kind of approach might look like the following figure. |
| In this figure we see how multiple, heterogeneous islands of data can share their information so that each business operates more efficiently. Subsets of the information in systems developed by trucking companies and food wholesalers are exchanged (using XML) with a central B2B service. Using this service, other businesses can bid for allotments of perishable goods, making their valuation decisions not merely on the quality of the goods on offer, but also on their geographic location and on whether or not they can be delivered. |
| In this example, we see XML being used both to get the data into the central store, and to get information out of the central store and back into each external information system. |
|
Document Object Models XQL XSL | Shakin’ it all about |
| Most data management companies and many web applications will adopt this kind of model. But it is not suitable for every kind of XML. Another potential use for XML is in document exchange. In this problem domain, XML data usually exhibits much less structure than in the kind of scenario we envisioned earlier. Nevertheless, it is still highly desirable to store the XML data in a transactional system, and then to allow external users to interact with it: to query it, read it, and so on. In other words, in addition to getting it in, and getting it out, any complete XML story needs to deal also with shakin’ it all about. |
| Ultimately, an XML document can be completely unstructured but ‘marked up’. In this kind of document, key words or phrases are tagged with a label that conveys semantic information. Sometimes these tags are indications for a user-interface program, but sometimes users want to ask questions about the contents of such documents. For example, they may want to say “Show me documents in the repository where the word ‘Paris’ is tagged up as a ‘destination’.” |
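A question like the ‘Paris’ as ‘destination’ example can be answered from an index keyed on (tag, word) pairs. The sketch below is a minimal in-memory illustration; the sample documents, the tag names other than `destination`, and the functions `build_tag_index` and `find` are hypothetical. For brevity it indexes only the text that immediately follows each opening tag and ignores attributes and text that trails a closing tag.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical marked-up documents, keyed by a document identifier.
DOCUMENTS = {
    "memo-1": "<memo>Flight to <destination>Paris</destination> leaves from <origin>London</origin>.</memo>",
    "memo-2": "<memo>The <person>Paris</person> file is due back from <destination>Rome</destination>.</memo>",
    "memo-3": "<memo>Return leg: <origin>Rome</origin> to <destination>Paris</destination>.</memo>",
}


def build_tag_index(documents):
    """Map each (tag, lower-cased word) pair to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, xml_text in documents.items():
        for elem in ET.fromstring(xml_text).iter():
            for raw_word in (elem.text or "").split():
                word = raw_word.strip(".,;:!?").lower()
                if word:
                    index[(elem.tag, word)].add(doc_id)
    return index


def find(index, tag, word):
    """Return the sorted ids of documents where `word` appears tagged as `tag`."""
    return sorted(index.get((tag, word.lower()), set()))


index = build_tag_index(DOCUMENTS)
matches = find(index, "destination", "Paris")
```

Note that `memo-2` also contains the word ‘Paris’, but tagged as a person, so a plain full-text search would return it while the tag-aware lookup does not.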
| The appropriate way to manage this kind of document data is with techniques such as document indexing and query by document content. Object-relational DBMSs can be extended with this kind of functionality too. Whether or not you ultimately use a DBMS to store the data, an ORDBMS can play an invaluable role as an index and scalable subject catalog. |
| Alternatively, in the absence of a style sheet or DTD, it might be desirable to shred an XML document. Shredding involves parsing the XML and assigning the values in its elements to corresponding rows in a table. An obvious challenge with such a strategy is how you maintain the XML document’s original structure within the Object-Relational model. |
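One common way to keep the original structure when shredding is to store each element as a row that records its parent and its position among its siblings. The sketch below assumes that approach; it is not taken from the paper. The `nodes` table, the `shred` function, and the sample document are hypothetical, SQLite stands in for the ORDBMS, and attributes and trailing text are omitted for brevity.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(conn, doc_id, xml_text):
    """Store each element of the document as one row of the `nodes` table."""
    counter = 0

    def visit(elem, parent_id, position):
        nonlocal counter
        counter += 1
        node_id = counter  # ids are assigned in document order
        conn.execute(
            "INSERT INTO nodes VALUES (?, ?, ?, ?, ?, ?)",
            (doc_id, node_id, parent_id, position, elem.tag,
             (elem.text or "").strip()),
        )
        # parent_id and position together preserve nesting and sibling order.
        for child_position, child in enumerate(elem):
            visit(child, node_id, child_position)

    visit(ET.fromstring(xml_text), None, 0)
    return counter


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (doc_id TEXT, node_id INTEGER, parent_id INTEGER, "
    "position INTEGER, tag TEXT, text TEXT, PRIMARY KEY (doc_id, node_id))"
)
count = shred(
    conn,
    "order-1",
    "<order><item><name>lettuce</name><qty>40</qty></item>"
    "<item><name>tomato</name><qty>25</qty></item></order>",
)

# Children of the first <item> (node 2), in their original order.
first_item = conn.execute(
    "SELECT tag, text FROM nodes WHERE doc_id = 'order-1' AND parent_id = 2 "
    "ORDER BY position"
).fetchall()
```

With the parent and position columns in place, the document can be rebuilt by walking the rows from the root, at the cost of one self-join per level of nesting when querying.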
Summary and conclusions |
| In this paper we have explored how XML and extensible or Object-Relational DBMS technology complement one another. In the short term, the importance and usefulness of XML in building web applications is as a data interchange format, enabling information exchange between islands of data. But using XML efficiently requires changes to how database management systems are built, and developers wishing to build effective web applications would do well to use object-relational DBMSs somewhat differently from how they used relational DBMSs in the past. |
| The key points of this paper are: |
|
| In summary, the three things you need to support XML are the capacity to get XML data into your database, get XML out when an external system requires it, and shake XML all about when you need to store it. |