XML data processing and Relational Database Systems |
| Dr. med. Noelle Guido |
| Managing Director |
| MED medicine online GmbH
Friedrich-Ebert-Strasse Bergisch Gladbach NRW Germany 51429 Phone: +49 2204 8437 30 Fax: +49 2204 8437 31 Email: noelle@medicineonline.de Web: www.medicineonline.de |
Biographical notice: |
ABSTRACT: |
Introduction |
| Large XML Files |
While XML is mostly discussed only as a data interchange format, we think that XML will grow into a storage and object format. XML is especially suitable for working with unstructured data, and thus for medical data. |
Whether this is an effect of beta-version software or of underpowered hardware, we think that we have to develop database processing methods to work with large XML files. |
Technical Environment |
| Microsoft Platform |
We work entirely on a Microsoft-based platform: MS Windows NT with MS BackOffice as the server solution, and MS Internet Explorer 4 or 5β on the clients. We use client- and server-side scripting, the XML DOM parser from Microsoft and, for database access, the Microsoft ADO extensions via OLE DB/ODBC. |
Possible Solutions |
File storage |
XML as a file format is therefore not suitable for working with large amounts of data. Poor performance, missing security features and locking problems, for example with write access in a multi-user environment, force the use of databases (Figure NOE-005). |
Figure NOE-005: Storage of XML in a file
| XML in a relational database |
Database Storage |
Storage of single element-values and attributes in a table |
One possibility is to store each XML item in its own table row. The parser has to split the XML document into rows for the database, or combine the individual elements from the database into a complete XML document (Figure NOE-008). The ID field must therefore contain information about the element level so that the document can be rebuilt properly. For large documents this procedure also costs performance and time. The table grows very quickly and contains a large number of rows. The benefit of this method is that each item and element can be identified with ordinary SQL (Structured Query Language) statements. |
Figure NOE-008: One element - one table row
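The one-element-per-row approach can be sketched roughly as follows. This is a minimal illustration, not our production code: Python's built-in sqlite3 and ElementTree modules stand in for the ADO/ODBC database and the Microsoft DOM parser, and the table name xml_item, its columns, and the sample patient document are all assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative.
DOC = """<patients>
  <patient id="p1"><name>Smith</name><diagnosis>C50</diagnosis></patient>
  <patient id="p2"><name>Jones</name><diagnosis>C61</diagnosis></patient>
</patients>"""


def shred(conn, root):
    """Store every element and every attribute as one table row."""
    conn.execute(
        "CREATE TABLE xml_item ("
        " id INTEGER PRIMARY KEY, parent_id INTEGER, level INTEGER,"
        " kind TEXT, name TEXT, value TEXT)"
    )
    counter = 0

    def walk(elem, parent_id, level):
        nonlocal counter
        counter += 1
        my_id = counter
        text = (elem.text or "").strip()
        conn.execute("INSERT INTO xml_item VALUES (?, ?, ?, ?, ?, ?)",
                     (my_id, parent_id, level, "element", elem.tag, text))
        for attr, val in elem.attrib.items():
            counter += 1
            conn.execute("INSERT INTO xml_item VALUES (?, ?, ?, ?, ?, ?)",
                         (counter, my_id, level + 1, "attribute", attr, val))
        for child in elem:
            walk(child, my_id, level + 1)

    walk(root, None, 0)


def rebuild(conn, item_id=1):
    """Recombine the rows below item_id into an element tree.

    Simplification: mixed content (text after a child element) is not stored.
    """
    name, value = conn.execute(
        "SELECT name, value FROM xml_item WHERE id = ?", (item_id,)).fetchone()
    elem = ET.Element(name)
    elem.text = value or None
    rows = conn.execute(
        "SELECT id, kind, name, value FROM xml_item"
        " WHERE parent_id = ? ORDER BY id", (item_id,)).fetchall()
    for child_id, kind, child_name, child_value in rows:
        if kind == "attribute":
            elem.set(child_name, child_value)
        else:
            elem.append(rebuild(conn, child_id))
    return elem


conn = sqlite3.connect(":memory:")
shred(conn, ET.fromstring(DOC))

# Every item is reachable with an ordinary SQL statement.
row_count = conn.execute("SELECT COUNT(*) FROM xml_item").fetchone()[0]
names = [r[0] for r in conn.execute(
    "SELECT value FROM xml_item"
    " WHERE kind = 'element' AND name = 'name' ORDER BY id")]
rebuilt = ET.tostring(rebuild(conn), encoding="unicode")
```

Even this two-patient document produces nine rows, and rebuilding it issues one query per element, which illustrates why the table grows quickly and why reassembly becomes expensive for large documents.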
Storage of complete nodes in a table |
Another possibility is to split the XML document into node fragments and store these in a table. For example, in an XML document that contains all patients in a hospital, you would put each patient in one record. The XML fragment is stored in a blob field of the table. The records are identified by an ID that serves as a unique primary index (Figure NOE-010). The performance is much better because the parser has little work to do. On the other hand, data retrieval with SQL on blob fields is possible but slow. A better solution to this problem is to build meta index structures in a second table, as described below. |
Figure NOE-010: XML fragments in a table row
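A minimal sketch of the fragment approach, again with Python's sqlite3 and ElementTree as stand-ins for our actual platform. The table name xml_fragment, the helper names and the sample data are assumptions for illustration only.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """<patients>
  <patient id="p1"><name>Smith</name><diagnosis>C50</diagnosis></patient>
  <patient id="p2"><name>Jones</name><diagnosis>C61</diagnosis></patient>
</patients>"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xml_fragment (id INTEGER PRIMARY KEY, fragment BLOB)")

# One record per <patient> node; the fragment is stored as serialized bytes.
for patient in ET.fromstring(DOC).findall("patient"):
    blob = ET.tostring(patient, encoding="unicode").strip().encode("utf-8")
    conn.execute("INSERT INTO xml_fragment (fragment) VALUES (?)", (blob,))


def load_patient(conn, record_id):
    """Fetch one record by primary key and parse only that fragment."""
    row = conn.execute(
        "SELECT fragment FROM xml_fragment WHERE id = ?",
        (record_id,)).fetchone()
    return ET.fromstring(row[0])


def scan_for_diagnosis(conn, code):
    """Search inside the blobs: every fragment must be read and parsed."""
    hits = []
    rows = conn.execute(
        "SELECT id, fragment FROM xml_fragment ORDER BY id").fetchall()
    for record_id, blob in rows:
        if ET.fromstring(blob).findtext("diagnosis") == code:
            hits.append(record_id)
    return hits


record_count = conn.execute(
    "SELECT COUNT(*) FROM xml_fragment").fetchone()[0]
second = load_patient(conn, 2)
c50_records = scan_for_diagnosis(conn, "C50")
```

Access by ID touches a single row, whereas the content search has to visit every record, which is the weakness that motivates a separate index table.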
Building meta index tables |
| meta index table |
In a second table, called the meta index table, we store information about the element or attribute values of interest. The column Index Name holds the element name, and the column Index Value holds the value of the element or attribute that we want to search for. The "REF-to-ID" column contains a reference, or link, to the row in the XML table that holds the XML fragment we are looking for. With a simple SQL statement we are thus able to locate the XML fragments of interest (Figure NOE-012). |
Figure NOE-012: SQL queries over a meta-index table
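The meta index idea might look like the following sketch. The column names index_name, index_value and ref_to_id mirror the columns described above, but the table names, the choice of indexed elements and the sample fragments are assumptions; sqlite3 and a TEXT column are used here only to keep the example self-contained.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical patient fragments, one per row of the XML table.
FRAGMENTS = [
    '<patient id="p1"><name>Smith</name><diagnosis>C50</diagnosis></patient>',
    '<patient id="p2"><name>Jones</name><diagnosis>C61</diagnosis></patient>',
    '<patient id="p3"><name>Brown</name><diagnosis>C50</diagnosis></patient>',
]
# Assumed selection of elements worth indexing.
INDEXED = ("name", "diagnosis")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xml_fragment (id INTEGER PRIMARY KEY, fragment TEXT)")
conn.execute(
    "CREATE TABLE meta_index ("
    " index_name TEXT, index_value TEXT,"
    " ref_to_id INTEGER REFERENCES xml_fragment(id))")
conn.execute(
    "CREATE INDEX meta_lookup ON meta_index (index_name, index_value)")

# The fragment is parsed once, at insert time, to fill the meta index.
for fragment in FRAGMENTS:
    cur = conn.execute(
        "INSERT INTO xml_fragment (fragment) VALUES (?)", (fragment,))
    elem = ET.fromstring(fragment)
    for tag in INDEXED:
        conn.execute("INSERT INTO meta_index VALUES (?, ?, ?)",
                     (tag, elem.findtext(tag), cur.lastrowid))


def find_fragments(conn, index_name, index_value):
    """Locate fragments through the meta index, without parsing any XML."""
    rows = conn.execute(
        "SELECT f.fragment FROM meta_index AS m"
        " JOIN xml_fragment AS f ON f.id = m.ref_to_id"
        " WHERE m.index_name = ? AND m.index_value = ?"
        " ORDER BY f.id", (index_name, index_value))
    return [r[0] for r in rows]


hits = find_fragments(conn, "diagnosis", "C50")
```

The cost of parsing moves from query time to insert time, so a search only reads the small index rows and the matching fragments.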
Normalization |
In more complex data structures you have to distribute the information among different tables and cross-reference it by means of special XML attributes such as ID and REF. For example, we build one table with patient information (name, address, ...) and another table with diagnosis data (Figure NOE-014). In contrast to "normal" SQL statements with joins, here we have to define our SQL statements step by step. |
Figure NOE-014: Referencing tables
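One possible reading of the step-by-step procedure is sketched below. It assumes that the ID attribute of a patient is only available inside the stored XML fragment, so the application has to run one query, parse the result, and then run a second query with the extracted value. The table layout, the patient_ref column and the sample data are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (id INTEGER PRIMARY KEY, fragment TEXT)")
# patient_ref is an assumed column holding a copy of the REF attribute.
conn.execute(
    "CREATE TABLE diagnosis ("
    " id INTEGER PRIMARY KEY, patient_ref TEXT, fragment TEXT)")

conn.executemany("INSERT INTO patient (fragment) VALUES (?)", [
    ('<patient ID="p1"><name>Smith</name></patient>',),
    ('<patient ID="p2"><name>Jones</name></patient>',),
])
conn.executemany(
    "INSERT INTO diagnosis (patient_ref, fragment) VALUES (?, ?)", [
        ("p1", '<diagnosis REF="p1"><code>C50</code></diagnosis>'),
        ("p2", '<diagnosis REF="p2"><code>C61</code></diagnosis>'),
        ("p1", '<diagnosis REF="p1"><code>C77</code></diagnosis>'),
    ])


def diagnoses_for(conn, patient_row_id):
    """Resolve the ID/REF cross-reference in two separate SQL steps."""
    # Step 1: fetch the patient fragment and read its XML ID attribute.
    row = conn.execute(
        "SELECT fragment FROM patient WHERE id = ?",
        (patient_row_id,)).fetchone()
    xml_id = ET.fromstring(row[0]).get("ID")

    # Step 2: use the extracted ID to select the referencing diagnoses.
    rows = conn.execute(
        "SELECT fragment FROM diagnosis WHERE patient_ref = ? ORDER BY id",
        (xml_id,)).fetchall()
    return [ET.fromstring(fragment).findtext("code") for (fragment,) in rows]


smith_codes = diagnoses_for(conn, 1)
jones_codes = diagnoses_for(conn, 2)
```

Because the link value is produced by the application between the two statements, it cannot be expressed as a single join in this layout.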
Next Steps |
| eXot - extensible organ specific tumor documentation |
There are still some problems that we want to solve in the near future. In our eXot project (Figure NOE-016), an organ-specific tumor documentation system in XML, we are currently evaluating XML on each layer of our application design. We have built prototypes for dynamic visualization of a reference information model in XML or in a relational database system, and for dynamic queries with XSL (Extensible Stylesheet Language) on XML data in databases. We distribute information with server-to-server communication servlets, and we are designing an XML-based form modeller that allows XML-based input forms to be defined in an HTML page. eXot provides an HTML framework with global function definitions in JavaScript; the organ-specific input form is written in XML and loaded dynamically into the framework. Our findings so far make us optimistic that creative solutions can be developed for future problems in XML. |
Figure NOE-016: eXot - eXtensible Organ-specific Tumor Documentation in XML
Conclusions |
| XML - the better alternative |
We think that in the near future XML will become more and more important for creating dynamic user interfaces, managing application business logic and distributed data storage. Especially for the growing requirements of computer-assisted managed care applications, quality assurance programs and medical documentation, XML is already not merely suitable but the better alternative. Nevertheless, XML will not replace classical tools such as relational database systems, but will make use of them. |
Acknowledgments |
The author would like to thank Prof. Dr. Dudeck and his team for their commitment to establishing XML in healthcare. I hope that in the future they can fill even more people with enthusiasm for working with XML. |