| Agreement, B2B transactions, B2B (business-to-business), Schema, Standards, Vocabulary, XML, XML Schema, XSDL, XSL, architecture, case studies, data-centric development, data-oriented development, eBusiness, organization, skills | Using XML effectively in eBusiness architectures |
| Ron Bodkin |
| Chief Technology Officer |
| C-bridge Internet Solutions, 219 Vassar St., Cambridge, Massachusetts 02139, USA. Phone: (617) 497-1707. Fax: (617) 528-1790. Email: rjbodkin@c-bridge.com. Web site: www.c-bridge.com |
| Biography |
| Abstract |
Introduction |
Case studies |
Dealer extranet |
|
| The message formats were defined using XML DTDs. The DTDs were based on Ariba’s cXML 1.0 specification and extended to capture additional required data items and message types. |
| The application processes incoming messages with a factory that uses a DOM parser to convert specific DTD types into specific object types. and below illustrate the approach used. The application generates outgoing messages by creating objects of the given type and having them emit an XML representation. The code that maps between XML and Java objects is hand-written. |
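The factory approach described here can be sketched roughly as follows. This is a minimal illustration, not the application's actual code: the `Message`, `OrderRequest`, `StatusRequest`, and `MessageFactory` names, and the `orderID` attribute, are hypothetical and much simpler than real cXML messages.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical message types; the paper does not list the application's real classes.
interface Message {
    String toXml();
}

final class OrderRequest implements Message {
    final String orderId;
    OrderRequest(String orderId) { this.orderId = orderId; }
    // Emits the object's own XML form (this sketch does not escape special characters).
    public String toXml() { return "<OrderRequest orderID=\"" + orderId + "\"/>"; }
}

final class StatusRequest implements Message {
    final String orderId;
    StatusRequest(String orderId) { this.orderId = orderId; }
    public String toXml() { return "<StatusRequest orderID=\"" + orderId + "\"/>"; }
}

final class MessageFactory {
    // Parses the message with a DOM parser and picks the object type from the root element.
    static Message fromXml(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse message", e);
        }
        switch (root.getTagName()) {
            case "OrderRequest":
                return new OrderRequest(root.getAttribute("orderID"));
            case "StatusRequest":
                return new StatusRequest(root.getAttribute("orderID"));
            default:
                throw new IllegalArgumentException("Unsupported message type: " + root.getTagName());
        }
    }

    public static void main(String[] args) {
        Message incoming = fromXml("<OrderRequest orderID=\"42\"/>");
        System.out.println(incoming.getClass().getSimpleName() + " -> " + incoming.toXml());
    }
}
```

Keeping the XML handling inside the factory and the `toXml` methods means the rest of the application sees only ordinary objects, which matches the "XML at the edge" design described for this case.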
|
| This application uses XML for messaging, but only at the boundary of communication with other systems; that is, it uses XML at the edge of its architecture. This allows the application to take advantage of the benefits of standards-based supplier integration, with minimal impact on the existing architecture and limited requirements for new skills. |
Software exchange |
| The second case is a software exchange application that C-bridge created with a dot-com client. This system allows users to publish, locate, and acquire software components. It also allows communities of interested users to share information, and it provides news for them. Both publishers and users of software components use this system. The client’s business has challenging requirements for integrating third-party capabilities, for customization and adaptability, and for leveraging XML information sources. |
| The software exchange uses XML to |
| The most important design deliverable was the logical data model, since it was used to determine the underlying data storage and service interfaces. This model was captured in XML Schema (the draft standard). XML-oriented services, object designs, and a UI prototype were also important design deliverables to specify the system. This application has a great deal of persistent data, and the schema for the logical model was used to determine how data is stored in a relational database and in an LDAP directory. Using XML schemas to derive the database design represents a significant shift from traditional application development, where a logical relational model is used to determine the application data model. The schemas were also used to determine the data interfaces for internal services. |
| The application architecture is illustrated in . Highlights include |
|
| By contrast with the first case study, this application uses XML intensively throughout. This placed an increased reliance on evolving standards and on less mature tools, but it enabled the application to achieve significant benefits and laid a foundation for the future. |
Architectural considerations |
This section assesses the impact of XML throughout eBusiness application architectures, using the three-tier architectural pattern as our frame of reference. We first describe how XML impacts a given part of the architecture, and then discuss the impact on the case studies in more detail.
|
Vocabularies |
There are several schema languages for representing the structure of XML data. Today, the most commonly used format is the XML DTD. The XML Schema Definition Language (XSDL) is a public working draft of the XML Schema Working Group of the W3C that is not expected to change significantly when released ( http://www.w3.org/TR/xmlschema-1/ ). XSDL provides significant benefits over DTDs, such as much better data type support, better modularity, refinement of types, and the use of XML to define types. It is generally believed that XSDL will quickly become the standard representation format for XML documents when the W3C publishes it, because of these benefits and the momentum behind the standard. For example, the Gartner Group recommends that organizations “plan to use XML schema (when approved by the W3C)”.
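To make the data type benefit concrete, here is a small sketch that validates a document against a schema using the standard Java validation API. It assumes the schema namespace of the final W3C Recommendation (`http://www.w3.org/2001/XMLSchema`), which differs from the working draft discussed in this paper; the `Line` element and its attributes are hypothetical. A DTD could only declare `quantity` as character data, whereas the schema can require a positive integer.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class TypedValidation {
    // Hypothetical schema: an order line whose quantity must be a positive integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='Line'>"
        + "<xs:complexType>"
        + "<xs:attribute name='sku' type='xs:string' use='required'/>"
        + "<xs:attribute name='quantity' type='xs:positiveInteger' use='required'/>"
        + "</xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    // Compiles the schema; a failure here means the sketch itself is wrong.
    static Schema loadSchema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema could not be compiled", e);
        }
    }

    // Returns true when the document conforms to the schema, false when it does not.
    static boolean isValid(String xml) {
        try {
            loadSchema().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // validation error or malformed document
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<Line sku='3307' quantity='200'/>"));
        System.out.println(isValid("<Line sku='3307' quantity='lots'/>"));
    }
}
```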
|
|
| Applications that use XML can take advantage of this standardized data by using a variety of standard schemas to capture key concepts. However, these applications need to deal with the complexities of schemas evolving, managing overlapping schemas and integrating with partners that use different schemas to represent the same business concepts. While there is a great deal of activity in creating domain vocabularies, there is still a lot of work to be done and in many domains there is not yet a clear standard to use. |
| During the transition period to XSDL, application architectures will generally use DTD or draft XSDL to represent data. Using DTD allows use of more existing domain schemas and has the widest range of tool support. It can be augmented with tools that capture additional information (e.g., those that support Datatypes for DTD, http://www.w3.org/TR/2000/NOTE-dt4dtd-20000113 ). Today, many domain vocabularies are being represented in DTD, XSDL, and in other schema formats (which predate XSDL and which C-bridge expects to be superseded by it). |
| Using XSDL allows immediate benefits, prepares for the future (minimizing the need for rework), and allows incorporation of future schema standards. Describing proposed standard vocabularies in draft XSDL is an important step toward better specifications, as well as preparing for the future. While fewer tools support draft XSDL today, this is changing quickly; both validating parsers (e.g., Apache Xerces) and tools to convert between draft XSDL and DTDs (e.g., Extensibility) currently exist. For more information on the use of XSDL, DTD, or other formats see also , , and . |
| It is instructive to consider some examples of whether to use DTD or XSDL to represent data. The Dealer Extranet uses DTD because it was completed in 1999 when XSDL was still undergoing significant changes. It also uses DTD because the limited use of XML reduced the benefits of using XSDL. The importance of working with a wide variety of partner companies and their different technical infrastructures made it critical to support DTD so partners could use the widest array of tools for integration. |
| The Software Exchange uses XSDL because it uses XML intensively throughout the architecture, it benefits greatly from more precise and modular specifications, and it would be costly and risky to migrate later from XML DTDs to XSDL. The Software Exchange incorporates some custom code for handling schemas, in addition to using a validating parser. |
| Both example applications use standard vocabularies for part of their data representation (e.g., cXML, DSIG, vCard). However, additional information was modeled without a standard vocabulary, and the parts that did use standard vocabularies needed to extend and adapt them for their specific business problems. |
| There are important disciplines for selecting schemas, as well as for extending or creating them, much like the disciplines for buying and extending or building software components. The Gartner Group provides these guidelines: |
|
| Forrester Research advises |
|
| There are a number of good sources for information on effective DTD and Schema design: http://wdvl.com/Authoring/Languages/XML/Schema.html provides a number of good links on designing Schemas; see also , and . The emerging field of XML Design Patterns has devoted significant attention to this area. For example, see "Introduction to XML Design Patterns" at http://www.groveware.com/xmlbook/patterns.html , and http://www.xmlpatterns.com/ . |
Data tier |
| The data tier in a three-tier architecture is responsible for access to and storage of persistent data. There are a number of different approaches to persistent storage of XML information: |
| Key questions that determine the best approach include the size, complexity, and type of information (e.g., is it data or a document), the frequency and type of access (e.g., is it frequently updated, or mostly read-only), and the existing storage formats and mechanisms. |
| Important additional sources of data include existing content stores, transactional systems, and partner Web-based systems. There are a great number of converters and adapters that allow these existing assets to be integrated as XML data and to be communicated with using XML APIs. Newer systems are building native XML interfaces. |
| Relational databases are a frequently used approach. Relational databases and XML documents have different data type systems. This discrepancy is referred to as impedance mismatch, a term that was first used to describe the same type of mismatch between relational databases and OOP objects (see and . The difference in type systems makes it important to design a strategy for mapping between the two representations. |
| and discuss different approaches to storing data, the trade-offs among them, and best practices for working with them. |
| We will explore some of the issues by considering our case studies. Both applications use a relational database to store the majority of their data, and needed a strategy for mapping between XML and the relational database. To demonstrate the different approaches, we present a simplified example of a sample order document (from above). |
| In the Dealer Extranet, the database is represented in a conventional relational form, and the data in an XML document is broken out into appropriate tables, which can be joined together to represent a complete XML entity. This approach is illustrated in and . This approach was taken because: |
| Accordingly, the time required to translate between an XML format and database tables was not a problem for this application. |
| OrderID | Date | ShipToLocationID |
| 1 | 5-Aug-2000 | 5 |
| ... | | |

| OrderID | SequenceNbr | Quantity | ProductSKU |
| 1 | 1 | 200 | 3307 |
| 1 | 2 | 30 | 1205 |
| ... | | | |
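The Dealer Extranet style of mapping, in which one XML document is broken out into rows for several tables, might look roughly like the sketch below. The XML layout (`Order`, `Line`, and their attributes) is an assumption made for illustration, since the sample order document is not reproduced here, and the rows are returned as plain string arrays rather than written to a database.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Rows destined for the two tables shown above.
final class ShreddedOrder {
    final String[] orderRow;                            // OrderID, Date, ShipToLocationID
    final List<String[]> lineRows = new ArrayList<>();  // OrderID, SequenceNbr, Quantity, ProductSKU
    ShreddedOrder(String[] orderRow) { this.orderRow = orderRow; }
}

final class OrderShredder {
    // Splits one order document into a header row and one row per order line.
    static ShreddedOrder shred(String xml) {
        Element order;
        try {
            order = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse order", e);
        }
        String orderId = order.getAttribute("id");
        ShreddedOrder result = new ShreddedOrder(new String[] {
            orderId, order.getAttribute("date"), order.getAttribute("shipTo")});
        NodeList lines = order.getElementsByTagName("Line");
        for (int i = 0; i < lines.getLength(); i++) {
            Element line = (Element) lines.item(i);
            // The sequence number is derived from document order, starting at 1.
            result.lineRows.add(new String[] {
                orderId, String.valueOf(i + 1),
                line.getAttribute("quantity"), line.getAttribute("sku")});
        }
        return result;
    }

    public static void main(String[] args) {
        ShreddedOrder s = shred("<Order id='1' date='5-Aug-2000' shipTo='5'>"
                + "<Line quantity='200' sku='3307'/><Line quantity='30' sku='1205'/></Order>");
        System.out.println(String.join(", ", s.orderRow));
        for (String[] row : s.lineRows) {
            System.out.println(String.join(", ", row));
        }
    }
}
```

Rebuilding the document is the reverse operation: join the two tables on `OrderID` and emit the lines in `SequenceNbr` order.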
In the Software Exchange, the database is represented in an XML-specific relational form, whereby the data in an XML document is stored as text
|
|
|
Functionality tier |
| The functionality tier is responsible for all the business logic, routing, and integration required to convert between application-focused data and core system data. Applications that process XML can use object-oriented, data-oriented, or a mixture of development approaches. |
| Processing XML in an object-oriented fashion involves encapsulating XML documents through native objects, which are specific to the problem domain. This provides the benefits of OO programming, which are compelling for applications that are doing complex or intensive computation (encapsulating data, flexibility, reuse of methods, proven object design patterns). |
| While there are some similarities between XML documents and objects, there is still an impedance mismatch, just as there is between relational databases and objects. This raises the implementation question of how to map between XML and objects. A typical approach is to map simple XML elements and all XML attributes into object attributes and to map complex elements into objects. As with other mapping problems, there is more complexity involved when mapping associations between items. Key issues include determining when to aggregate elements or when to use references, and how to represent references between items. |
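As an illustration of the typical mapping just described, the sketch below maps an attribute and a simple element to fields and a complex element to a nested object. The class names and the XML layout are hypothetical, and references between items (the harder part of the mapping problem) are deliberately left out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Complex element <ShipTo> becomes its own object.
final class Address {
    final String city;
    final String country;
    Address(String city, String country) { this.city = city; this.country = country; }
}

// Attribute "id" and simple element <Date> become plain fields.
final class Order {
    final String id;
    final String date;
    final Address shipTo;
    Order(String id, String date, Address shipTo) { this.id = id; this.date = date; this.shipTo = shipTo; }
}

final class OrderMapper {
    static Order fromXml(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse order", e);
        }
        Element shipTo = child(root, "ShipTo");
        Address address = new Address(text(shipTo, "City"), text(shipTo, "Country"));
        return new Order(root.getAttribute("id"), text(root, "Date"), address);
    }

    // Returns the first direct child element with the given name.
    private static Element child(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && ((Element) n).getTagName().equals(name)) {
                return (Element) n;
            }
        }
        throw new IllegalArgumentException("Missing element: " + name);
    }

    // Returns the trimmed text of a direct child element.
    private static String text(Element parent, String name) {
        return child(parent, name).getTextContent().trim();
    }

    public static void main(String[] args) {
        Order o = fromXml("<Order id='1'><Date>5-Aug-2000</Date>"
                + "<ShipTo><City>Cambridge</City><Country>USA</Country></ShipTo></Order>");
        System.out.println(o.id + " " + o.date + " " + o.shipTo.city);
    }
}
```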
| XML makes the use of data-oriented development a viable alternative. There are a number of different data-oriented approaches: |
| Data-oriented development allows straightforward, principled manipulation of data. There are a number of data-oriented techniques that can be applied to the problem at hand. Many parts of an eBusiness application are fundamentally responsible for data flow without complex processing: retrieving, transforming, displaying, validating, and posting data. In these cases, data-oriented development offers significant benefits. Indeed, the continued popularity of 4GL development environments for IS applications can be attributed to the benefits of data-oriented development. However, conventional 4GLs suffered from the simplicity of modeling data as a relational database result set. By contrast, XML data-oriented development benefits from modeling data with a rich structure. |
| Data-oriented development for XML is a new approach, and when assessing the different techniques, it is important to consider the maturity, performance, and existence of available tools (including tools for supporting tasks like editing and debugging). Likewise, it is important to consider developer skills. Understanding XML data and how to operate on it is likely to become a common skill for developers; as this occurs, there will be a benefit from increased productivity. However, there is an investment involved in data-oriented development, unlike encapsulating XML in native objects, which lets developers operate using known techniques and built-in language facilities. |
| XSLT is sometimes viewed as a universal tool to solve XML processing problems. C-bridge believes that XSLT is appropriate for a subset of the problems that are best solved with data-centric approaches, where structural pattern matching is helpful. XSLT is neither appropriate for natural language translation, nor for complex algorithmic transformations. provides notes on when to use and not to use XSLT and on the two basic styles for XSLT transformations: push and pull. |
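A small example of the kind of structural pattern matching where XSLT fits well is sketched below, run through the standard Java transformation API. The stylesheet is written in the push style: `xsl:apply-templates` pushes nodes to whichever template matches them, where a pull-style stylesheet would instead select the nodes explicitly (for example with `xsl:for-each`). The `Order` and `Line` names are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class PushStyleTransform {
    // Push style: each template matches one kind of node and delegates its children.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='Order'>"
        + "<ul><xsl:apply-templates select='Line'/></ul>"
        + "</xsl:template>"
        + "<xsl:template match='Line'>"
        + "<li><xsl:value-of select='@quantity'/> x <xsl:value-of select='@sku'/></li>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Transforms an order document into an HTML-style list fragment.
    static String toHtmlList(String orderXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(orderXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtmlList(
            "<Order><Line quantity='200' sku='3307'/><Line quantity='30' sku='1205'/></Order>"));
    }
}
```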
| DOM development allows more flexibility in the data structures that can be handled than encapsulating XML in native objects. However, this flexibility reduces type checking, which can make development and testing harder. |
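The trade-off can be seen in a short sketch. The hypothetical helper below works on any document, because element and attribute names are passed as strings, but for the same reason the compiler cannot catch a misspelled name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DomTotals {
    // Sums an integer attribute over every element with the given name, anywhere in the document.
    static int sumAttribute(String xml, String elementName, String attributeName) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
        NodeList nodes = doc.getElementsByTagName(elementName);
        int total = 0;
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            // A missing attribute yields "", so parseInt fails only at run time.
            total += Integer.parseInt(element.getAttribute(attributeName));
        }
        return total;
    }

    public static void main(String[] args) {
        String xml = "<Order><Line quantity='200'/><Line quantity='30'/></Order>";
        System.out.println(sumAttribute(xml, "Line", "quantity"));
        // A misspelled element name compiles and runs, but silently matches nothing.
        System.out.println(sumAttribute(xml, "Lines", "quantity"));
    }
}
```

With native objects, the equivalent mistake (calling a method that does not exist) would be rejected at compile time, which is the type checking the paragraph above refers to.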
| One of the most interesting challenges in processing XML is handling evolution, overlap, and alternative schemas. Object-oriented designs can use encapsulation including multiple interfaces for different parts of an object, flexible mappings from schemas to classes, and parameterized objects to address this challenge. Data-oriented designs can use internal representation schemas, metadata, pattern matching, rules, transformations, bindings, refinements, and parameterized schemas. |
| The Dealer Extranet uses an object-oriented approach in which a factory class converts received XML into a set of domain objects. On output, a template generates XML messages, and the data is retrieved by binding from specific object attributes to output parts in the template. This approach leverages the existing development environment. |
| The Software Exchange uses several of the data-oriented approaches discussed earlier for much of the work of reading, transforming, and presenting data. Object-oriented approaches are used for operations like iterating over a set of items, performing calculations, and handling user-driven queries. |
Presentation tier |
| XML has most frequently been applied in the presentation layer of eBusiness applications. One common approach is using XSL transforms to convert XML into HTML (or WML) on a server. In the future, XML browsers are likely to display XML directly, possibly using XSL formatting objects. There are a number of new efforts underway to use XML to describe device-independent displays, such as XUL ( http://www.mozilla.org/xpfe/ ), XForms ( http://www.w3.org/MarkUp/Forms ), and UIML ( http://www.uiml.org ). XML is a natural format for describing presentations in a display-independent fashion. For more information on how XSL and XML can be used to present data see . |
In addition to presenting information to users, B2B applications often present information to other applications, to support integration with partners. Such applications require agreement on systems interactions and standardized vocabularies. The eCo Architecture provides a good framework for analyzing the different layers of complexity involved (see ). Agreement needs to address inter-company workflows, negotiation of which vocabularies to use, and technical messaging methods. Many applications today pass XML over HTTP and HTTPS (as in our case studies). This type of messaging leverages the widespread support for XML and for HTTP to allow diverse systems to communicate in a loosely coupled fashion. The Gartner Group refers to XML over Web protocols as the “Digital Dial Tone” and Forrester describes XML and HTTP as central parts of “Internet Middleware”. However, XML over HTTP or HTTPS leaves many aspects of messaging open, requiring custom techniques to integrate among applications, especially for tighter integration. There are a number of proposals to solve these problems by addressing standard middleware issues (such as reliable messaging, name resolution, and how to pass variables). The W3C is convening a panel on XML and Protocols at WWW9. http://www.LWProtocols.org provides more information and additional links on this important topic.
|
|
| The Dealer Extranet uses XML only for server-to-server messaging, as was described in the Case Study section. |
| The Software Exchange uses templates that are subdivided into logically distinct sections. Each section uses a parametric model to convert XML data into an HTML presentation format. This system enables presentation and workflow that is highly customizable for different roles. The presentation layer is structured as a set of components that expose internal XML interfaces. |
| The application uses the model-view-controller paradigm (see ), with the model being the functionality layer, and separate view and controller (application flow) components. Model-view-controller remains effective in a system with XML data and a declarative development approach. |
Organizational impacts |
| We now consider how using XML for eBusiness affects organizations. XML is leading to standardization of business information through common vocabularies and defined agreements. This new level of standardization of business information requires organizations to participate in standards efforts, including decisions about how to influence standards, whom to work with, how to engage, and how aggressively to track emerging standards. These standards will be driven at the much faster pace of the Internet era. The lessons from dealing with emerging technology standards must now be applied to business. |
| When integrating Web services into a solution, the quality of service (including availability, performance, reliability, security, and accuracy) is crucial. It is important to select partners who can deliver on Service Level Agreements. When first integrating partners, it is important to work collaboratively throughout the development cycle, including shared definition of requirements, shared design of interfaces (vocabulary and agreement), jointly agreed project milestones, and close coordination on testing and refinement throughout development. After successfully integrating with pilot partners, the process needs to shift to a standardized approach, making integration of additional partners simple. One best practice is creating “starter kits” for the most common technology environments of partners, which minimize complexity and effort for partners. Likewise, producing a standardized (and rigorous) compliance test suite is important. |
| The data modeling and development techniques enabled by XML also have an impact on the skill sets required for application development. XML metadata allows better separation of tasks between display, business logic, and storage, and it allows targeting different display types. B2B solutions require more flexibility for customization, which in turn requires new techniques and new design skills. Using XML to model data requires new skills and new tools, as does working with XML in databases and other data storage media; monitoring, tuning, and analyzing data stored in XML-specific forms is not the same as doing so for traditional relational information. Indeed, the extensibility and richness of XML data require the same kind of conceptual shift in data modelers as the shift from procedural to object-oriented development required of application designers. |
| As XML and data-centric development evolve, there is a constant stream of innovative new components, tools, and standards. It is important to track these carefully and to determine their functional and technical qualities and to balance the value add against risk. |
Summary |
| XML is already having an impact on business, especially for B2B processes. It is a young, fast-growing technology, and those who use it face the challenges of tracking its rapid evolution and of dealing with overlapping and evolving business vocabularies. Of particular note is the challenge of allowing deep integration between systems in a standards-based manner, so businesses can connect and collaborate quickly without reliance on proprietary technology. |
| XML offers significant benefits for e-Business, and is growing to play a major role. Today, XML is mostly being used at the edge of architectures and intensively within pilot systems. XML promises increased benefits with more intensive use, but the final extent of XML use when it matures is still an open question. |
| Among the most important benefits of XML are standardized business data, increased data flexibility, and common tools for working with data. B2B applications require more customization, and integration with partner Web services, which will be major impetuses for adoption. XML will have significant impact on, and provide benefits for, application architectures, extended enterprise integration, and organizational skill sets. XML is an important enabling technology for the B2B revolution. |
| Acknowledgements |
| Thanks to Alan Spencer, Scott Fleming, Steve Donelow, Jaikumar Nihalani, James Tauber, Mike Plusch, and Bill Pope for their collaboration, review, and contribution of ideas to the projects described. Thanks to Jim D’Augustine, Kelly Parr, and Alex Burdenko for reviewing this paper. |
| Bibliography |