| The Future is Today: Case Studies in Innovation | Table of contents | Indexes | XML and Enterprise Application Integration | |||
| Jari Koistinen |
| Manager, XML E-Commerce Components and Server |
| Commerce One Inc. |
| 2440 West El Camino Real, Suite 710 Mountain View (California) (94040-1499) Web site:http://www.commerceone.com |
| Biography |
| Andrew Davidson |
| Additional Contributor |
| Commerce One Inc. |
| 2440 West El Camino Real, Suite 710 Mountain View (California) |
| Biography |
| Additional Author |
| Matthew Fuchs |
| Additional Contributor |
| Commerce One Inc. |
| 2440 West El Camino Real, Suite 710 Mountain View (California) |
| Biography |
| Additional Author |
| Mudita Jain |
| Additional Contributor |
| Commerce One Inc. |
| 2440 West El Camino Real, Suite 710 Mountain View (California) |
| Biography |
| Additional Author |
| Kelly Schwarzhoff |
| Additional Contributor |
| Commerce One Inc. |
| 2440 West El Camino Real, Suite 710 Mountain View (California) |
| Biography |
| Additional Author |
Introduction |
Electronic Commerce Systems |
Requirements on Electronic Commerce Systems |
This Paper |
XML Schema and Programming Models |
XML Schema: an Enabler for New Programming Models |
| XML Schema languages are being developed as an alternative to XML DTDs and are intended to address DTD weaknesses. In contrast to DTDs, XML Schemas provide typing, reuse, and extension mechanisms. Several XML schema proposals are available, such as [SOX, DCD, XML-Data], but a W3C recommendation has yet to be established. Commerce One has developed the SOX [SOX] schema language as a general XML schema language that also meets the specific requirements of global XML-based electronic commerce. |
| The SOX [SOX] schema language provides the same basic capabilities as DTDs and, in addition, provides the following: |
|
| In addition, XML schemas enable us to introduce new programming models that are better suited to meet the requirements of market places, and that improve the productivity of electronic commerce application developers. By having typed schema information we can define programming models with strong typing. The versioning and extensions enable us to develop programming models that decouple the evolution of document types and applications. Namespaces enable us to avoid global coordination of document type names, which makes development, deployment and maintenance of applications easier. |
The Importance of Programming Models |
| We use the term programming model to denote the programming interfaces used by developers to build electronic commerce applications and services that manipulate documents. The purpose of a programming model is to present an abstract, but well-defined, interface to which applications can be built. In our case a programming model defines the syntax, semantics, and failure semantics of accessing various parts of a document. The syntax defines exactly how a particular document type is mapped to calls and operations in a specific programming language. The semantics define the behavior of the system when calls and operations are performed. Finally, the failure semantics define under what conditions failures can occur, how they are presented to the application, and what actions an application can take to address a failure. |
| The choice of programming model influences how easy or difficult it is to change an application as new document types are introduced. In some cases we need programming models that allow data to be discovered dynamically. In other cases it is more important to easily extract data of the correct type from, and insert it into, the programmatic representation of a document, and thereby ensure a higher level of safety. The characteristics of an application will be strongly influenced by the type of programming model that has been used. |
| From the general requirements on electronic commerce market place platforms presented above, we can derive requirements more directly related to programming models. In particular, we consider the following aspects to be of high importance: |
|
| Different applications will need different programming models. Furthermore, different programming models will have different characteristics, each satisfying the above requirements to a varying degree. The implications of choosing a programming model are quite large and will influence many different aspects of a system, not only the programming abstraction seen by application programmers. Programming models also have an impact on the reliability, safety, and performance of applications and systems. From the MarketSite viewpoint, we consider the following aspects as being of high relevance for selecting programming models: |
|
XML Programming Models |
| There are many possible XML programming models for electronic commerce systems. The question is which one optimizes towards the desired properties of electronic commerce systems. In this section we will describe the characteristics of three XML programming models and discuss the kinds of applications for which each is useful. The three programming models we consider are: |
| The DOM [DOM] and SAX [SAX] models have been available in the XML community for some time. The typed programming model X2J is a new model developed by Commerce One as an alternative when ease of use, safety, and extensibility are important. |
| In DOM, an XML document instance is structured as a tree where nodes represent XML concepts such as elements, attributes and entities. The DOM representation of an XML document is flexible and document type independent. An application using a DOM can discover the structure of a document without knowing the corresponding document type in advance. The actual content between the tags in a document is represented as strings. Because of the generic nature of the nodes in a DOM tree, and the lack of typing of the content of a document, one cannot extract any type information from a DOM tree. This characteristic is partly a consequence of the lack of typing in DTDs. The application developer can therefore choose to reinterpret a tag or its content as being of a different type as the document evolves. |
| The flexibility of the DOM comes at the expense of safety. Programming to DOM is quite error prone. Common errors include inserting data of the wrong type and making mistakes in how DOM trees are traversed. The DOM does not provide any guarantees to the programmer beyond those given by XML with respect to how document types can be changed without affecting applications. Applications based on DOM are fragile to document structure changes. As the DOM contains no type information, there can be no static checking of application and document structure consistency. This defers the detection of errors until runtime. The lack of typing also makes the model harder to use: it forces applications to perform numerous conversions between strings and data types such as date, integer, and float. The need for data types is a significant issue in applications such as electronic commerce: besides the performance benefit of avoiding numerous type conversions, data types provide safer interoperability among large numbers of actors, database storage, algebraic manipulation, etc. |
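| The string-conversion burden described above can be illustrated with a minimal sketch that uses the standard Java DOM API. The class and method names (DomAddressReader, readEntityType) are hypothetical and are not part of MarketSite; the element name EntityType is taken from the Address example later in this paper. |

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: reads the EntityType value through the generic DOM.
class DomAddressReader {

    static int readEntityType(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
        // The element name is a plain string; a typo here is not a compile error.
        NodeList nodes = doc.getElementsByTagName("EntityType");
        if (nodes.getLength() == 0) {
            throw new IllegalStateException("EntityType element is missing");
        }
        // The content is a string; the application converts it itself, and a
        // non-numeric value surfaces only at runtime as NumberFormatException.
        String text = nodes.item(0).getTextContent().trim();
        return Integer.parseInt(text);
    }

    public static void main(String[] args) {
        String xml = "<Address AddressType=\"BUSINESS\">"
                + "<EntityType>234</EntityType></Address>";
        System.out.println(readEntityType(xml));
    }
}
```

| Neither the misspelling of a tag name nor a non-integer value is detected before the code runs, which is the class of error a typed model is meant to rule out. |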
| An event API represents a document as a stream of events. Each event represents an XML concept such as a start-tag, end-tag, entity etc. An application receives these events and is driven by them. As with DOM, the data of the document is commonly represented as strings. SAX is an example of an event API. General XML event based models suffer from the same drawbacks as DOM with respect to typing and safety. Typing information from schemas can, however, be added to the events. The MarketSite XML event API has been extended to capture the typing information. |
| Using an event API requires that the application essentially parse the incoming XML instance document. This is significantly more complicated than receiving a data structure representing the document. However, it generally has a smaller memory footprint, and is more efficient, than the other programming models. |
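| As a minimal sketch of what "essentially parsing" means for the application, the following handler uses the standard Java SAX API to pull the City value out of an Address instance. The class name SaxCityReader and its readCity method are hypothetical; the sketch uses plain SAX, not the typed MarketSite event API. |

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: the application keeps its own parsing state.
class SaxCityReader extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private boolean inCity = false;
    private String city = null;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("City".equals(qName)) {
            inCity = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Character data may arrive in several chunks, so it is accumulated.
        if (inCity) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("City".equals(qName)) {
            inCity = false;
            city = text.toString();
        }
    }

    // Returns the City content as a string, or null if no City element was seen.
    static String readCity(String xml) {
        SaxCityReader handler = new SaxCityReader();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
        return handler.city;
    }
}
```

| No tree is built, which keeps the memory footprint small, but the state tracking that a data structure would provide has moved into application code. |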
| The typed X2J programming model consists of both a generic set of classes and interfaces and classes and interfaces that are specific to a particular document type. Document type specific classes and interfaces expose data as being of the type that was defined in the schema. If an attribute was defined as an integer in the schema, it will be an integer in the programming model. The process of producing such a programming model requires an XML Schema that has type information. |
A Typed XML Programming Model |
Overview |
| The X2J programming model exposes an XML document instance to an application as a set of Java interfaces of specific types corresponding to the data and element types defined in the XML Schema. Assume we have defined an element type named Human and created the corresponding X2J type in Java. If an element is defined to be of type Human, the programming model will use the X2J Java type to represent elements of type Human. |
| The X2J programming model defines a mapping from SOX Schemas to Java Beans. On the Java side, the mapping includes a package of pre-defined interfaces and classes such as ElementType, Attribute, URI, Date, NMTOKEN, etc. It also contains exceptions that are used to signal failure conditions. |
| The mapping defines how SOX concepts are mapped to Java language constructs. As an example, an element type named Q is mapped to a Java interface named Q and a class called Qimpl that implements Q. The content model of the element type Q is mapped to specific attributes and methods in the Q interface and Qimpl class. The unique namespace of the Schema is mapped to a unique Java package name. Attributes of Q are exposed through a generated Java interface called QAttributes. |
| To automate the mapping process we have developed a compiler that takes SOX schemas as input and produces the corresponding X2J representation. In addition, the compiler produces complementary classes for marshaling and un-marshaling the representation to and from XML streams. The generated code runs on top of the MarketSite XML runtime, which performs full XML Schema validation of XML instances and drives the un-marshaling code. |
| One of the primary goals of the X2J API is to reduce the risk of programming errors and make the programming of documents easier by providing a typed interface to documents. We believe this is important for global systems such as electronic platforms where application programmers vary widely with respect to programming experience and background. |
| The X2J API is also designed to enable separate evolution of document types and applications by allowing existing business services to handle any new document types that are sub-types of document types previously known to the service. We achieve this by mapping schema inheritance mechanisms to Java extensions. This enables the incremental evolution of applications as documents are extended. |
| The following list enumerates a subset of the principles that governed the X2J mapping. |
|
| In the following section we provide a slightly more detailed example of how SOX is mapped to Java. Due to space constraints we will not provide a complete description of the mapping. For a complete description we refer to [X2J]. |
A Mapping Example |
| Let us assume the following simplified XML Schemas (see [SOX] for details on the Schema language used). The first schema describes an Address element type with a sequence content model. The first element is of type int, and the tag used in instances should be named EntityType. The following four elements are of type string with tag names Name, Address, City, and PostalCode respectively. Finally, there is an element of type CountryCode. We are not including the definition of CountryCode due to space constraints. CountryCode is an enumerated type for which the domain consists of two or three letter codes. |
| Finally, the schema defines an attribute indicating whether the address is a home or business address. |
<?xml version="1.0"?> |
<!DOCTYPE schema SYSTEM "urn:x-commerceone:document:com:commerceone:xdk:xml:schema.dtd$1.0"> |
<schema uri="urn:Address.sox"> |
<elementtype name="Address"> |
<model> |
<sequence> |
<element type="int" name="EntityType" /> |
<element type="string" name="Name" /> |
<element type="string" name="Address" occurs="*" /> |
<element type="string" name="City" /> |
<element type="string" name="PostalCode" /> |
<element type="CountryCode" name="State" /> |
</sequence> |
</model> |
<attdef name="AddressType"> |
<enumeration datatype="NMTOKEN"> |
<option>HOME</option>
<option>BUSINESS</option> |
</enumeration> |
<required/> |
</attdef>
</elementtype> |
</schema> |
| The next schema is called ShipToAddress and it extends Address. An instance document of type ShipToAddress will have the content defined by Address followed by the additional content defined by ShipToAddress. |
<?xml version="1.0"?> |
<!DOCTYPE schema SYSTEM "urn:x-commerceone:document:com:commerceone:xdk:xml:schema.dtd$1.0"> |
<schema uri = "urn:ShipToAddress.sox"> |
<elementtype name="ShipToAddress"> |
<extends type="Address"> |
<append> |
<element type="string" name="FreightType" /> |
<element type="int" name="ShipMethod" /> |
<element type="string" name="FOBPoint" /> |
<element type="string" name="FOBInstruction" /> |
</append> |
</extends> |
</elementtype> |
</schema> |
| The following is a valid instance of the Address Schema: |
<?soxtype urn:Address.sox?> |
<Address AddressType="BUSINESS"> |
<EntityType>234</EntityType> |
<Name>Commerce One Inc.</Name> |
<Address>2440 El Camino Real</Address> |
<City>Mountain View</City> |
<PostalCode>90404</PostalCode>
<State>US</State> |
</Address> |
| In the X2J programming model we have certain predefined types that correspond to general concepts such as ElementType, Attributes, and intrinsic data types. |
| Each element type is mapped to a corresponding Java interface with the same name as the element type. The interface has set and get methods corresponding to the content model as well as a method for getting the attributes defined for the element type. The set and get methods are typed according to what was specified in the XML Schema. As an example, the set and get method for the element State in Address will take and return a reference of type CountryCode respectively. |
| If the element type does not extend another element type, the mapped interface will inherit from a general interface called ElementType. ElementType specifies generic methods that need to be available on all element type interfaces. If an element type extends another element type, the corresponding Java interface will extend the interface corresponding to the extended element type. |
| If we map the schemas defined above we get the following programming abstractions in Java: |
public interface Address extends ElementType { |
final static public String DOC_TYPE = "Address"; |
public AddressAttributes getAddressAttributes() ; |
public void setEntityType(int s) ; |
public int getEntityType() ; |
public void setName(String s) ; |
public String getName() ; |
public void setAddress(String[] s) ; |
public String[] getAddress() ; |
public void setAddress(int index, String s) ; |
public String getAddress(int index) ; |
.... |
public void setState(CountryCode s) ; |
public CountryCode getState() ; |
}; |
public interface ShipToAddress extends Address { |
final static public String DOC_TYPE = "ShipToAddress"; |
public ShipToAddressAttributes getShipToAddressAttributes() ; |
public void setFreightType(String s) ; |
public String getFreightType() ; |
public void setShipMethod(int s) ; |
public int getShipMethod() ; |
.... |
}; |
| Note that all content model methods are typed and that the interface for ShipToAddress extends the Address interface. Furthermore, each interface has a get method for a corresponding attribute object. The attributes for an element type will result in a separate interface with set and get methods for the individual attributes. The name of the interface is the name of the element type followed by the word "Attributes". The interface also provides a get method for an AttributeInfo object corresponding to each attribute. The AttributeInfo object contains additional information about the attribute, such as presence values. We outline the attribute object interface for the Address element type below. |
public interface AddressAttributes |
extends schema.jbmapping.Attributes { |
schema.datatypes.NMTOKEN getAddressType(); |
void setAddressType(schema.datatypes.NMTOKEN a); |
boolean getAddressTypeDefined(); |
void setAddressTypeDefined(boolean b); |
AttributeInfo getAddressTypeAttributeInfo(); |
}; |
| The actual implementations of these interfaces are not defined by the mapping. An implementation design will, however, naturally need to provide a description of how these interfaces are implemented. In the Commerce One X2J implementation, each interface has a corresponding implementation class. Furthermore, we emit classes that perform the un-marshaling of XML instance documents to Java objects and marshaling from Java objects to XML instances. |
| Elements in a schema content model can have different occurrences. If an element occurs exactly once, or zero or one time, the mapping is a simple set and get method. If the occurrence is one or more, zero or more, or a range, the mapping is defined as two pairs of set and get methods. The first pair takes and returns arrays of elements. The second pair allows an application to set and get an element at a specific index in the array representing the element sequence. For an example, see the mapping of the Address element inside the Address element type content model outlined above. |
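| Since the mapping does not define implementations, the following is only a sketch of how the two pairs of methods for a repeated element might be backed by an array. The class name AddressLinesSketch and the defensive copying are assumptions made for illustration; only the four method signatures are taken from the Address interface above. |

```java
import java.util.Arrays;

// Hypothetical fragment of an implementation class for the repeated
// Address element (occurs="*") inside the Address element type.
class AddressLinesSketch {
    private String[] address = new String[0];

    // Array pair: replaces or returns the whole element sequence.
    // Copies are made so callers cannot alter the internal array (an assumption).
    public void setAddress(String[] s) {
        address = Arrays.copyOf(s, s.length);
    }

    public String[] getAddress() {
        return Arrays.copyOf(address, address.length);
    }

    // Indexed pair: sets or returns a single occurrence. An index outside
    // the current sequence raises ArrayIndexOutOfBoundsException.
    public void setAddress(int index, String s) {
        address[index] = s;
    }

    public String getAddress(int index) {
        return address[index];
    }
}
```

| An application would typically call the array setter once and then use the indexed pair for individual corrections, for example setAddress(0, "2440 West El Camino Real"). |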
| A schema defines a namespace and may also use element types or data types defined in other namespaces. The namespace of a Schema is mapped to a unique Java package corresponding to the schema's namespace. Every type defined in a schema is included in that Java package. In the Java mapping, imported schema types are always referred to by their fully qualified name. This avoids name conflicts among names imported from different schema namespaces with the least implementation complexity. The drawback is that names are generally long in X2J generated code. |
An Application Example |
| The Commerce One XML runtime component allows an application to go back and forth between an XML instance document and an X2J object structure. That is, given an XML instance document, an application may receive it as an object structure defined by the X2J programming model. Also, any X2J object structure built by the application can easily be output as an XML instance document. Any XML instance documents that an application receives are validated according to the corresponding SOX schema. The application programmer can therefore be assured that all the data in the received X2J object structure is valid. When the application programmer builds the X2J object structure, the typed interface ensures that data of the wrong type or format cannot be inserted into the object structure. Checks of required elements, attributes, etc. are conducted at the time of marshalling, so the resulting XML document instance will be valid. |
| The following is a simple example of how an application may receive and manipulate an XML instance document represented by the X2J programming model. |
// Get the document from the MarketSite XML Runtime |
// The description of this is simplified for the purpose of this paper. |
Address addr = (Address) getDocument("addr.xml"); |
// manipulate received document object |
addr.setName("Brad Snyder"); |
// write it out to addr.xml |
java.io.Writer w = new java.io.FileWriter("addr.xml");
((AddressImpl) addr).toStream(w); |
Characteristics of X2J |
| The main advantages of the X2J programming model are the following. |
| Firstly, the programmer can more easily get and set the data in a document using the actual type of the data. This provides a higher degree of safety since type errors can be detected statically at compile time. Furthermore, it relieves the programmer of checking for a number of data conversion error conditions and thus improves productivity. In contrast, a model such as DOM requires programmers to convert data to and from strings when the document representation is accessed. |
| Secondly, X2J maps Schema inheritance to Java inheritance. The advantage of this is that an application that can handle Address will be able to function properly even if it receives an instance of type ShipToAddress. This enables us to migrate applications to new document types independently from when document type extensions are introduced. This is a key requirement for global electronic commerce communities, which is the primary application domain for Commerce One. |
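| The substitution property can be sketched with reduced stand-ins for the two generated interfaces. Only getName and getFreightType are kept, and AddressLabelService is a hypothetical business service introduced for illustration. |

```java
// Reduced stand-ins for the generated interfaces shown earlier;
// the real interfaces also extend ElementType and have more methods.
interface Address {
    String getName();
}

interface ShipToAddress extends Address {
    String getFreightType();
}

// Hypothetical service written against Address only. It compiles and
// runs unchanged when it is handed a ShipToAddress.
class AddressLabelService {
    static String label(Address a) {
        return "To: " + a.getName();
    }
}

class SubtypeDemo {
    public static void main(String[] args) {
        ShipToAddress shipTo = new ShipToAddress() {
            public String getName() { return "Commerce One Inc."; }
            public String getFreightType() { return "AIR"; }
        };
        // The extended document is accepted wherever an Address is expected.
        System.out.println(AddressLabelService.label(shipTo));
    }
}
```

| The service ignores the appended content it does not know about, and a later version can start using getFreightType without any change to the document type. |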
| Thirdly, the programming model is intuitive in the sense that types have the same names and structure as their corresponding element types. Applications need not traverse generic tree structures or have logic to parse incoming event streams, which simplifies the application programming task significantly. |
| Finally, name spaces in SOX are mapped to name spaces in Java. As an example, element names and attribute names represent different name spaces within a SOX element type definition. They are therefore mapped to distinct interfaces in Java. Likewise, a SOX schema represents a name space. This name space is realized by mapping the schema to a package. |
| The definition of a mapping such as X2J involves numerous compromises, just like any software design. In the X2J case we have been guided by certain design principles, some of which were presented earlier in this paper. As an example, the mapping of set and get methods for an exactly-once element is the same as the mapping for a zero-or-once element. There is a fully valid argument that a change of an element from being required to being optional should be reflected in the signature of the corresponding Java get and set methods. On the other hand, the change in the schema can be considered small, and it actually weakens the programming contract, since an application developed for required elements always satisfies the semantics of optional elements. Therefore, one could argue that this change should not require application programmers to change their code. This argument is not, however, valid when going from optional to required elements. Another argument in favor of keeping the same method names is the fact that the mapping cannot provide any guarantees until the marshalling is done anyway. A method name convention would have limited value over a generated comment reminding the programmer of the semantics. |
| In this situation we decided to follow the continuity principle. This principle suggests that a small change in the schema should be reflected by a small change in the mapping. Consequently, we decided that changing the occurrence from exactly once to zero or once should not change the programming interface at all in X2J. This is only one example of many very detailed questions that must be considered in defining a mapping such as X2J. |
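| The consequence of this decision can be sketched as follows: the accessor signatures are identical for a required and an optional element, and the difference surfaces only when the object is marshalled. The class CityElementSketch, its required flag, and its marshal method are assumptions made for illustration; they do not describe the generated Commerce One classes. |

```java
// Hypothetical holder for a single City element.
class CityElementSketch {
    private final boolean required; // exactly once (true) or zero or once (false)
    private String city;            // null means the element has not been set

    CityElementSketch(boolean required) {
        this.required = required;
    }

    // Same signatures regardless of whether the element is required.
    public void setCity(String s) {
        city = s;
    }

    public String getCity() {
        return city;
    }

    // The occurrence constraint is checked only here, at marshalling time.
    String marshal() {
        if (city == null) {
            if (required) {
                throw new IllegalStateException("required element City is missing");
            }
            return "";
        }
        return "<City>" + escape(city) + "</City>";
    }

    // Minimal escaping of markup characters in element content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

| Application code that calls setCity and getCity compiles unchanged whichever occurrence the schema declares; only an application that omits the value is affected, and only if the element is required. |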
Concluding Remarks |
| Commerce One provides the MarketSite product for establishing flexible and evolvable electronic commerce marketplaces. Commerce One has made a strategic commitment to provide open solutions by using widely available technologies such as XML. We have also noted that XML in itself is not sufficient for achieving the characteristics we desire for MarketSite. To address this, Commerce One has developed a new XML schema language [SOX], a new Java programming model for XML [X2J] and new XML processing components that are incorporated as a part of the MarketSite electronic commerce platform. |
| The XML processing component allows applications to validate instances against SOX schemas and to get X2J, as well as SAX and DOM, representations of XML instance documents. The components also enable evolvability through extensions. X2J allows extensions to document types and applications to be decoupled. To support X2J, Commerce One provides a compiler that generates the X2J mapping from the SOX XML Schemas provided as input. The generated code works efficiently with the MarketSite XML processing components. |
| We believe an X2J Java programming model for Schema-based XML applications will provide the easiest and safest programming model for evolving electronic commerce market place systems. We also view this programming model as important in satisfying the requirements for both reliability and extensibility that a global electronic commerce platform will imply. |
References |
|