An Object/Relational Approach to Content Management   Table of contents   Indexes   Grainger Connect

Banta Integrated Media
Rafal, Howard
Somerville
 USA 
 
Howard B. Rafal
 Senior Software Engineer
Banta Integrated Media
  122 Heath Street Somerville (Massachusetts)  USA (02145)
Email: hrafal@banta-im.com Web site: www.banta-im.com
 Biography
  Howard Rafal is a graduate of Worcester Polytechnic Institute (B.S. 1987, M.S. 1989). At Digital Equipment Corporation he worked on software engineering tools and an X.400 mail product. Later, he worked at BBN on various Internet products targeted at the K-12 education market. The most notable of these is Mentor Center, which allows students to be matched with mentors anywhere on the Internet and to get feedback on actual classroom work. This software can be seen at http://iss.ocmboces.org.
  Howard is currently a development manager for Banta Integrated Media's Bmedia product. He is responsible for the product's XML Gateway, which provides an integration API to Bmedia's content management system.
  Special thanks to Eric Feingold for his excellent editorial work.
 

Introduction

A development team faces many challenges when building complex software. To get a product out in a timely fashion, there have to be trade-offs between quality and features. Sometimes a team must scale back the feature set of a release to ensure better stability. One way to indirectly add new features to a product is to create Application Programming Interfaces (APIs) that give third parties the opportunity to create these features. In effect, this expands the development of the product without increasing in-house staff. The problem is that API development creates challenges of its own, because APIs require a lot of thought and maintenance.
  The challenge of maintaining APIs is the centerpiece of this paper. In the discussion that follows, this paper demonstrates that by using standard protocols for API development, a development team can dramatically decrease the number of maintenance issues and, at the same time, build a product that draws from a vast store of available technologies to customize, create and present information. Critical to implementing APIs flexibly is XML. This paper emphasizes both the general benefits of XML and the very tangible, specific gains that XML can secure for a product.
 

Creating APIs

  Before a development team decides to allow third parties to integrate with its product, there are many factors to consider. How much support will each implementation require? How much development time will be necessary to create the interfaces? Would it be smarter to develop the features in-house?
  Traditionally, development teams implement APIs by building code libraries. The development team doing the integration (integrators) can call into these libraries by using specific function calls. Or, in the case of plug-in architectures, the API must describe the calls that integrators' code needs to make to link to the API at run time. Both of these techniques require the integrators to have specific programming expertise and a fairly deep knowledge of the core product.
  Another approach to creating APIs is to use standard representations, like XML, to convey product information and let integrators work with the wide variety of existing interfaces to build custom solutions. This section details the considerations for creating APIs by exploring both API Libraries and standards-based APIs.
 

API Libraries

  As mentioned above, creating code libraries requires the development of calling structures to retrieve information from a product or a set of called interfaces that an integrator must implement. Both of these integration techniques have similar issues, since they require specific coding structures defined by the product.
 

Platforms

  Most products must support many different platforms. Once interfaces are created for one platform, the need to implement them for other platforms arises soon after. An API team can spend a great deal of time just porting the same interfaces from one platform to another and fall behind on adding new features. Strategies for managing the many variants of code are critical. In a very short time, what began as an efficient way to gain new features through custom development from third parties becomes more complex and time-consuming than just creating the desired features in-house.
 

Languages

  Many times, integration is done with products that already exist. These products may be using a particular language, making it much more convenient to work in that language than to move to another. An API team may decide to try to solve the platform issue by choosing the Java language for implementation. This allows the API to reside on many different platforms without a significant amount of customization. However, because an existing product may not use Java, a Java implementation may not work. In addition, languages come and go fairly often, making it necessary to have the same APIs available in multiple languages.
  The support of different languages is very similar to the support of different platforms. A team can be burdened by chasing after the 'right' language for integration. What seems like a great choice today may result in an unused API by the time it is released tomorrow.
 

Release Management

  By developing code libraries for integration, an organization is committing to keep these libraries synchronized with the main product. This means that every release of the product requires an update to the integration library. If an integrator needs a feature of the product and it is not defined in the API, the integration will fail.
  The result is that the team needs to manage the release of these libraries and keep them up to date. Furthermore, integrators need to install the latest libraries and code to the current calls to stay on top of releases. Backward compatibility can be ensured by the API team, but that doesn't help the integrator with this two-step update procedure when integrating with new features.
 

Documentation

  For integrators to work with APIs, they need to understand how calls into the system work. This means that each call must be well documented and complete examples must be given to show how calls are combined into a system. So, in addition to developing and managing code libraries, there needs to be a way to effectively communicate these APIs to third-party developers.
  There are many methods for describing calls into an API. Usually calls are listed and their parameters described in the programming language of the API. Developers who know the language have a fairly easy time understanding the calls, but the nuances to the actions of these calls may be convoluted and difficult to grasp. Therefore, it is wise to document the practical utility of a call and provide relevant scenarios as well as sample code. APIs must be developed so they can be described simply. If the architecture for the API is consistent and simple, then documentation can assist the integrator in building functionality immediately.
 

Training

  If the structure and documentation for the API are clear, every integrator could start building custom solutions right away. However, no matter how accessible the API is, complex systems still require training, and integrating with these systems may require even more training. Materials can range from demos of how to use the APIs to hands-on courses that take potential integrators through the process. The level of integrator support dictates the type of training necessary. If there are only a few integrators, you may just sit down with them directly and get them started. If there are thousands of integrators, then you can afford to develop courses that people pay to take.
 

Standard Protocols as APIs

  Another approach to delivering APIs is to use standards to let integrators connect with a system. Many of the same issues associated with API libraries may arise with this approach, but the impact on the API developers is lessened dramatically.
 

Platforms

  Eventually, somebody working on a particular platform decides to use a standard for some work. Over time, many people post their implementations of the standard and one emerges as the one common to all developers who use this platform. If the standard is an essential one, the company who develops the platform may even release an implementation for this standard. Over time, many people use these implementations and provide feedback so that even the implementation itself becomes a standard.
  This is helpful for the API developer because the implementation of the standard is not maintained or shipped by the in-house development team. In addition, development teams implementing the API can use appropriate libraries just like integrators. For the integrator, this is great because they are assured that their platform will be supported by the API as long as the standard implementation exists.
 

Languages

  Good, standard protocols tend to be implemented in every language even in very early stages. By the time something starts to become a standard, there are usually many developers who readily share their implementations of these standards. In addition, the organizations driving the standard wish to see their work used, so they also push to release on many platforms and languages. In essence, the standards committees go through the maintenance headaches for you. This means that developers of APIs and integrators can choose their language appropriately without having to wait for the API developer to create the API library for the integrator's chosen language.
 

Release Management

  Since no code libraries are shipped, release management is reduced to normal product shipping issues. As the product is updated, the APIs must be updated as well. As more features are developed, they can be documented. Integrators can then update their code to take advantage of new features, but don't have to worry about issues that arise with new versions of code libraries (incompatible calls, changes in opaque structures, etc.). The API developers can use tricks of the standard to ensure compatibility. For instance, if a new XML tag is defined, an integrator's existing code can just ignore it. The user will lose the benefit of this tag, but will not receive an error.
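  The "ignore unknown tags" behavior can be illustrated with a minimal sketch. It assumes Python's standard xml.etree.ElementTree module, and the element names (asset, name, caption, copyright) are hypothetical, chosen for illustration rather than taken from the Bmedia DTD.

```python
import xml.etree.ElementTree as ET

# Tags this (hypothetical) integrator was written to understand.
KNOWN_TAGS = {"name", "caption"}

def read_asset(xml_text):
    """Return the known fields of an <asset> element, skipping unknown tags."""
    asset = ET.fromstring(xml_text)
    fields = {}
    for child in asset:
        if child.tag in KNOWN_TAGS:
            fields[child.tag] = (child.text or "").strip()
        # Any other tag was added by a newer release: ignore it, do not fail.
    return fields

# A response from a newer release that adds a <copyright> tag
# (hypothetical markup).
newer_response = """
<asset id="42">
  <name>Spring catalog cover</name>
  <caption>Front cover image</caption>
  <copyright>1999</copyright>
</asset>
"""

print(read_asset(newer_response))
```

  The older client keeps working against the newer response; it simply never sees the copyright information.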
 

Documentation

  Just like with code libraries, standards-based APIs require heavy documentation. These APIs still have calling structures that facilitate data being returned. XML is a particularly good standard for documentation because everything can be defined with a DTD. The calling mechanism is fully defined and the possible results are also defined. This means that the integrator is always working with understandable objects. All side-effects are side-effects of the underlying system, not the structures that are used to work with the system.
  To the API developer, effective documentation becomes the most difficult task, but even this task is lightened by having very well known descriptive models.
 

Training

  Similar to documentation, training is still a challenge. However, it is somewhat alleviated by the fact that people will know the standard. Rather than training people on internal calling structures and method definition protocols, the training can focus on the particular functionality. There is a set language that people who understand the protocol will be able to grasp more readily.
 

Case Study

  Now that we have explored specific ways that standards can aid the development and integration process, it is helpful to look at an example. For this purpose, we will look at a project that was prototyped in 1998 and started in January of 1999.
  At Banta Integrated Media (http://www.banta-im.com) we wished to provide integration support for our digital content management system, Bmedia. We needed this integration support for an internal software client that we were writing (our Quark XTension) and knew that this support could open up possibilities for further integration. We chose to look at standard protocols because of the advantages we saw in supporting an API this way.
  The first section describes the standards that we chose. The next section details the experiences we had in doing this sort of development. Once we did this work, we found many interesting opportunities were afforded by the standards we chose. This section wraps up with a few of the extensions to our capabilities that emerged from knowledge of these standards and from particular requirements that we have garnered.
 

Standard Protocols

  After looking at different standard protocols, our development team chose to work with the following standards: XML, MIME, and HTTP. This section describes these choices in more detail.
 

XML

  Bmedia is designed to let enterprises customize and build complex data structures for describing digital content of any format and then store and retrieve this content. In Bmedia terms, the database contains meta-data that describes the qualities of all assets tracked by the system regardless of what types of files the assets may be. With this in mind, we knew that an open-ended protocol would provide simple access to other products. We wanted a protocol that could express our complex objects and still leave the flexibility for custom data that our system provides.
  We chose XML because the information in Bmedia lends itself to a text-based representation. The nesting structures that this sort of markup provides gave us a flexibility to capture both small and large object structures. Attributes seemed to provide an ideal way for us to capture database ID information, while the actual text would support any user entered data. The most appealing reason to use XML was that we could develop a DTD and hand that to integrators.
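  As an illustration of this split between attributes and text, the following sketch builds and re-reads a small nested structure. It assumes Python's standard xml.etree.ElementTree module; the element and attribute names (asset, field, id, name) and the ID values are hypothetical and are not the actual Bmedia markup.

```python
import xml.etree.ElementTree as ET

def build_asset(asset_id, fields):
    """Build an <asset> element; fields maps field name -> (field_id, text)."""
    asset = ET.Element("asset", {"id": str(asset_id)})
    for name, (field_id, text) in fields.items():
        # Database IDs travel as attributes (hypothetical attribute names).
        field = ET.SubElement(asset, "field", {"id": str(field_id), "name": name})
        # User-entered data travels as the element text.
        field.text = text
    return asset

asset = build_asset(1001, {
    "title": (7, "Annual report cover"),
    "notes": (8, "Use the high resolution version"),
})
xml_text = ET.tostring(asset, encoding="unicode")
print(xml_text)

# Round trip: IDs come back from attributes, user data from text.
parsed = ET.fromstring(xml_text)
for field in parsed.findall("field"):
    print(field.get("id"), field.get("name"), field.text)
```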
  We chose to implement our own XML calling structure, but in the future we could use XML-RPC (http://www.xmlrpc.com/) for even more standardization. An integrator needs only to format an XML call and receives XML back.
  Bmedia is intended to support businesses with vast amounts of asset information. Many of these assets are very large files. Since XML binds characters such as <, >, &, ", and ' to special meanings (in data they must be written as entities), any non-markup data must distinguish these characters from the markup. Trying to find and identify all of these characters within binary data would cost processing time for our API server and for integrators' clients. Our development team decided to have the XML identify binaries by unique keys and then use these keys in a MIME ID structure.
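  The escaping cost described here can be seen on a small scale with ordinary text. The sketch below assumes Python's standard xml.sax.saxutils module; the sample note is made up, and the key xxx:thumbnail is reused only as an example of a key that stands in for a binary.

```python
from xml.sax.saxutils import escape, unescape

# Extra replacements for quotes; escape() handles &, < and > by default.
QUOTES = {'"': "&quot;", "'": "&apos;"}

def to_markup_safe(text):
    """Replace the five special characters with their entities."""
    return escape(text, QUOTES)

def from_markup_safe(text):
    """Reverse to_markup_safe()."""
    return unescape(text, {entity: char for char, entity in QUOTES.items()})

note = 'Crop to 4" x 6" & keep <original> untouched'
safe = to_markup_safe(note)
print(safe)
assert from_markup_safe(safe) == note

# A binary is never scanned this way: the XML carries only a short key,
# and the bytes themselves travel outside the markup.
binary_key = "xxx:thumbnail"
reference = "<thumbnail>%s</thumbnail>" % to_markup_safe(binary_key)
print(reference)
```

  Every character of the text has to be inspected to do this, which is acceptable for a short note but wasteful for a multi-megabyte image.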
 

MIME

  Our API needs to handle large batch operations that may have many pieces of binary data as well as small meta-data queries with no binaries. After looking at a few choices for handling binary data, we chose MIME because it gave us a distinct separation of meta-data information from binaries. The XML for the binary components of an asset consists of tags that uniquely identify the MIME parts that contain the binaries for this asset. The identifiers map directly to Content-ID headers in the MIME. So, for example:
 
  •   <thumbnail>xxx:thumbnail</thumbnail>
  •   <preview>xxx:preview</preview>
  These lines state that there will be a thumbnail and a preview for this asset. The thumbnail will have a Content-ID header of xxx:thumbnail. As the client processes the MIME, each thumbnail can be retrieved and easily associated with its proper object.
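  The mapping from XML keys to MIME parts can be sketched as follows. This is an illustration under stated assumptions, not the gateway's actual message layout: it uses Python's standard email package, a multipart/related container, a hypothetical asset wrapper element, and placeholder bytes in place of real images.

```python
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
import xml.etree.ElementTree as ET

def build_message(xml_text, binaries):
    """Pack the XML plus each binary (keyed by Content-ID) into one MIME message."""
    message = MIMEMultipart("related")
    message.attach(MIMEText(xml_text, "xml", "utf-8"))
    for content_id, data in binaries.items():
        part = MIMEApplication(data)  # application/octet-stream, base64 encoded
        part.add_header("Content-ID", content_id)
        message.attach(part)
    return message.as_bytes()

def read_message(raw):
    """Return (XML root, {Content-ID: bytes}) from a message built above."""
    message = message_from_bytes(raw)
    root, binaries = None, {}
    for part in message.walk():
        if part.get_content_type() == "text/xml":
            root = ET.fromstring(part.get_payload(decode=True))
        elif part["Content-ID"]:
            binaries[part["Content-ID"]] = part.get_payload(decode=True)
    return root, binaries

# The <asset> wrapper is hypothetical; the two inner tags follow the example above.
xml_text = ("<asset><thumbnail>xxx:thumbnail</thumbnail>"
            "<preview>xxx:preview</preview></asset>")
raw = build_message(xml_text, {
    "xxx:thumbnail": b"\x89PNG placeholder thumbnail bytes",
    "xxx:preview": b"\x89PNG placeholder preview bytes",
})

# The text of <thumbnail> is the key that selects the matching MIME part.
root, binaries = read_message(raw)
thumbnail_bytes = binaries[root.findtext("thumbnail")]
print(len(thumbnail_bytes), "bytes of thumbnail data")
```

  The XML stays small and readable, and a client that only needs meta-data can skip the binary parts entirely.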
  Now that we had a means of capturing all our data in an easy to describe way, we needed to choose our communication method. We knew that we would want to integrate with the World Wide Web at some point, so this choice was an easy one. We decided to use HTTP.
 

HTTP

  Since HTTP is so widely used, we knew many of its advantages. There are many available server implementations as well as extensive library code that integrators could use. Since our servers can accommodate multiple users, we particularly liked the nonpersistent nature of HTTP connections. This allows the gateway to support requests from many users with a very minimal amount of machine resources.
 

Experience

  With our choices for protocols made, we set out to build our integration product. We decided to keep this work fairly separate from our mainstream servers. To do this we built a gateway server that would interpret XML, turn it into calls to the Bmedia application server and translate results into XML.
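  The gateway's role as a translator can be sketched in a few lines. The example below is a simplification under stated assumptions: it is written in Python rather than the Java used for the real gateway, the request and response element names (request, get-asset, response, asset, error) are hypothetical, and FakeApplicationServer is an in-memory stand-in for the Bmedia application server.

```python
import xml.etree.ElementTree as ET

class FakeApplicationServer:
    """Hypothetical stand-in for the application server; holds assets in memory."""
    def __init__(self):
        self._assets = {"1": {"name": "Logo"}, "2": {"name": "Cover photo"}}

    def get_asset(self, asset_id):
        return self._assets.get(asset_id)

def handle_request(request_xml, server):
    """Interpret an XML request, call the server, and translate results to XML."""
    request = ET.fromstring(request_xml)
    response = ET.Element("response")
    for query in request.findall("get-asset"):
        asset_id = query.get("id", "")
        asset = server.get_asset(asset_id)
        if asset is None:
            # Failures are reported as markup too, so the client parses one format.
            ET.SubElement(response, "error", {"id": asset_id}).text = "not found"
            continue
        element = ET.SubElement(response, "asset", {"id": asset_id})
        for name, value in asset.items():
            ET.SubElement(element, name).text = value
    return ET.tostring(response, encoding="unicode")

reply = handle_request(
    '<request><get-asset id="2"/><get-asset id="9"/></request>',
    FakeApplicationServer(),
)
print(reply)
```

  Keeping this translation layer separate means the application server's own interfaces can change without the integrator seeing anything but XML.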
  The first version of our gateway was built so we could build a Quark XPress XTension that could make use of Bmedia content information. The idea was to allow our customers to build print documents that are tied directly to images and meta-data contained in the database. The rest of this section describes our experiences with each end of the API development process, the construction of the API and the work required by third parties to integrate with these APIs.
 

Building our XML Gateway

We chose Sun's Java Web Server and the Servlet API for building our XML Gateway. Sun's Java Web Server handles all the details of HTTP for us, so we did not have to write any communication code ourselves. The Servlet API takes care of session management using cookies, so we were able to use provided objects to store special user information and that was all we implemented for dealing with clients.
  Now that client software could connect to our servlet, we needed to build our MIME and XML processing. We downloaded Java packages for each of these protocols and quickly implemented our XML I/O. All that was left for us to write was the code that handles a user's request.
  The result of this was that we had end to end communication working in less than one month. We then had to refine our DTD, extend the feature set, and manage protocol issues with our integration project, our Bmedia Quark XTension.
 

Integration 1: The Bmedia XT

A Quark XTension is written by coding to Quark's API. Quark's API library is written in C and our XT team used Metrowerks' PowerPlant framework for building the Macintosh interface for this XT in C++. PowerPlant has Internet classes that handle much of the MIME protocol. We downloaded a C XML parser and wrapped it in C++.
  So, just like writing the API using Java (which we develop and test on Macintosh, UNIX, and Windows), our integration took advantage of these standards on Macintosh using C++. After the XT interface was developed we spent three weeks integrating the client with the gateway. Most of the issues we encountered arose from our MIME and XML formats, which were still being refined during this integration. Presumably, any other integration with our current features would be much quicker because this markup is more stable. In fact, we wrote some simple Perl scripts in only a couple of days that use the gateway to retrieve and import information.
 

Results

  Our work made a good case for implementing APIs which use standards. We have been able to focus on our own object structure for implementation and documentation. We did not have to spend a great deal of time making the underlying protocols work effectively. Even more importantly, XML has provided a rich way for us to transmit information in a flexible manner. We did not have to spend large amounts of time trying to work around little problems that arise when actually working with real data.
  XML also provided a quick way to demonstrate interesting technologies that will provide tangible payback in future versions. This section concludes by describing a couple of ideas that we demonstrated successfully and thereby educated others in possible new directions for our product and the development of our product.
 

Tester

We had a few months between the point at which the XML Gateway was first working and the point at which the Quark XTension was communicating with it. In the interim, we wanted to test our gateway implementation. So we wrote a stand-alone Java application that implemented the Java Servlet API. This application takes input files in XML format, turns them into the proper MIME input, sends the input through the gateway servlet, and then creates the MIME output. We were then able to create a library of tests that can go through our gateway and hit our product servers.
  Recently, we took this tester application and created different types of tests. We are loading our system by running multiple instances of the tester. An even more interesting development for the tester was the realization that we could use XSL to process test results and extract internal database information. This means that we can build test scenarios that can run through a series of individual tests and use information that will change with different databases without having to rewrite our tests. This capability is proving to be very useful in making our whole product more robust.
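  The idea of feeding values from one test result into the next test can be sketched as follows. This is not the tester itself: it is written in Python, it substitutes the simple path lookups of the standard xml.etree.ElementTree module for the XSL processing described above, and fake_gateway, the step format, and all element names are hypothetical.

```python
import xml.etree.ElementTree as ET

def fake_gateway(request_xml):
    """Hypothetical stand-in for the gateway: creates an asset or fetches one by ID."""
    request = ET.fromstring(request_xml)
    if request.tag == "create":
        # A real database would assign a different ID on every installation.
        return '<result><asset id="db-517"/></result>'
    return ('<result><asset id="%s"><name>Logo</name></asset></result>'
            % request.get("id"))

def run_scenario(steps, gateway):
    """Run steps in order; each step may capture values for later steps."""
    variables, results = {}, []
    for step in steps:
        # Fill placeholders such as {asset_id} with values captured earlier.
        request_xml = step["request"].format(**variables)
        result = ET.fromstring(gateway(request_xml))
        for name, (path, attribute) in step.get("capture", {}).items():
            variables[name] = result.find(path).get(attribute)
        results.append(result)
    return variables, results

steps = [
    {"request": "<create><name>Logo</name></create>",
     "capture": {"asset_id": ("asset", "id")}},
    {"request": '<fetch id="{asset_id}"/>'},
]
variables, results = run_scenario(steps, fake_gateway)
print(variables["asset_id"], results[1].findtext("asset/name"))
```

  Because the second request is built from whatever ID the first one returned, the same scenario can run unchanged against databases that assign different IDs.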
 

Web Demo

Many of our customers have a long history working with print media. These companies are developing strategies for getting their information to the Web. Many of the assets are the same, or at least similar, and much of the meta-data is exactly the same. Since our product is intended for large enterprises and our company does a significant amount of Web-based work, we are now showing how managed data can be used across different media types.
  As we were developing the XML protocol, we wanted to keep an eye on using XSL to get to the Web. In two weeks, we selected an XSL processor and built a demonstration of Web sites being delivered directly from our Bmedia product. The demo shows how XSL can be used to deliver dynamic information in a consistent manner. Also, it shows how the same information can be delivered to multiple Web sites. By showing Bmedia, we can change information, show it being updated in a Quark publication and also show it updated on a variety of Web sites. This final step demonstrates clearly how these standards led us to a solution that creates opportunities rather than a great deal of extra work and support.
 

Conclusion

  When appropriate, leveraging standards for creating APIs can significantly ease many development issues. As standards emerge and clever uses of these standards are implemented, the APIs can rapidly become general purpose engines for new ideas. Our API development process has only just begun and we are already able to demonstrate a number of key benefits that the API can deliver. Over time, we look forward to seeing all the new implementations that our integrators will be able to build.
