Saturn Technical Publications: A Case Study with a Customer Focus   Table of contents   Indexes   E-commerce Standards in the Insurance Industry

Boston
Hamscher, Walter
 PricewaterhouseCoopers LLP 
 USA 
 
Walter Hamscher
 Director, Strategic Technology Services
PricewaterhouseCoopers LLP
  160 Federal Street Boston (Massachusetts)  USA (02110)
Email: Walter.Hamscher@us.pwcglobal.com Web site: http://www.pwcglobal.com
 Biography
 Walter Hamscher is a Director in the Strategic Technology Services practice of the PricewaterhouseCoopers Global Technology Centre. Walter specializes in technology analysis and strategy for companies in the converging arena of technology, information, communication and entertainment. Walter received his Ph.D. in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology and is the former Director of Global Research and Development for the legacy firm Price Waterhouse.
 

Introduction

 The simplicity and power of XML as a markup language have led to a number of efforts across PricewaterhouseCoopers in which XML plays a critical role, enabling the publication of objects that have the dual character of text meant for presentation and of data meant for consumption by programs. These efforts include a markup language for describing company financial performance information, a language for describing complex financial instruments, and a language for information presented and used in energy-related industries. This paper is about ways in which companies are building not only better document systems, but also new businesses based on tools and open standards for publishing documents. The talk will cover some pitfalls and a few lessons learned, along with a forecast for the future evolution of XML.
 

Point of view

 PricewaterhouseCoopers LLP is the world's largest professional services firm, with over 150,000 professional staff providing audit, business advisory services, business process outsourcing, financial advisory, HR solutions, management consulting, tax, and legal services worldwide. Consequently, we view XML from the perspective of both information users and systems integrators. PwC does use XML in the creation and delivery of internal publications; however, the particular perspective of this paper will be systems integration: how we are leveraging XML and participating in XML-based standards efforts to the benefit of our clients in various industries including financial services, energy, and publishing. Given that focus, it is important to remember that "systems integration" is literally that: the integration of systems. Some of those systems may already exist, some are new systems built from packages, and some require custom software development; in every case, there are new business processes and, often, new business models motivating the creation of those systems and enabled by them. For PricewaterhouseCoopers, XML-based standards are fundamental enablers of systems integration. XML is key not only at the "micro" level of building browser-based custom applications, but also at the level of integrating enterprise data from multiple sources, managing content for publication in multiple forms, and exchanging data over the Internet to create new business models.
 

Financial Reporting

 

What's the problem?

 The financial performance of any business entity has to be periodically reported to its owners and other interested parties. The annual report of a public company to its shareholders is probably the most familiar example, but there is a wide variety of different entities (private companies, non-profit organizations, government units, mutual funds) and a wide variety of users for the information (investors, tax authorities, association members, government statisticians). Professional organizations, primarily the AICPA (American Institute of Certified Public Accountants), actively set standards that define what should be disclosed, the meaning of the terminology used in disclosures, and the way in which the information presented should be structured so as to be fair, transparent, and verifiable. Practically since the invention of money, there has been the need to "create once" a set of financial records, to "publish many" different summary presentations of those records, and for users of the information to be able to make meaningful comparisons between financial reports from different entities.
 

How do XML based standards provide a solution?

 XFRML (XML-based Financial Reporting Markup Language) is an XML-based framework whose vocabulary of elements and attributes reflects AICPA accounting standards and defines content models, documents and style sheets that facilitate the exchange and presentation of financial information based on those standards. A fundamental goal of XFRML is to enable "create once, publish many" and to reduce the overall effort of different parties involved in preparing and reformulating financial statements for different purposes. To understand XFRML itself, it is important to understand that there are three main structural forms appearing in any financial report: tables, notes, and sectioned formatted text.
 At the most basic level, XFRML defines financial tables of various types, such as balance sheets and income statements, with content models that define the possible line items, subtotaling and other relationships among them. These content models assign attributes to the individual reported numbers so that queries from analytic applications can reliably extract a given number from an XFRML document. A design goal of XFRML is that (for example) a software application that requires the "Cash" figure for the most recently reported period of a given business entity should be able to extract from an XFRML document the number, its scale (thousands or millions), currency, period, and so forth. Because accounting standards generally require presentation of multiple time periods so as to facilitate historical comparison, financial content itself is naturally tabular. The tabular style is also familiar to the people who might have to create the content using a text editor. Style sheets can then render the tabular data in any number of different table formats and styles, to any level of financial detail actually present in the original XFRML document source. XFRML will have content models for a number of widely used financial information tables, including Balance Sheets, Income Statements, Cash Flow statements, and Tax reconciliation tables, thereby facilitating the building of preparation tools that can validate the results to some extent.
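 The "Cash" extraction goal described above can be illustrated with a minimal sketch. The XFRML design work is still in progress, so every element and attribute name below (`financialReport`, `balanceSheet`, `lineItem`, `amount`, `scale`, `period`) and the company "Example Corp" are hypothetical stand-ins chosen for illustration, not the actual XFRML vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical XFRML-like fragment. All tag and attribute names are
# illustrative assumptions, not the real XFRML vocabulary.
SAMPLE = """
<financialReport entity="Example Corp">
  <balanceSheet currency="USD" scale="thousands">
    <lineItem name="Cash">
      <amount period="1998-12-31">1200</amount>
      <amount period="1999-12-31">1450</amount>
    </lineItem>
    <lineItem name="Receivables">
      <amount period="1998-12-31">800</amount>
      <amount period="1999-12-31">910</amount>
    </lineItem>
  </balanceSheet>
</financialReport>
"""

SCALE_FACTORS = {"units": 1, "thousands": 1_000, "millions": 1_000_000}


def latest_figure(xml_text, item_name):
    """Return the most recent reported number for a line item, with its context."""
    root = ET.fromstring(xml_text)
    sheet = root.find("balanceSheet")
    for item in sheet.findall("lineItem"):
        if item.get("name") == item_name:
            # ISO-formatted dates sort correctly as plain strings.
            latest = max(item.findall("amount"), key=lambda a: a.get("period"))
            scale = sheet.get("scale")
            return {
                "entity": root.get("entity"),
                "item": item_name,
                "period": latest.get("period"),
                "currency": sheet.get("currency"),
                "scale": scale,
                "value": int(latest.text) * SCALE_FACTORS[scale],
            }
    return None  # the line item is not present in this report


cash = latest_figure(SAMPLE, "Cash")
```

 Because the scale, currency, and period travel with the number as markup, the consuming application does not need to know how the preparer laid out the table.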
 The "notes" are an integral part of a financial report, since they disclose necessary additional information about the data, including facts such as how much of the receivables are assumed to be not collectable, write-offs of assets, restatements of prior period amounts, or extraordinary items. Although the presentation of these notes appears at first to be relatively unstructured (just footnotes of free text, sometimes including subsidiary tables), there are constraints and conventions over what is presented in the notes that also lend themselves to formulation in XML. For example, if the notes disclose that there is a 4% reserve against the receivables, this is a piece of data that can be tagged in such a way as to make it just as easily accessible to a software application as the receivables figure in the balance sheet itself.
 Finally, a substantial portion of a financial report consists of text which, although required--the description of the business, the management's discussion of the period's results, discussion of pending litigation--is free-form and cannot be structured in detail except to indicate units such as a section and its type. Although the design work is in progress, using XHTML for this material is a likely outcome.
 There are other existing electronic formats for financial reporting. In the United States, public companies must submit SGML-formatted electronic filings to the SEC (Securities and Exchange Commission) using the EDGAR (Electronic Data Gathering, Analysis and Reporting) system. These SGML filings are then redistributed via the Internet. The primary purpose of EDGAR filings is regulatory, since the SEC represents the interest of the investing public in ensuring that financial statements are accurate and timely. One of the primary EDGAR filing types is the 10-K annual report; today the 10-K filing contains required SGML tags indicating key financial numbers, the body of the 10-K can be formatted using HTML, and supporting material can be attached in PDF. The EDGAR DTD already includes a vocabulary of line items, which will map directly to some of the line items in XFRML, so that a valid EDGAR filing can be generated from an XFRML financial report. However, XFRML is meant to serve purposes that the EDGAR system was not designed for. The richer semantic markup of tables, notes and body in a financial report places more demands on the preparer than EDGAR does, but allows for the automation of reporting not only from US public companies to regulators, but from any business entity to investors, tax authorities, and other users. There are also EDI formats for reporting financial information, including ANSI X12 transaction set 821 and UN EDIFACT INFENT. One of the key distinctions between XFRML and these EDI formats is that XFRML is silent on issues of routing and transport: a valid XFRML document is not only party-neutral, but it also limits its use of controlled vocabularies to those few most essential to financial reporting.
 

So what?

 XFRML illustrates all three of the general implications of XML-based standards that were enumerated earlier. First, integration of data from multiple sources: the module of the XFRML standard that encompasses financial tables and numbers could be used as an export format for multiple subsidiaries of a company, as the first step in integrating the data for consolidated reporting. Second, publication of content in different forms: the same underlying XFRML document can be styled to present different document forms that will filter, summarize, or reorganize information along any dimension available. Third, enabling new business models: the opportunity to capture detailed and timely financial information structured with XFRML makes it easier for new businesses to add value and publish analyses and renderings of content aggregated from many company sources.
 

Financial Products

 

What's the problem?

 Most people are familiar with the highly automated nature of stock exchanges and electronic trading networks such as the New York Stock Exchange and NASDAQ, and consequently might be surprised to learn that many other types of financial exchanges have not reached nearly the same levels of automation. The trading of financial derivatives, foreign exchange deals, and interest rate swaps relies on telephone calls, faxes, and e-mail, and is largely executed through one-to-one relationships between dealers. Recently, secure Internet-based protocols such as FIX (Financial Information Exchange) have emerged that allow dealers to broadcast offers and bids and allow software to automatically execute the deals, thereby creating a more liquid, more efficient market. This works well for ordinary securities, because transactions are relatively simple: some quantity of money or securities is held by a party who wants to trade it for some other specific product at a specified price. However, with derivatives, the product is often "synthetic", consisting of a custom designed and arbitrarily complex package of options to buy or sell at some future date, swaps between instruments with detailed underlying risk and return profiles, and dependencies between the different elements of the whole package. Therefore, in order to automate such markets, it is necessary, at a minimum, to have a language in which such complex financial instruments can be precisely described.
 

How do XML based standards provide a solution?

 FpML (Financial Products Markup Language) is an XML-based language that is able to describe many financial instruments, particularly in the areas of foreign currency exchange, interest rate swaps, and derivatives. The focus of FpML is on the description of a proposed trade, and consequently the standard is defined in terms of modules (in the XHTML sense) that represent information such as who the deal parties are, the instruments encompassed in the deal, details about rates, cash payments, and contingencies. It differs significantly from FIX because it is independent of any particular transport or messaging protocol; indeed, FpML can be transported within FIX. FpML will be presented in greater detail in a different presentation at this conference.
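 The separation between a trade description and its transport can be sketched as follows. This is a minimal illustration only: the element names (`trade`, `parties`, `party`, `product`, `notional`, `fixedRate`, `envelope`), the party names, and the transport label are hypothetical and are not taken from the FpML or FIX specifications.

```python
import xml.etree.ElementTree as ET


def build_trade(payer, receiver, notional, currency, fixed_rate):
    """Describe a proposed trade as modules: who the parties are, and the product."""
    trade = ET.Element("trade")

    # Module 1: the deal parties (hypothetical element names).
    parties = ET.SubElement(trade, "parties")
    ET.SubElement(parties, "party", role="payer").text = payer
    ET.SubElement(parties, "party", role="receiver").text = receiver

    # Module 2: the instrument and its rate details.
    product = ET.SubElement(trade, "product", type="interestRateSwap")
    ET.SubElement(product, "notional", currency=currency).text = str(notional)
    ET.SubElement(product, "fixedRate").text = str(fixed_rate)
    return trade


def wrap_for_transport(trade, transport_name):
    """Place the unchanged trade description inside a transport-specific envelope."""
    envelope = ET.Element("envelope", transport=transport_name)
    envelope.append(trade)
    return envelope


trade = build_trade("Bank A", "Bank B", 10_000_000, "USD", 0.0575)
payload = ET.tostring(trade, encoding="unicode")
wrapped = ET.tostring(wrap_for_transport(trade, "example-messaging"), encoding="unicode")
```

 The trade description is byte-for-byte the same whether it is examined on its own or carried inside the envelope, which is the sense in which the description is independent of the messaging protocol.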
 

So what?

 FpML, like XFRML, illustrates the three general implications of XML-based standards. First, integration of data from multiple sources: FpML makes it possible for any number of different deal execution and analysis systems within a given organization to publish their fundamental business events--their deals--as richly structured objects with a common syntax and semantics. Second, publication of content in different forms: many different perspectives on any given trade are necessary, including the profitability, risk management and regulatory perspectives. FpML allows the object to be transported, transformed and presented in different ways to different users. Third, enabling new business models: FpML presents the opportunity to create entirely new intermediaries who could manage exchanges of different kinds of financial instruments, thus enabling what is today an essentially one-to-one activity to become a more efficient, dynamic many-to-many marketplace.
 

Enterprise Integration

 

What's the problem?

 In any company, even the most comprehensive integrated enterprise-wide systems must interoperate with other applications within and outside of the company itself. Enabling web-based e-Business applications to interface with legacy systems is one of the most compelling situations. However, any approach to integration based on periodic batch transfers of many transactions between systems is likely to be inadequate to meet the needs of today's enterprise for timely responses to new orders, changes in the availability of supplies, changes in customer profile information, and production changes. Today, the favored enterprise integration paradigm is loose coupling of independent systems that communicate in near real time by passing messages to each other in response to business events. Even the most sophisticated message handling middleware, however, is only a partial solution: enterprises must agree upon the content of the messages themselves, with a vocabulary of both "nouns" (customers, products, employees) and "verbs" (place an order, remove from inventory, receive a payment) to describe a meaningful business process.
 

How do XML based standards provide a solution?

 XML schemas and DTDs are today's best medium for formalizing agreement on the nouns, verbs, and grammar of messages to interface different enterprise systems and applications. One of the most extensive such frameworks is OAGIS (Open Applications Group Interface Specification) in which the message routing syntax and over 100 different sets of related business object document types are described using XML DTDs. Enterprise software vendors including SAP and PeopleSoft have interfaces based on these specifications; PricewaterhouseCoopers uses these specifications to implement integrated, enterprise-wide solutions. Another paper at this conference discusses enterprise integration and OAGIS in greater detail.
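 The idea of agreeing on "nouns" and "verbs" can be sketched with a small message dispatcher. This is an illustration under stated assumptions: the `message` element, its `verb` and `noun` attributes, and the field names are hypothetical, and the sketch does not reproduce the OAGIS business object document format.

```python
import xml.etree.ElementTree as ET


def make_message(verb, noun, fields):
    """Serialize a business event as a verb applied to a noun, plus its data fields."""
    msg = ET.Element("message", verb=verb, noun=noun)
    for name, value in fields.items():
        ET.SubElement(msg, name).text = str(value)
    return ET.tostring(msg, encoding="unicode")


# Toy receiving system: a stock level per product code.
inventory = {"widget": 10}


def remove_from_inventory(doc):
    """Handler for the agreed ("remove", "inventoryItem") message type."""
    sku = doc.findtext("sku")
    quantity = int(doc.findtext("quantity"))
    inventory[sku] -= quantity
    return inventory[sku]


# The shared vocabulary: only verb/noun pairs both sides have agreed on.
HANDLERS = {("remove", "inventoryItem"): remove_from_inventory}


def dispatch(xml_text):
    """Route an incoming message to the handler for its verb and noun."""
    doc = ET.fromstring(xml_text)
    key = (doc.get("verb"), doc.get("noun"))
    handler = HANDLERS.get(key)
    if handler is None:
        raise ValueError(f"no handler agreed for {key}")
    return handler(doc)


remaining = dispatch(make_message("remove", "inventoryItem",
                                  {"sku": "widget", "quantity": 3}))
```

 The middleware only has to deliver the text; the value lies in both systems sharing the table of verb and noun pairs and the fields each pair carries.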
 Work in progress at the COM for Energy Foundation is motivated by somewhat different needs unique to the oil and gas industries. In these industries, "upstream" engineering applications for managing exploration and drilling must interface with financial applications that manage risk and track royalties, and with ERP systems that manage logistics, production, and "downstream" delivery. Upstream, the applications and their data resemble CAD models, with richly annotated, hierarchical structures; downstream, the applications deal with financial data that is regular and tabular; throughout the systems there are many kinds of documents that mix formatted text with other structures. The inherent complexity of any integration effort is magnified in the energy industry because of the many different vocabularies (scientific, engineering, financial, etc.) applied to each element of the complex supply chain, and the complexity of business interactions as each enterprise attempts to optimize risk and profit. XML is key to the Energy Integration Platform (EIP) that PricewaterhouseCoopers is building; EIP allows objects (drilling sites, reservoirs, equipment) to be described in rich detail in an application-independent way, and then used in all the message payloads that connect applications residing on different software platforms.
 

So what?

 OAGIS and the EIP both illustrate how XML enables communities of software vendors, users, and integrators to isolate issues of transport and message routing and allow a focus on the information needing to be exchanged. This enables those communities to make progress on common interface specifications more rapidly than in the past and enables information trading communities to thereby expand more rapidly.
 

Technology Evolution

 Technology forecasting is a tricky business, since it is intimately bound up with business needs, economics, and psychology. Moore's Law, concerning the density of transistors on silicon, is about the only forecast one can make with great reliability. Hence, processing power is getting cheaper faster than storage is; furthermore, because the cost of network transmission is limited by processing, not storage, we expect data transmission costs to fall at about the rate of processing costs. Regarding other information technologies, as a general rule, a technology evolves to dominance when it is credited with providing a differential advantage to those who use it. Natural selection then takes over, and the survivors will be familiar with the technology they adopted and not with others. In combination with Metcalfe's Law, which says that the value of a networking technology increases as the square of the number of users, one sees a very fast moving phenomenon, in which an initial small advantage to early adopters spreads rapidly and is magnified into near universal adoption. XML-based standards are certainly benefiting from this effect and we expect that to continue for several more years--a forecast that is probably as safe as forecasting that Moore's Law will not be repealed.
 A key feature of XML is that it can be used to serialize any data object in a verbose, self-describing, relatively order-independent fashion. This is good, but one consequence is that XML provides no differential advantage in situations requiring the batch transfer of massive data files between well-known origination and consuming systems. In situations like that, comma-delimited text and fixed-length data formats do not look too bad. Of course, the batch transfer of massive data files is not what e-Business, enterprise integration, or content repurposing require; these require continuous update messages (using, e.g., BizTalk), nested structures that summarize other content (using, e.g., CDF or ICE), or dynamic assembly and rendering (using, e.g., JSP, ASP, SMIL). The companies that dominate will be those that succeed at connecting their applications to other companies' applications, and connecting their own existing applications to each other, in ways that make them react in near real time, using message formats that can be examined and debugged without special tools when things go wrong.
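 The cost of verbosity in the batch-transfer case can be made concrete with a small comparison. The sketch below serializes the same flat records as comma-delimited text and as XML and compares the sizes; the record layout and field names are invented for illustration, and the exact ratio will vary with the data and tag names chosen.

```python
import csv
import io
import xml.etree.ElementTree as ET

# 1,000 flat records of the kind moved in a bulk transfer (invented data).
FIELDS = ["id", "price", "quantity"]
records = [{"id": i, "price": 100 + i, "quantity": 5} for i in range(1000)]


def to_csv(rows):
    """Comma-delimited form: field names appear once, in the header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_xml(rows):
    """Self-describing form: every value is wrapped in its own named tags."""
    root = ET.Element("records")
    for row in rows:
        rec = ET.SubElement(root, "record")
        for name in FIELDS:
            ET.SubElement(rec, name).text = str(row[name])
    return ET.tostring(root, encoding="unicode")


csv_size = len(to_csv(records))
xml_size = len(to_xml(records))
ratio = xml_size / csv_size  # how many times larger the XML form is
```

 For regular, tabular data the tags repeat on every record, so the XML form is several times larger. The same self-description is what makes an individual message readable without special tools, which matters more for event messages than for bulk files.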
 XML, like any really useful programming abstraction, can unfortunately lead designers to misjudge the efficiency implications of their designs, and some systems based on XML will fail to perform adequately in terms of efficiency and therefore scalability. XML is a fine way to move information between systems, but in nearly all scenarios, it is necessary to map the XML back into a relational or an object model for storage and manipulation; hence, our databases need to support XML natively, including efficient, built-in validation, revalidation, search, etc. Tools that do this efficiently are still maturing, and are in some cases still under development, so most corporate IT organizations do not have the experience to make confident decisions about this vital aspect of XML-based systems. The industry lacks any XML-oriented performance benchmarks from an independent organization, a necessary condition for intelligent decisions about really large-scale deployments. Once the tools mature, it will be extremely hard for any new software application to justify using a non-XML format, or at least to justify omitting some sort of XML import and export format. However, this is still some way off; in the meantime the "XML format" will in many cases consist of a proprietary DTD or Schema, probably with Base64 encoded data embedded in it, ostensibly for reasons of efficiency.
 Perhaps the most important thing about XML is that it can raise the level of discourse in conversations about systems integration, so that the occasionally mysterious concept of meta data is now more accessible, and in fact has a tangible manifestation that everyone can agree on. Meta data, data modeling, and schema definitions are moving to center stage. The notion of DTD-free XML documents will fade, because although it is interesting that the syntax of XML is so simple that programs can almost mechanically infer the DTD from document instances, document types with well-defined schemas will be so much easier to write programs for that they will dominate. Reusability of schema fragments, as postulated in the modules of XHTML, is an important goal, but is unlikely to see much actual employment in the next few years--not even for XHTML, since it will take some years before XHTML is fully supported by creation and rendering applications. One blocking factor is that multiple competing schema standards will emerge, and it seems very unlikely that an XML document will be able to usefully import definitions from schemas that were themselves defined using different schema languages. Another blocking factor is that schemas have to stabilize, be registered, and be used by multiple applications before it is really safe to start building other schemas around them. Namespaces alone seem inadequate for combining fragments, especially fragments which might have multiple versions, since namespaces alone don't provide mechanisms for transitive inheritance, overriding of defaults, or other features that support compositionality in other languages. However, although it is uncertain exactly when the specifications will settle, when a large body of schemas to share will exist, and when the tools will mature, the direction is clear, and commitment to XML-based standards has clear benefits.
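 The remark that a program can almost mechanically infer a DTD from document instances can be illustrated with a minimal sketch. It records only which child elements and attributes were observed for each element type in one instance; it does not recover ordering, cardinality, or optionality, and the sample document is invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def infer_structure(xml_text):
    """Map each element name to (observed child names, observed attribute names)."""
    children = defaultdict(set)
    attributes = defaultdict(set)
    root = ET.fromstring(xml_text)
    for elem in root.iter():  # visits the root and every descendant
        children[elem.tag].update(child.tag for child in elem)
        attributes[elem.tag].update(elem.attrib)
    return {tag: (sorted(children[tag]), sorted(attributes[tag]))
            for tag in children}


# Invented sample instance.
doc = "<order id='1'><item sku='a'/><item sku='b'><note>x</note></item></order>"
structure = infer_structure(doc)
```

 The inferred structure reflects only what this one instance happened to contain: nothing says whether `note` is optional or repeatable, or whether an `order` may carry other children. That gap between observed and intended structure is one reason explicit, agreed schemas are easier to program against.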
 

Selected References

 
  •  COM for Energy: www.com4energy.org
  •  E-Business Technology Forecast: www.pwcglobal.com
  •  FpML: www.fpml.org
  •  OAG Business Object Documents: www.openapplications.org
  •  XFRML: www.xfrml.org
