
 Metadata 
 

PRISM

 e-commerce through metadata
 Linda Burman
 President, L. A. Burman Associates
 Co-Chair, PRISM Working Group
 23 Hambly Ave., Suite 300, Toronto, Ontario M4E 2R5, Canada
 Phone: 416.699.7198 Fax: 416.699.1178 email: linda@laburman.com web site: www.laburman.com
 Biography
 Linda Burman is President and CEO of L. A. Burman Associates, Co-chair and Founder of the PRISM (Publishing Requirements for Industry Standard Metadata) Working Group, Co-author of Mastering XML (Sybex, 1999), Manager of Industry Relations for the XML Handbook, and XML Instructor at the University of Toronto.
 L. A. Burman Associates is an XML consulting firm founded in 1994 to leverage the phenomenal growth in online publishing and the information industry. The company services large publishers, software/hardware vendors, non-profit consortia such as the GCA (and IDEAlliance) and OASIS, and the investment community. L. A. Burman Associates' expertise covers a wide range of activities including XML standards development, strategic market and business planning/implementation, business development, investigation for due diligence, industry and software analysis, sponsorship programs and training.
 Prior to starting her own company, Burman was Director of Worldwide Marketing at SoftQuad International, and before that she was the Publishing Evangelist at Apple Computer (Cupertino). She has also held technical and business roles at other hardware and software companies in data communications and publishing.
 Ms. Burman is recognized as an XML industry expert. She also sits on the Advisory Councils of Foundry Ventures Inc and the Baycrest Centre for Geriatric Care. And in her former life she taught high school.
 Abstract
 The PRISM Working Group is sponsored and hosted by IDEAlliance, a.k.a. GCARI, the Research Institute of the Graphic Communications Association (GCA). It was founded by the author in response to a critical need in the marketplace for an extensible XML metadata standard for repurposing, aggregating, syndicating, personalizing and post-processing magazine, catalog, book, news and mainstream journal content. This paper describes the business drivers for the PRISM Working Group members, some of the challenges that the group has encountered along the way, and the progress and achievements to date.
PRISM
 

PRISM: metadata to improve the bottom line

 

Rationale for joining the Working Group: The Biz Problems

 Content providers of all types are faced with continuous challenges to create new revenue vehicles and save money with existing ones. They need to capture, manipulate, combine, shred, protect, manage, personalize, post-process and recombine content for multiple media without invoking the labor-intensive, time-consuming processes of today. Consumers expect content to be delivered on print media, hand-helds, mobile devices, screens, and kiosks, formatted on the fly from one source in a number of formats, and looking good, if not gorgeous!
 Changing technology is driving the creation of a new business environment. There are new kinds of business relationships, new technological requirements and more advanced technologies, new methods for sharing content, and totally new types of content such as hybrids from multiple sources for print, streaming media and interactive. For instance, unprecedented kinds of business relationships, like the alliance between MSNBC and the Washington Post, require fast, accurate, automated sharing of large volumes of data in multiple formats, in real time.
 Today, in most shops, repurposing consists mostly of cutting and pasting since there is no reliable way to automatically retrieve similar types of content. Also, lack of agreement among publishers on how to describe various pieces of content makes aggregation very difficult. For instance, when Getty Images acquires a new company, all of the new images must be integrated into the existing collection. This is an enormous task today because there is no consistency of metadata! And what makes it even worse is that there is no common language among software tools that create, store and manipulate these pieces of content.
 Syndication is also a challenge. Although a standard XML communications protocol for syndication called ICE (Information and Content Exchange) has been developed, there is no standard way to automatically describe the data that is being syndicated.
 Publishers, aggregators, syndicators and any kind of “re-publishers” must also have access to the rights and permissions associated with each content component, not just at the document level. And that information must be available, mostly automatically. Some of the kinds of restrictions that need to be considered are geographic, time, language, market, format, alterations, and exclusive use. Knowledge is also required about the use of freelancers’ work, for instance, and whether content can be used on a partner’s website, and so on.
 Today, rights and permissions are dealt with mostly via phone calls, pencil and paper, email, faxes and paper contracts sent by mail. These processes are very labor-intensive and highly unreliable since, for instance, one person may quote one set of permissions and prices based on his/her research and another person may discover something quite different.
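 To make the idea of machine-readable rights concrete, the following is a minimal Python sketch of a rights record attached to a single content component, covering a few of the restriction types listed above (geographic, time, format, alterations). The class name `Rights`, its fields and the function `may_use` are assumptions made for illustration only; they are not taken from the PRISM vocabulary.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional, Set


@dataclass
class Rights:
    """Hypothetical rights record for one content component (not PRISM syntax)."""
    regions: Set[str] = field(default_factory=set)   # empty set = no geographic limit
    formats: Set[str] = field(default_factory=set)   # empty set = any format allowed
    valid_from: Optional[date] = None                # None = no start limit
    valid_until: Optional[date] = None               # None = no end limit
    alterations_allowed: bool = False


def may_use(rights: Rights, region: str, fmt: str, on: date,
            altered: bool = False) -> bool:
    """Return True only if the requested use satisfies every restriction."""
    if rights.regions and region not in rights.regions:
        return False
    if rights.formats and fmt not in rights.formats:
        return False
    if rights.valid_from and on < rights.valid_from:
        return False
    if rights.valid_until and on > rights.valid_until:
        return False
    if altered and not rights.alterations_allowed:
        return False
    return True


# Illustrative data: a photo licensed for web use in Canada and the US during 2000.
photo_rights = Rights(regions={"CA", "US"}, formats={"web"},
                      valid_from=date(2000, 1, 1), valid_until=date(2000, 12, 31))

print(may_use(photo_rights, "CA", "web", date(2000, 6, 12)))  # request inside all limits
print(may_use(photo_rights, "FR", "web", date(2000, 6, 12)))  # region not licensed
```

 With a record like this travelling alongside each component, two people answering the same permissions question would be consulting the same data rather than their own research.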
 Another issue that frustrates publishers and consumers alike is that there is no way to find separate pieces of content that were published together at one point in time, such as a New York Times special supplement. The reason is that content components carry no usage history information. Similarly, there is no way today to find letters of correction and subsequent articles, because the relationships are not referenced. Nor are relationships within a particular article tracked.
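 As a rough illustration of what referenced relationships would make possible, the Python sketch below stores relationships as (subject, relation, object) triples and looks up everything published in one supplement, or everything that corrects one article. The identifiers and the relation names `partOf`, `corrects` and `followsUp` are invented for this example and are not PRISM terms.

```python
from collections import defaultdict

# (subject, relation, object) triples; all identifiers and relation names are made up.
RELATIONS = [
    ("article-17", "partOf", "supplement-2000-03"),
    ("photo-4", "partOf", "supplement-2000-03"),
    ("sidebar-9", "partOf", "supplement-2000-03"),
    ("letter-88", "corrects", "article-17"),
    ("article-52", "followsUp", "article-17"),
]


def build_index(triples):
    """Index subjects by (relation, object) so a lookup needs no full scan."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[(relation, obj)].append(subject)
    return index


def related(index, relation, target):
    """Return, in sorted order, every subject that has `relation` to `target`."""
    return sorted(index.get((relation, target), []))


index = build_index(RELATIONS)

print(related(index, "partOf", "supplement-2000-03"))  # pieces published together
print(related(index, "corrects", "article-17"))        # letters of correction
```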
 The hopes of resolving at least some of these barriers were the drivers that brought the PRISM Working Group members together.
 

Goals

 Initial Primary Goal: to develop an XML metadata vocabulary by mid 2000 specifically for the magazine, catalogue, mainstream journal, news and book industries.
 
  • Publishers must agree on a set of metadata descriptions so that when they send content to each other over the web using standard communications protocols (one of which is ICE), both the sender and receiver will know what they’re getting.
  • The metadata vocabulary will also make it possible for users of aggregation sites to find information in a reliable way and for the aggregators themselves to manage that content.
  • Individual publishers will also have a standard-but-extensible vehicle for managing their own re-use of information across their own content sources.
  • Software tool vendors can then incorporate support for this standard metadata vocabulary to provide off-the-shelf tools without complicated integration.
  • The metadata vocabulary must also work for archiving and search and retrieval.
  • The vocabulary must be implementable when it is delivered.
  • The metadata vocabulary will leverage and reference related standards and technologies as possible and as appropriate, such as RDF, the Dublin Core, NITF, NewsML, DOI, INDECS, ICE, DPRL and so on.
  • Alliances with other standards groups must be built to ensure a flow of information and to encourage adoption of PRISM by other initiatives.

    The Working Group

     Modeled on the successful ICE Authoring Group, which is also hosted by IDEAlliance, the PRISM Working Group is made up of a combination of content providers, integrators and software tool vendors. The thought process goes something like:
     
     If there are no tools, there will be little adoption. If the vocabulary is not the right vocabulary - if it does not work for diverse types of publishing - there will be little adoption.
     At its inception, content providers and software vendors could join the Working Group by invitation only, because it was crucial to provide the right mix of vendors and providers and it was also critical to cover a variety of types of publishing and software tool functions. However, today, any interested company may join as long as it makes the commitment to assign the appropriate resources. Another level of membership, called the Network Member Level, is also being established. Network members will be able to track progress, provide and receive feedback, and get early access to specifications.
     Working Group members currently include:
     
  • Adobe Systems
  • Artesia Technologies
  • Banta Integrated Media
  • Cahners Business Information
  • Condé Nast Publications
  • Getty Images
  • iCopyright.com
  • International Data Group/ITworld
  • Kinecta
  • KPMG
  • MarketSoft
  • Metacode Technologies
  • Quark Inc
  • Sothebys.com
  • Time Inc.
  • Vignette Corp
  • Wavo Corp

    Timelines

     The aggressive milestones that the group established called for the release of a specification by mid 2000. At the time of writing, the work is on track to produce the first version of the specification in that timeframe.
     

    PRISM’s Progress

     

    Background

     The Working Group began meeting in June 1999 and has continued to meet every month since then - although some meetings involve only subcommittees.
     Creating a new standard is an exciting but challenging activity, especially in a space that is already inhabited by other specifications, all of which need to be examined, understood, leveraged and/or referenced. Thus the Working Group initially proceeded in two directions. The Group decided that the requirements document would be a set of very specific “use case” scenarios describing business problems that content providers need to solve. These scenarios have proved extremely useful for determining scope and actual vocabulary requirements. As new content providers join the Group, they provide additional scenarios to ensure that the group is creating the “right” vocabulary.
     The Group also researched and analyzed existing standards and initiatives to determine which aspects were not being addressed. The Group’s specific goal was to avoid “re-inventing the wheel” and to make use of existing specifications where possible. A secondary goal was to publish a catalogue describing existing specifications and how they relate to PRISM.
     During this research phase it became evident that some of PRISM’s interests coincided with those of the IPTC, who had developed NITF (News Industry Text Format) and were in the process of developing NewsML. Thus a collaborative relationship was crafted and representatives from each group have been attending the other group’s meetings since January. The IPTC has already developed some aspects of metadata that are very valuable to PRISM. But the IPTC’s work is not as focused on component and general relationships, rights and permissions and other requirements that are more specific to magazines and catalogues.
     

    Getting the work done

     PRISM is using a supply chain framework to delineate the various aspects of required metadata. Objects of any media type are captured, possibly archived, and then passed to another process for manipulation and aggregation into a compound object, such as a magazine article. The compound objects are then delivered to multiple media in multiple formats. They may then be re-stored as individual components or passed outside the enterprise as compound objects for aggregation, syndication, re-use and retrieval by another business and/or by consumers. Post-processing may occur at any step. Metadata to facilitate these processes, with the probable exception of delivery, is the concern of PRISM, as is metadata to manage content objects and to describe their rights and permissions and the relationships between and among them.
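     The flow described above, components aggregated into a compound object and later re-stored as individual components, can be sketched in Python as follows. This is only an illustrative model under stated assumptions: the class names, the functions `aggregate` and `disaggregate`, and the `usedIn` field are hypothetical and do not come from the PRISM specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Component:
    """A single captured object (text, image, ...) with its own metadata."""
    ident: str
    media_type: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class CompoundObject:
    """An aggregation of components, e.g. a magazine article."""
    ident: str
    parts: List[Component] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)


def aggregate(ident: str, parts: List[Component], **metadata: str) -> CompoundObject:
    """Combine components into one compound object carrying its own metadata."""
    return CompoundObject(ident, list(parts), dict(metadata))


def disaggregate(compound: CompoundObject) -> List[Component]:
    """Re-store the parts as standalone components, recording where each was used."""
    restored = []
    for part in compound.parts:
        meta = dict(part.metadata)          # copy, so the original is left untouched
        meta["usedIn"] = compound.ident     # hypothetical usage-history field
        restored.append(Component(part.ident, part.media_type, meta))
    return restored


text = Component("text-1", "text/xml", {"creator": "A. Writer"})
photo = Component("photo-4", "image/jpeg", {"creator": "B. Photographer"})
article = aggregate("article-17", [text, photo], title="Sample article")

for part in disaggregate(article):
    print(part.ident, part.metadata)
```

     The point of the sketch is that metadata stays attached to each component through aggregation and back out again, which is what makes later re-use and retrieval possible.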
     The Working Group is developing a framework for the metadata vocabulary and expects to use RDF to describe the relationships of components. The Group has formed subcommittees devoted to each aspect described above. Subcommittee work is reviewed by the whole Working Group. PRISM requires consensus on all decisions.
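     As a hedged illustration of how RDF might describe a component and its relationship to another object, the Python sketch below builds a small RDF/XML description using the standard library. The RDF and Dublin Core namespace URIs are the published ones; the `ex:` namespace, the `isPartOf` property name and all resource URLs are placeholders invented for this example, since the PRISM vocabulary itself is not reproduced here.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
EX = "http://example.org/hypothetical-vocab#"  # placeholder, NOT a PRISM namespace

ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)
ET.register_namespace("ex", EX)


def describe(about: str, title: str, creator: str, part_of: str) -> ET.Element:
    """Build an rdf:RDF element describing one resource and what it is part of."""
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": about})
    ET.SubElement(desc, f"{{{DC}}}title").text = title
    ET.SubElement(desc, f"{{{DC}}}creator").text = creator
    # The relationship is expressed as a property pointing at another resource.
    ET.SubElement(desc, f"{{{EX}}}isPartOf", {f"{{{RDF}}}resource": part_of})
    return root


doc = describe(
    about="http://example.org/articles/17",
    title="Sample article",
    creator="A. Writer",
    part_of="http://example.org/issues/2000-03",
)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

     Expressing the relationship as a property whose value is another resource is what lets a receiving tool follow the link from a component to the issue or supplement it appeared in.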
     

    Achievements

     In February 2000, at the Seybold Seminars Conference in Boston, the Group presented an interoperability demo of a scenario involving seven software tools exchanging and operating on content tagged with PRISM metadata. The audience was struck by the applicability of the technology demonstration to the issues they struggle with on a daily basis. The demonstration was repeated at XTech in San Jose later that month, again receiving very positive responses.
     At the time of writing, work has proceeded on both vocabulary descriptions and framework definitions such that the Working Group expects to release version 1.0 of the specification in the June timeframe.
     In the exhibition area at this conference (XMLEurope 2000), the PRISM Working Group members are demonstrating collaborations between content providers and software tool makers using PRISM metadata.
     

    Summary

     In summary, it is clear that the industry needs a standard metadata vocabulary to realize the potential of online publishing and e-commerce in the publishing industry. PRISM provides a framework for the interchange and preservation of content and metadata. PRISM also provides a set of controlled vocabularies with which to describe the content being interchanged. Thus PRISM will provide a common interchange that greatly expands the market for licensed content.
     Acknowledgements
     The author wishes to thank the members of the PRISM Working Group for their ongoing dedication to the specification and, in particular, Deren Hansen of Wavo Corporation for his fine work as editor of the specification and Ron Daniel of Metacode Technologies for his leadership as co-chair of the Working Group.
