
 

Streamlined Multimedia Production and Distribution Line

 Jeanne   El Andaloussi
  Director of Operations
  AIS 
Email: jela@ais.berger-levrault.fr
Advanced Information Systems
15-17 rue Rémy Dumoncel
 F-75014 Paris   France
Phone: +33 1 40 64 43 00
Fax: +33 1 40 64 43 10
 
Biographical notice:
 

Jeanne El Andaloussi is Director of Operations and Training at AIS/Berger-Levrault in Paris, a systems integrator for editorial production systems based on XML/SGML. She has extensive experience in SGML training and in corporate documentation standards, tools and methodologies. She is co-author of the book "Developing SGML DTDs: From Text to Model to Markup", published by Prentice Hall in 1996.
 Tanneguy   Dulong
  Project Leader
  AIS 
Email: tdul@ais.berger-levrault.fr
Advanced Information Systems
15-17 rue Rémy Dumoncel
 F-75014 Paris   France
Phone: +33 1 40 64 43 00
Fax: +33 1 40 64 43 10
 
Biographical notice:
 
Tanneguy Dulong is Project Leader at AIS/Berger-Levrault in Paris. He has experience implementing SGML-based publishing processes in the aircraft industry. He has been involved in the project described here from its very start.
 
ABSTRACT:
 
This session will present the Eurodelphes project, which was launched by the European Community in February 1998. The project aims to enable:
  •  providers of very large video collections to index those resources, to subtitle the video sequences in several languages, and to offer them to commercial publishers;
  •  commercial publishers to author interactive electronic books using those video resources as illustrations;
  •  users to browse those interactive books on the Web, provided they have the video resources locally (on a local server loaded from DVDs).
 
This is not the first multimedia production and distribution line ever set up, but it is one of the first to use XML to handle not only text but also video.
 

Some context

 

Goals

 
The issues presented in this paper have been tackled during the design and development of the Eurodelphes project to date. The project is scheduled to end in June 2000.
 
It aims to build a multimedia production and distribution line, focusing on the use of large amounts of good-quality video and its impact on production processes and tools.
 
The proof of concept for this project is an interactive handbook on the history of the 20th century (the only century for which video archives exist). The focus of this paper is on the technology.
 

Actors

 
The consortium in charge of the project brings together partners for each phase of the production line:
  •  the archive providers (INA, RAI, SWR);
  •  the multimedia publishers (Nathan, Klett, Giunti);
  •  the pedagogical team (Paris VII, Centre Tudor, FWU);
  •  the technical team (AIS, BPS, DTAG, INA, Giunti, TubKom, Ponton).
 
For each phase, the partners come from France, Italy and Germany.
 
The differences in technical cultures are a major difficulty. Partners have so little insight into the actual work of the other actors that they do not understand their constraints. For example, the calendar of publishers working for public education is based on the school year, which leaves very short lead times for implementing the required tools.
 

Some figures and facts

 
  •  25 to 30 hours of AV documents from three countries fully indexed
  •  synchronized subtitles of all the sequences in three languages
  •  six partners authoring documents and creating cross-document links
  •  fifty people involved
 

Architecture and Implementation

 

A three step design

 
The architecture of the multimedia production line includes three functional blocks:
 
 
  1.  Archive providers document AV material in XML so as to produce a catalog of AV resources.
     In the catalog, each AV document is divided into sequences, and each sequence is:
    •  Identified with an ID;
    •  Defined by its start time within the document and its duration;
    •  Described with a text substitute;
    •  Indexed with keywords (from the shared thesaurus);
    •  Subtitled in three languages.
     Each AV document is also identified by an ID and a collection of metadata, useful for selection by the publishers and search by the final users:
    •  Document type (report, interview,…)
    •  Date of events
    •  Publication date
    •  Famous individuals
    •  Location
    •  Content provider (INA, RAI,…)
     The project has developed a specific video indexing and subtitling tool for use by the archive providers. Translation into the three languages is done before delivery to the publishers.
  2.  Publishers receive the catalog from the archive providers, and they search it for the sequences relevant to their project. They create a textual XML-based electronic book and add links to the chosen video sequences.
     The project has developed a set of tools for publishers, based on commercial SGML tools (Adept from ArborText and Dual Prism from INSO). The text is translated before delivery to the distribution site.
  3.  Publishers install the finished electronic handbook and the attached catalog over the Internet on a remote platform, where they are dynamically transformed into rich HTML for on-line distribution.
     Pages are produced dynamically in HTML 4, with JavaScript generated inside the pages, when the Web server makes a request following a query by the user. This application is developed using Dual Prism. Windows Media Player is called to display video sequences inside the HTML pages, using the identifier of the document, the start time of the sequence and its duration.
     Using an AV fragment catalog to access the video sequences may seem somewhat heavyweight, but the indirection keeps the indexing independent of the actual video format: for instance, the project uses high-quality video locally and low-quality video on the Web.
     AV documents are delivered to local sites (school or university networks) through various channels: DVD and satellite broadcast are currently being set up in the project.
     The end-user platform, developed by another partner, is a PC connected to both the local network and the Web: the Web is used for search and retrieval of the textual data, and the local server for display of the relevant AV sequences.
     The project is developing a specific end-user environment, which includes an enhanced browser and a local server application.
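
The catalog structure described in step 1 can be illustrated with a small sketch. The element and attribute names below (`catalog`, `avdoc`, `sequence`, `keyword`, `subtitle`) are assumptions made for illustration only; the paper does not reproduce the project's DTD, and the real subtitles are synchronized in finer detail than one text per sequence.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical catalog fragment: names and values are illustrative,
# not taken from the Eurodelphes DTD.
CATALOG_XML = """
<catalog>
  <avdoc id="doc01" type="report" provider="INA">
    <sequence id="seq01" start="0" duration="45">
      <description>Crowds gather in the street.</description>
      <keyword ref="kw12"/>
      <subtitle lang="fr">Texte du sous-titre</subtitle>
      <subtitle lang="de">Text des Untertitels</subtitle>
      <subtitle lang="it">Testo del sottotitolo</subtitle>
    </sequence>
  </avdoc>
</catalog>
"""

@dataclass
class Sequence:
    seq_id: str
    doc_id: str
    start: float        # seconds from the start of the AV document
    duration: float     # seconds
    description: str    # the "text substitute" for the sequence
    keyword_refs: list  # references into the shared thesaurus
    subtitles: dict     # language code -> subtitle text

def load_catalog(xml_text):
    """Parse the catalog and return a dict of Sequence objects keyed by ID."""
    root = ET.fromstring(xml_text.strip())
    sequences = {}
    for doc in root.findall("avdoc"):
        for seq in doc.findall("sequence"):
            sequences[seq.get("id")] = Sequence(
                seq_id=seq.get("id"),
                doc_id=doc.get("id"),
                start=float(seq.get("start")),
                duration=float(seq.get("duration")),
                description=seq.findtext("description", default=""),
                keyword_refs=[k.get("ref") for k in seq.findall("keyword")],
                subtitles={s.get("lang"): s.text for s in seq.findall("subtitle")},
            )
    return sequences

catalog = load_catalog(CATALOG_XML)
```

The player only needs the three values the paper names (document identifier, start time, duration), which is why they are kept as plain fields on each sequence.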
 

Positive points

 
  •  The architecture is scalable for mass market deployment.
  •  The AV content is referred to through the AV Catalog, but not manipulated. This cuts down production costs dramatically, as text is much easier to work with than video.
  •  Translating the subtitles is much cheaper: once synchronization with the video has been done for the first language, the other languages, which follow the same XML structure, do not need to be synchronized again.
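
The saving on subtitle translation can be sketched as follows. This is a minimal illustration, assuming that a subtitle track is a list of timed cues and that a translation supplies one text per cue in the same order; the `Cue` type and the sample texts are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str

def translate_track(synced_track, translated_texts):
    """Build a track in a new language by reusing the existing timings.

    The translation must keep the same structure (one text per cue),
    which is what makes a second synchronization pass unnecessary.
    """
    if len(synced_track) != len(translated_texts):
        raise ValueError("translation must keep the same cue structure")
    return [Cue(c.start, c.end, t) for c, t in zip(synced_track, translated_texts)]

# Timings are set once, for the first language only.
french = [Cue(0.0, 2.5, "Bonjour."), Cue(2.5, 6.0, "Voici les archives.")]
german = translate_track(french, ["Guten Tag.", "Hier sind die Archive."])
```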
 

Difficulties

 
  •  In the XML-to-HTML conversion phase, resolving the various types of links between various types of documents and various media, and encapsulating them within scripts, was rather difficult to implement.
  •  Since the partners had difficulty agreeing on a common vocabulary at the beginning of the project, it was accepted that the thesaurus could evolve separately and would converge at the end of the project. To make this feasible, it was agreed that sequence descriptions would not contain actual keywords but references to keywords in the master thesaurus, so that once the common thesaurus is finalized, the keywords in the sequence descriptions are accurate. Implementing this thesaurus indirection, with its multilingual variants, was also an interesting problem.
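
The thesaurus indirection can be sketched as below. The keyword identifiers, labels and data layout are assumptions for illustration; the point is only that a sequence description stores references, so revising the thesaurus changes what is displayed without touching the descriptions.

```python
def resolve_keywords(refs, thesaurus, lang):
    """Turn keyword references into labels for one language.

    Returns (labels, unresolved) so that references missing from the
    thesaurus, or lacking a label in this language, can be reported.
    """
    labels, unresolved = [], []
    for ref in refs:
        entry = thesaurus.get(ref, {})
        if lang in entry:
            labels.append(entry[lang])
        else:
            unresolved.append(ref)
    return labels, unresolved

# A sequence description holds references only (hypothetical IDs).
sequence_keyword_refs = ["kw12"]

# Draft thesaurus during the project, and a later converged version.
thesaurus_draft = {"kw12": {"en": "demonstration", "fr": "manifestation"}}
thesaurus_final = {"kw12": {"en": "protest march", "fr": "manifestation",
                            "de": "Demonstration"}}

draft_labels, _ = resolve_keywords(sequence_keyword_refs, thesaurus_draft, "en")
final_labels, _ = resolve_keywords(sequence_keyword_refs, thesaurus_final, "en")
```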
 

Reason for Technology Choices

 

Integration

 

The problem

 
The overall complexity to be mastered is considerable, and difficult to handle in terms of resources and schedule. Integration of the different phases is a key issue in such a project. Each phase is implemented by different players, and integrating the various applications within a single phase is already a sore point.
 

The choice

 
Integration is usually achieved in two ways:
  •  sharing APIs
  •  defining exchange formats
 

The solution

 
The solution chosen for Eurodelphes to ease the integration problem was to make XML the standardized pivot format for all the documents exchanged between phases of the project. The reasons for this choice were that XML provides an unambiguous and deterministic way to define an exchange format for data, and makes it possible to check programmatically that the exchanged data conform to the agreed model.
 
Three document types (SGML DTDs) were specified to produce and control data:
  •  Catalog and Description of AV resources model: This model is the basis for one of the most serious MPEG-7 proposals. [More will be said on this subject during the presentation, as the standard is currently being finalized.]
  •  Keyword list exchange model: This document type is for the thesaurus, which controls the vocabulary used by the archive providers to document the videos and by the publishers to search the catalog and create the handbook.
  •  Hypermedia Handbook model: This document type is designed for the multilingual authoring environment, including the definition of cross-document links.
 
Instances are produced in SGML using the above DTDs and exchanged in XML without a DTD.
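
Since instances travel without a DTD, a receiving partner can at least check well-formedness and a few structural rules before processing a delivery. The sketch below is an assumption about what such a check might look like, not the project's tooling; the `sequence` element and its attributes are hypothetical names, and Python's standard library is used here only for well-formedness parsing, not for DTD validation.

```python
import xml.etree.ElementTree as ET

# Attributes every sequence is assumed to carry (hypothetical names).
REQUIRED_SEQUENCE_ATTRS = ("id", "start", "duration")

def check_catalog(xml_text):
    """Return a list of problems; an empty list means the checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    problems = []
    seen_ids = set()
    for seq in root.iter("sequence"):
        seq_id = seq.get("id")
        for attr in REQUIRED_SEQUENCE_ATTRS:
            if seq.get(attr) is None:
                problems.append(f"sequence {seq_id!r} lacks attribute {attr!r}")
        if seq_id is not None:
            if seq_id in seen_ids:
                problems.append(f"duplicate sequence id {seq_id!r}")
            seen_ids.add(seq_id)
    return problems
```

A delivery that fails these checks can be rejected at the boundary between phases, before any partner-specific tool has to deal with it.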
 

Positive points

 
  •  Avoiding the definition and sharing of APIs makes parallel development easier.
  •  A small set of tools can process all documents, given some customization.
  •  XML's Unicode support is useful for multilingual applications.
 

Difficulties

 
  •  Independence between textual information and video stream comes at the cost of a necessary resolution phase
 

Selecting video sequences

 

The problem

 
Searching for a particular video sequence in a mass of video documents is a tedious job. With a simple video player, searching for a short sequence in 10 hours of video may take up to 10 hours.
 
Moreover, in this pedagogical project, the authors want to express their selection criteria without having to view the documents, and to let the tool extract the sequences that meet those criteria.
 

The choice

 
There are two ways to solve the problem of finding information in a video stream.
 
The first way is to extract the information from the video stream itself. In theory this can be done with automatic recognition tools that analyze the video and sound tracks. There is work in progress in this field. However, these tools are still at the research stage: they generally require homogeneous document sets to work with, as well as good-quality tracks. It is possible to recognize video shots automatically, but meaningful sequences generally include several shots, so defining sequences would require aggregating several shots.
 
The second way is to have a human operator write down the information separately. This task involves defining sequences "manually", transcribing the speech track as text, and documenting the content of the images for each sequence.
 
The tool required must be able to play video, to allow the interactive definition of sequences (by placing "time tabs"), and to support writing the text associated with each sequence.
 
Downstream, the users must have another tool capable of reading the information and of associating it with the actual video sequence.
 

The solution

 
The second option was selected because it is robust and because it is based on text, for which the necessary technologies are mature.
 
A first application, the indexer's station, allows the operator to produce the AV catalog, and a second application, the preview browser, allows the publisher's authors to find sequences through textual criteria and preview them. The design of the preview browser is detailed further down.
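
Searching by textual criteria can be sketched as a filter over the catalog entries. This is an illustration only, not the preview browser's implementation; the record layout, identifiers and sample descriptions are hypothetical.

```python
def find_sequences(sequences, keyword=None, phrase=None):
    """Return the IDs of sequences matching all the given criteria.

    keyword: a thesaurus keyword reference the sequence must carry.
    phrase:  text that must appear in the description (case-insensitive).
    """
    hits = []
    for seq in sequences:
        if keyword is not None and keyword not in seq["keywords"]:
            continue
        if phrase is not None and phrase.lower() not in seq["description"].lower():
            continue
        hits.append(seq["id"])
    return hits

# Hypothetical catalog entries, reduced to the fields used for searching.
sample_sequences = [
    {"id": "seq01", "keywords": ["kw12"], "description": "Crowds gather in the street."},
    {"id": "seq02", "keywords": ["kw07"], "description": "Interview with a minister."},
    {"id": "seq03", "keywords": ["kw12", "kw07"], "description": "Minister addresses the crowds."},
]

matches = find_sequences(sample_sequences, keyword="kw12", phrase="minister")
```

The author never opens a video file during the search; only the selected sequences are then previewed.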
 
 

Positive points

 
  •  Searching for video sequences through textual criteria is a more professional approach. The task can be carried out by the authors themselves instead of by picture researchers, and the process is much shorter, cheaper, and less painful.
  •  The interactive catalog browser provides a very economical way to define video sequences as link targets.
 

Defining links

 

The problem

 
Finding a useful video sequence is not enough. It must be referred to accurately, and the reference must be resolved when consulting the final multimedia product.
 

The choice

 
One way to point to a video sequence is to insert in the link definition the information that the Eurodelphes project puts in the catalog, i.e. the link element would carry the video file URI and the start time and duration of the sequence.
 
Another way is to refer to the sequence definition in the catalog. When the user requests a sequence, the query first accesses the sequence definition in the catalog, which in turn, through the file and time identifiers, gives access to the actual video file. This indirection is resolved during the dynamic construction of the HTML page.
 

The solution

 
The second option was chosen because only the ID of the sequence definition needs to be inserted in the book structure; all further information about the sequence is kept in the catalog. For instance, this feature enables the chief editor to decide at a later stage whether the video should be subtitled or not.
 
This choice has a fundamental impact on the work process, as the video resource is never separated from the text that documents it. In such a system a meaningful sequence cannot be a "raw" video file.
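
The indirection through the catalog can be sketched as a simple lookup performed when the page is built. The file names, identifiers and the `SequenceDef` type below are hypothetical; the sketch only shows that the book stores a sequence ID and that a new catalog version can reorganize the video files without any change to the book.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SequenceDef:
    video_uri: str   # where the video file actually lives
    start: float     # seconds from the start of that file
    duration: float  # seconds

def resolve_link(seq_id, catalog):
    """Resolve a book link (a sequence ID) against the current catalog."""
    try:
        return catalog[seq_id]
    except KeyError:
        raise LookupError(f"no sequence {seq_id!r} in catalog") from None

# The book only ever stores sequence IDs.
book_links = ["seq01"]

# Two catalog versions: the second one has reorganized the video files.
catalog_v1 = {"seq01": SequenceDef("video/doc01.mpg", 30.0, 45.0)}
catalog_v2 = {"seq01": SequenceDef("video/part_a.mpg", 0.0, 45.0)}

before = [resolve_link(link, catalog_v1) for link in book_links]
after = [resolve_link(link, catalog_v2) for link in book_links]
```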
 
 

Positive points

 
  •  This architecture reflects the separation between commercial actors. The archive providers' added value is in the catalog. They may see fit to build new versions of it, for example to reorganize the actual video files (in Eurodelphes the average file is 3 minutes long). This can be done without updating the book authored by the publishers.
 

Difficulties

 
  •  A consequence is of course that the catalog must be transmitted along with the handbook until the links are resolved at viewing time, which adds some complexity at the last stage of the production line.
 

Streamlining multimedia production

 

The problem

 
In "traditional" multimedia publishing production lines, the authoring of the content and the development of the user interface are intertwined. This is reflected in, and partly caused by, the most popular multimedia authoring tools.
 
If the authoring of multimedia content is separated from the development of the user interface, there are three benefits:
  •  All classical publishing QA processes (like collaborative authoring and content review) can be applied.
  •  The interface design and implementation and the content authoring can proceed simultaneously, thus shortening the production cycle.
  •  The design may be changed without re-writing the content.
 
Still, there comes a time when the content and the interface must be brought together.
 

The choice

 
One possible solution is to revert to using the native multimedia authoring tools, but then the work of entering the content is done twice. This solution is applied today by one of the partner companies using Director, a widespread tool that offers no import format.
 
The other solution is to rely on a powerful transformation engine to generate the display automatically from the textual information, using presentation templates and style sheets that lay out the text and resolve the links, so that the graphics, images and video sequences are displayed in the right location on the screen at the right time.
 

The solution

 
This last solution was chosen because it is time- and cost-effective, and also because it allows the presentation to be changed without touching the content. Presentation is the multimedia designers' area of work.
 
Separating the content and the display specifications is an old story in the SGML/XML world. But in the multimedia industry, it is all new. Of course the tools and processes involved had to be developed, tested and refined.
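
The separation of content and display can be illustrated with a toy transformation. The project itself uses Dual Prism for this step; the sketch below is not that engine, and the element names and templates are hypothetical. It only shows that the same XML content can be rendered through two different templates without being edited.

```python
import xml.etree.ElementTree as ET
from html import escape

# Content written once by the authors (hypothetical element names).
CONTENT = (
    "<chapter>"
    "<title>Post-war Europe</title>"
    "<para>Text of the lesson.</para>"
    "<para>More text.</para>"
    "</chapter>"
)

# Two presentations maintained by the designers, independent of the content.
PLAIN = {"title": "<h1>{}</h1>", "para": "<p>{}</p>"}
STYLED = {"title": '<h1 class="banner">{}</h1>', "para": '<p class="body">{}</p>'}

def render(xml_text, templates):
    """Render each child element of the root through its template."""
    root = ET.fromstring(xml_text)
    parts = []
    for child in root:
        template = templates.get(child.tag)
        if template is not None:
            parts.append(template.format(escape(child.text or "")))
    return "\n".join(parts)

plain_page = render(CONTENT, PLAIN)
styled_page = render(CONTENT, STYLED)
```

Switching from `PLAIN` to `STYLED` changes every page at once, which is the property that lets interface design and content authoring proceed in parallel.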
 

Demo: Focusing on the Publisher's Module

 
To show some results of this project, we will demonstrate the publisher's module, because it is central to the production line.
 
The role we will play is that of the chief editor, whose principal task is to integrate several sources from authors, graphic artists and archive providers. We will use the following tools:
  •  The Preview Browser, a search and preview engine used to select the desired video resources. This engine takes as input the XML catalog produced upstream by the archive providers.
  •  The XML editor, which is used to write the handbook and insert links to the multimedia resources defined in the catalog.
  •  The Validation Browser, which plays back a multimedia session, the interactive equivalent of the proof-reading phase in the traditional publishing process.


Screenshots: the Validation Browser

 
