
 
 

Achieving Individualized, Timely Web Delivery


 
Eric Skinner
  Product Marketing Manager
  OmniMark Technologies
1400 Blair Place
Ottawa, Ontario K1J 9B8, Canada
Phone: +1 613 745 4242
Fax: +1 613 745 5560
Email: eskinner@omnimark.com
Web: http://www.omnimark.com
 
Biographical notice:
 
Eric Skinner
 
Eric Skinner is Product Marketing Manager at OmniMark Technologies Corp., a maker of leading development software for content management and delivery applications. Eric has been with OmniMark since 1987, first in application development and education, then in sales and marketing. In his years at OmniMark, he has gained considerable experience in the successful commercial deployment of content management and delivery systems. Before joining OmniMark, Eric designed network database systems for Control Data Corporation and for his own consulting firm.
 
ABSTRACT:
 
In the age of the Web, information producers are facing a variety of challenges:
  • Timeliness: Information product lifecycles are shortening; instead of the yearly or 3-month lifecycles of CD-ROMs, new information needs to be on the Web within days, hours, or even seconds.
  • Individualization: Customers are demanding information delivery that is relevant to their individual needs; instead of shipping the same technical manual to thousands of customers, manufacturers want to create dynamic job aids that deliver just the right information, just in time.
  • Integration: MIS, publications groups, marketing, and others must increasingly cooperate to deliver integrated content, where each individualized presentation is an assembly of information components owned by different areas of the publishing company.
  • Content Management: Information components are being re-used and re-assembled to meet the above challenges. Systems need to manage these information components and flow them from authoring through to delivery.
  • Delivery Format Proliferation: The explosion of delivery technologies (from flavors of HTML through client-side XML and "push" channels) means that publishers are faced with choosing to support the lowest common denominator, or building systems that can support multiple delivery formats simultaneously and adapt to new formats as they emerge.
 
This presentation will explore the above challenges, then discuss business and project management approaches to achieving individualized, timely web delivery.
 
Two live production systems will be used for illustration:
 
  • The Wall Street Journal Interactive, the web's most successful online newspaper, delivering individualized content live, and built on an XML/SGML and RDBMS architecture.
  • A technical documentation application, illustrating the improved navigability and user task support possible with dynamic web delivery.
 
The following architecture principles will be discussed from a project management and business benefit perspective:
  • Integration of MIS and Publications tools & techniques: In order to deliver integrated information products, systems must leverage the strengths of both MIS tools (relational databases, forms-based editing) and publishing systems (formatting engines, structured authoring systems, etc.).
  • Microdocument Architecture(tm): OmniMark Technologies' Microdocument Architecture (MDA(tm)) is an extension of the relational database model that permits management of relationships occurring within descriptive information.
  • Media neutrality: MDA allows you to model information independent of the structures of any particular media, providing complete flexibility to support multiple simultaneous output formats now and in the future.
  • Link management: To meet the needs of individual users with diverse requirements, hypertext links need to be generated dynamically, and behave differently depending on the nature of requests.
 
 

Outline

 
  • Evolution in Publishing
  • MIS/Document Integration
  • Present Challenges
  • Meeting the Challenges
  • Case Studies
  • System Components
  • Three roles for XML
  • It's all about Future-Proofing
 
 

Evolution in Publishing

 
  • Books --> CD-ROM --> Web
  • Faster delivery cycles
  • More intense competition
  • Higher number of possible simultaneous delivery formats
  • Individualization becomes possible
  • Higher user expectations
  • More potential for creating better products
 
 

MIS/Document Integration

 
  • Previously distinct, now integrating:
    • Corporate database management (RDBMS)
    • Document management
  • Driver: Web Delivery
  • Web blurs the lines between data and documents
  • Users want quick access to synthesis of various data sources
 
 

The New Challenges

 
  • Timeliness
  • Individualization
  • Delivery Format Proliferation
  • You or your competitors are setting the expectations now
 
 

Challenge 1: Timeliness

 
  • News breaks at 8:00am, story online at 8:01am
  • New safety information? Data in the field now, not next month
  • New data available now
    • as soon as available
    • as soon as approved
 
 

Challenge 2: Individualization

 
  • Topics that match my profile and skill level
  • Synthesis of:
    • relevant generic topics
    • live data sources (e.g. stock quotes)
    • data about me (e.g. what stocks I own)
  • Dynamic navigation
    • Links that work the way I want them to
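
The synthesis described above can be sketched in a few lines. This is a minimal illustration only, assuming a hypothetical `Topic` and `Profile` model, an invented 1-to-3 skill scale, and a plain dictionary standing in for a live quote feed; none of these names come from a production system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    title: str
    subject: str
    level: int  # hypothetical scale: 1 = beginner, 3 = expert


@dataclass(frozen=True)
class Profile:
    interests: frozenset  # subjects the reader has asked for
    level: int            # reader's skill level on the same scale
    holdings: tuple       # ticker symbols the reader owns


def individualize(profile, topics, quotes):
    """Combine generic topics, live data, and data about the reader."""
    relevant = [
        t.title for t in topics
        if t.subject in profile.interests and t.level <= profile.level
    ]
    # Only the quotes for stocks this reader actually owns.
    portfolio = {s: quotes[s] for s in profile.holdings if s in quotes}
    return {"topics": relevant, "portfolio": portfolio}


topics = [
    Topic("Bond basics", "bonds", 1),
    Topic("Options pricing", "options", 3),
    Topic("Tech earnings roundup", "technology", 2),
]
quotes = {"AAA": 10.5, "BBB": 42.0, "CCC": 7.25}  # stand-in for a live feed
reader = Profile(frozenset({"bonds", "technology"}), 2, ("AAA", "CCC"))
page = individualize(reader, topics, quotes)
```

Here `page["topics"]` holds only the two topics that match the reader's interests and level, and `page["portfolio"]` holds only the quotes for the stocks the reader owns.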
 
 

Challenge 3: Delivery Format Proliferation

 
  • Delivery media are constantly evolving
    • Browser wars (Microsoft, Netscape)
    • Browser evolution (IE3, IE4, IE5...)
    • DHTML, Layers, Frames, XML/XSL, text-only
    • PDF, WinHelp
    • CD-ROM
    • Don't forget Print
  • Which to use? Whatever each customer wants.
  • All of them, now and tomorrow
 
 

Meeting the Challenges

 
  • Rethinking the "document"
  • Focus on components
  • Component Synthesis
  • Media-Independence
 
 

Rethinking the Document

 
  • Traditional "book" view of documents:
    • Table of contents often arbitrary
    • Doesn't reflect individual needs
    • Hard to update on granular, real-time basis
    • Vol, Chap, Sec meaningless online
    • People don't want "documents" anymore
 
 

Focus on Components

 
  • "Document" is becoming obsolete
  • So... Don't store documents
    • Don't store pre-packaged information products: database designers know this
  • Microdocument Architecture(tm) (MDA)
  • Store "microdocuments": reusable components
    • XML or SGML encoded
  • Use a database to organize the microdocuments and other data
  • Use database queries to extract relevant data
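
As a rough sketch of what "store microdocuments, not documents" can mean in practice, the following splits a book-style XML file into separately storable components. The `<manual>` and `<topic>` element names and the `id` attribute are assumptions made for this example, not part of MDA itself.

```python
import xml.etree.ElementTree as ET

# A hypothetical "book-style" document that bundles several topics.
MANUAL = """<manual>
  <topic id="t1" subject="hydraulics"><title>Pump check</title><para>Inspect the pump.</para></topic>
  <topic id="t2" subject="avionics"><title>Radio test</title><para>Test the radio.</para></topic>
</manual>"""


def split_into_microdocuments(document_xml, component_tag="topic"):
    """Return {component id: standalone XML string} for each component."""
    root = ET.fromstring(document_xml)
    components = {}
    for element in root.iter(component_tag):
        element.tail = None  # drop whitespace that belonged to the parent
        components[element.get("id")] = ET.tostring(element, encoding="unicode")
    return components


microdocs = split_into_microdocuments(MANUAL)
```

Each value in `microdocs` is a self-contained XML fragment that could be loaded into a database row and reused in any number of delivery products.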
 
 

Component Synthesis

 
  • Synthesize delivery object (web page, printed book, CD-ROM, pager message)
  • Assemble the relevant components
    • XML/SGML microdocuments
    • Relational data
    • Files from file system
  • Determine appropriate links
    • existence, behavior
    • ensure link accuracy
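
The synthesis step might look something like the sketch below, which uses SQLite from the Python standard library as a stand-in for the RDBMS. The table names, the `<xref target="...">` link element, and the `/topic/` URL pattern are all assumptions made for illustration. A link is emitted only when its target exists in the repository, which is one simple way to ensure link accuracy.

```python
import sqlite3
import xml.etree.ElementTree as ET
from html import escape

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE microdoc (id TEXT PRIMARY KEY, body TEXT NOT NULL);
    CREATE TABLE quote (symbol TEXT PRIMARY KEY, price REAL NOT NULL);
""")
db.executemany("INSERT INTO microdoc VALUES (?, ?)", [
    ("story-1",
     '<story><p>Shares rose. See <xref target="story-2">earlier report</xref> '
     'and <xref target="story-9">withdrawn item</xref>.</p></story>'),
    ("story-2", "<story><p>Earlier report.</p></story>"),
])
db.execute("INSERT INTO quote VALUES ('AAA', 10.5)")


def synthesize(doc_id, symbol):
    """Build an HTML fragment from one microdocument plus relational data."""
    (body,) = db.execute(
        "SELECT body FROM microdoc WHERE id = ?", (doc_id,)).fetchone()
    known = {row[0] for row in db.execute("SELECT id FROM microdoc")}
    para = ET.fromstring(body).find("p")
    parts = [escape(para.text or "")]
    for child in para:
        label = escape(child.text or "")
        target = child.get("target")
        if child.tag == "xref" and target in known:
            # Target exists: generate a live link.
            parts.append(f'<a href="/topic/{escape(target)}">{label}</a>')
        else:
            # Target missing: keep the words, drop the link.
            parts.append(label)
        parts.append(escape(child.tail or ""))
    (price,) = db.execute(
        "SELECT price FROM quote WHERE symbol = ?", (symbol,)).fetchone()
    return f"<p>{''.join(parts)}</p><p>{escape(symbol)}: {price:.2f}</p>"


page = synthesize("story-1", "AAA")
```

In this sketch the reference to `story-2` becomes a hyperlink, while the reference to the missing `story-9` is rendered as plain text, so the delivered page never contains a dead link.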
 
 

Media Independence

 
  • Store media-neutral components
    • RDBMS information
    • XML/SGML microdocuments, format-neutral
  • Translate to required delivery format
    • dynamically
    • batch
  • Support multiple formats simultaneously
  • Insulation from media evolution
  • Launching pad for innovation
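
To make the idea of translating one neutral component into several delivery formats concrete, here is a small sketch. The `<topic>`, `<title>`, and `<step>` element names and the two renderers are assumptions chosen for the example; a real system would have a renderer per supported format.

```python
import xml.etree.ElementTree as ET
from html import escape

# One media-neutral microdocument: it says what the content is, not how it looks.
MICRODOC = ("<topic><title>Pump check</title>"
            "<step>Shut off power.</step><step>Inspect the seals.</step></topic>")


def to_html(topic):
    title = escape(topic.findtext("title"))
    items = "".join(f"<li>{escape(s.text)}</li>" for s in topic.findall("step"))
    return f"<h1>{title}</h1><ol>{items}</ol>"


def to_text(topic):
    lines = [topic.findtext("title").upper()]
    lines += [f"{n}. {s.text}"
              for n, s in enumerate(topic.findall("step"), start=1)]
    return "\n".join(lines)


# Supporting a new delivery format means registering one more renderer.
RENDERERS = {"html": to_html, "text": to_text}


def deliver(microdoc_xml, fmt):
    return RENDERERS[fmt](ET.fromstring(microdoc_xml))


html_page = deliver(MICRODOC, "html")
text_page = deliver(MICRODOC, "text")
```

The stored component never changes; only the renderer table grows as delivery formats come and go, which is the insulation from media evolution listed above.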
 
 

Case: Wall Street Journal Interactive Edition

 
  • Most successful online newspaper
    • subscriber & ad revenue
  • Up to date
  • Profile-driven
  • Relevant
  • Worth paying for
 
 

Case: Aircraft Documentation

 
  • Every aircraft is unique
  • Planes bought in batches, one print documentation set for each batch
  • Pilots flip pages looking for relevant information
  • Online prototype:
    • Extract topics (live) for particular aircraft being flown now
    • Identify differences
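
A toy sketch of the two prototype steps follows. The tail numbers, topic names, and revision labels are invented, and the idea of modeling each aircraft as a set of overrides on a batch baseline is an assumption made for illustration, not a description of the actual prototype.

```python
# Hypothetical data: the documentation revision that applies to each topic.
BATCH_BASELINE = {"hydraulics": "rev A", "avionics": "rev A", "de-icing": "rev A"}
AIRCRAFT_OVERRIDES = {
    "C-FABC": {"avionics": "rev B"},
    "C-FXYZ": {"de-icing": "rev C", "cargo door": "rev A"},
}


def topics_for(tail_number):
    """Topics that apply to the particular aircraft being flown."""
    topics = dict(BATCH_BASELINE)
    topics.update(AIRCRAFT_OVERRIDES.get(tail_number, {}))
    return topics


def differences(tail_number):
    """Topics where this aircraft departs from the batch documentation."""
    current = topics_for(tail_number)
    return {name: rev for name, rev in current.items()
            if BATCH_BASELINE.get(name) != rev}


diff = differences("C-FXYZ")
```

Instead of flipping through a batch-wide manual, the pilot of `C-FXYZ` would be shown only the topics in `diff` as deviations from the standard set.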
 
 

System Components

 
  • The repository
  • Search tools
  • Authoring Tools
  • Middleware
  • Delivery Technology
 
 

MDA Repository

 
  • RDBMS Schema
  • XML (or SGML) to describe topic-sized microdocuments
  • Microdocuments organized by RDBMS
  • Meta-data and multidimensional data captured by RDBMS schema
  • Mainstream tools: Oracle, Sybase, MS SQL Server, ...
  • Rapid query processing
  • Object/Relational databases?
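
One possible shape for such a repository is sketched below, using SQLite in place of a production RDBMS. The table and column names and the "section" and "audience" dimensions are assumptions for this newspaper-flavored example; the actual MDA schema is not given here.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE microdoc (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        body  TEXT NOT NULL          -- XML (or SGML) source of the topic
    );
    CREATE TABLE metadata (
        microdoc_id INTEGER NOT NULL REFERENCES microdoc(id),
        dimension   TEXT NOT NULL,   -- e.g. 'section', 'audience'
        value       TEXT NOT NULL
    );
    CREATE INDEX metadata_lookup ON metadata (dimension, value);
""")
db.executemany("INSERT INTO microdoc (id, title, body) VALUES (?, ?, ?)", [
    (1, "Rates hold steady", "<story><p>Rates hold steady.</p></story>"),
    (2, "Chipmaker beats forecast", "<story><p>Chipmaker beats forecast.</p></story>"),
    (3, "How bonds work", "<story><p>How bonds work.</p></story>"),
])
db.executemany("INSERT INTO metadata VALUES (?, ?, ?)", [
    (1, "section", "markets"), (1, "audience", "investor"),
    (2, "section", "technology"), (2, "audience", "investor"),
    (3, "section", "markets"), (3, "audience", "beginner"),
])


def find_titles(section, audience):
    """Titles of microdocuments that match on two metadata dimensions."""
    rows = db.execute("""
        SELECT m.title
        FROM microdoc AS m
        JOIN metadata AS s ON s.microdoc_id = m.id
        JOIN metadata AS a ON a.microdoc_id = m.id
        WHERE s.dimension = 'section'  AND s.value = ?
          AND a.dimension = 'audience' AND a.value = ?
        ORDER BY m.id
    """, (section, audience))
    return [title for (title,) in rows]


matches = find_titles("markets", "investor")
```

The XML bodies stay opaque to the database; ordinary relational joins over the metadata table do the organizing, which is why mainstream RDBMS tools are sufficient.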
 
 

Sample MDA: Newspaper

 
 
 


Searching

 
  • RDBMS indices
  • Text index of microdocuments
  • Key elements from microdocuments extracted to RDBMS fields and indexed there
  • Rarely a need for an SGML/XML element-level index of every element in a microdocument
    • just as it is rare to index every field in an RDBMS
  • Mainstream tools: Fulcrum, Verity, ...
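
The third point above, copying key elements out of the microdocument into indexed relational fields, could be sketched as follows. The `<headline>` and `<company>` element names and the `story` table are assumptions made for the example, with SQLite standing in for the RDBMS.

```python
import sqlite3
import xml.etree.ElementTree as ET

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE story (
        id       INTEGER PRIMARY KEY,
        headline TEXT,            -- copied out of the XML at load time
        company  TEXT,            -- copied out of the XML at load time
        body     TEXT NOT NULL    -- full XML microdocument
    );
    CREATE INDEX story_company ON story (company);
""")


def load(body):
    """Store a microdocument and extract its key elements into columns."""
    root = ET.fromstring(body)
    db.execute(
        "INSERT INTO story (headline, company, body) VALUES (?, ?, ?)",
        (root.findtext("headline"), root.findtext("company"), body))


load("<story><headline>Chipmaker beats forecast</headline>"
     "<company>Acme</company><p>Details.</p></story>")
load("<story><headline>Rates hold steady</headline>"
     "<p>No company named.</p></story>")


def headlines_about(company):
    rows = db.execute(
        "SELECT headline FROM story WHERE company = ? ORDER BY id", (company,))
    return [row[0] for row in rows]


acme = headlines_about("Acme")
```

Only the elements that people actually search on are promoted to columns; everything else stays inside the XML body, where a full-text index can cover it.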
 
 

Authoring

 
  • Forms software for relational fields, workflow
    • PowerBuilder, Visual Basic, Web Forms + Java, ...
  • XML/SGML authoring software for average or complex microdocuments
    • Adobe Frame+SGML, ArborText, more on the way...
  • Mainstream word processors for very simple microdocuments
    • MS Word
    • Caveat!
 
 

Middleware

 
  • Middleware performs:
    • synthesis of RDBMS and XML objects into delivery object (web page, print formatting codes, push feed)
    • workflow automation, system glue
    • MDA loading
  • Tools: XML-savvy development suites
 
 

Delivery Technology

 
  • Anything the user wants
  • Web:
    • IE, Netscape, DHTML, Frames, ...
    • XML/XSL (the XML delivered is produced after synthesis and differs from the source XML)
  • Print:
    • PDF, Word, Frame, Interleaf, Xyvision, Penta, ...
  • CD-ROM
  • Push media
 
 

Three Roles for XML

 
  • Internal corporate data modeling
    • in conjunction with RDBMS
  • Interchange
    • e.g. OFX, ICE, etc.
  • Delivery
    • along with XSL
    • offload formatting to browser after synthesis
  • One company can do all three, with 3 separate schemas
 
 

How will XML be implemented?

 
 
 
"XML will be used in a couple of different ways. One is for data interchange between humans and machines, such as from a Web server to a user's browser. The other is for data exchange between applications, or from machine to machine.
 
"In either case, you'll likely require a three-tiered architecture: a database backend; a middle-tier server, where the business logic acts on the data; and the client, where the data is displayed and processed further. The database can receive information, perhaps already XML-formatted, from multiple data sources. The middle tier can then pull together the data and publish it to the final-presentation tier."
 
Trisha Gorman, CNET, 10-Mar-98, http://www.cnet.com/Content/Builder/Authoring/Xml20/
 
 

It's About Future-Proofing

 
  • User expectations are evolving fast
  • Meeting challenges means:
    • component-driven
    • neutral repository
  • Success means:
    • higher revenue, customer satisfaction, customer retention
  • Thrive on new opportunities with Microdocuments, XML, and RDBMS
