
Abstract Datatype
Multimedia Document Structured Query
Spatial and Temporal Datatype
 XML Schema 
 

Spatial/temporal datatypes: an approach to specifying and querying multimedia objects and scheduled structures in XML documents
Peiya Liu
Senior Member of Technical Staff
Siemens Corporate Research, Inc., 755 College Road East
Princeton, New Jersey 08540 USA
Phone: +1 609 7343349 Fax: +1 609 7346565 email: pliu@scr.siemens.com web site: www.scr.siemens.com
 Biography
 Peiya Liu is a project manager and senior member of technical staff in the Multimedia Documentation Program at Siemens Corporate Research, Inc. He received his PhD in computer science from the University of Texas at Austin in 1986. He has served as a program committee member and spoken at many industrial and academic conferences. He is also a technical program co-chair for the 2000 IEEE International Conference on Multimedia and Expo. He is currently the standards section editor of IEEE Multimedia magazine and serves on the editorial board of the Kluwer journal Multimedia Tools and Applications.
Liang H. Hsu
Manager and Distinguished Member of Technical Staff
Siemens Corporate Research, Inc., 755 College Road East
Princeton, New Jersey 08540 USA
Phone: +1 609 7346521 Fax: +1 609 7346565 email: lh@scr.siemens.com
 Biography
 Liang H. Hsu is a distinguished member of technical staff and head of the Multimedia Documentation Program at Siemens Corporate Research, Inc. Prior to joining Siemens, he worked for Digital Equipment Corporation in field service support. He has many years of industrial experience in SGML/XML technology and business talks. His current interests include conversion of legacy documents into SGML, SGML-based document composition and hyperlinking for complex products, multimedia document delivery mechanisms, and browsing and navigation support for service-related applications.
 Abstract
 Many useful XML applications require a smooth integration of time- and space-dependent media objects and structures in XML documents. XML document tree structures offer only limited support for the spatial and temporal relationships needed to query multimedia objects. However, these relationships can be specified based on spatial and temporal datatypes. The XML Schema: Datatypes framework opens up an opportunity to explore this dimension in a new way, and this paper presents this new perspective on specifying and querying multimedia objects and structures in such a framework. This abstract datatype approach offers potential advantages in query processing of multimedia objects and structures in XML.
 

Introduction

 Many useful XML applications require multimedia objects and structures in documents. These multimedia document applications span industries: electronic product manuals in heavy industry, web TV programs in entertainment, e-commerce web documents, geographic information systems, etc. In a multimedia document, the content may include both static/spatial media (such as text, graphics, drawings and images) and time-based media (such as video, audio and animation). The media components and content can be further organized into three major document structures: hierarchical, hyperlinked, and scheduled (including both temporal and spatial). The scheduled structure plays an important role in organizing and accessing space- and time-dependent media objects.
 Proposed document query languages focus on hierarchical and hyperlinked structures, which are mainly used for organizing textual information. Multimedia documents usually contain non-textual media objects in spatial and temporal relationships, which are non-hierarchical. These relationships cannot be modeled effectively in pure hierarchical or hyperlinked structures to support multimedia object retrieval. XML document tree structures are mainly used to model parent/child and sibling relationships of document elements. They can effectively support hierarchical and sequence queries, but are not appropriate for spatial/temporal queries of multimedia objects in XML documents. Integrating multimedia objects into document models for query requires a scheme to specify spatial/temporal structures and constraints. The scheme affects the efficiency and effectiveness of spatial and temporal information retrieval and processing. Currently, there is no standard way to specify these spatial and temporal media objects and structures in XML.
 In this paper, we propose a spatial/temporal datatype scheme based on abstract data types (ADT) to specify scheduled structures and to query multimedia objects in XML documents. Examples of spatial datatypes are points, polylines, areas, etc. Examples of temporal datatypes are instants, intervals, periods, etc. Spatio-temporal datatypes can also be defined by combining spatial and temporal datatypes into composite ones, such as "changing area over a period of time". These spatial/temporal datatypes are used to structure (or schedule) multimedia objects in documents. Based on abstract data types, many spatial and temporal operators, such as inside, nearby, before and after, can be defined for querying multimedia objects in scheduled structures with efficient indexing support.
 This ADT approach has several advantages. It can be formalized within the XML Schema Part 2: Datatypes framework. It is extensible: basic spatial/temporal datatypes and operators can be composed into composite ones. The spatial and temporal relationships of multimedia objects in documents can be specified by using spatial and temporal datatypes and their operators in queries. Furthermore, spatial and temporal query operators can be implemented efficiently, since indexing techniques for optimizing query processing are often based on datatypes.
 

The spatial/temporal datatype approach

 Multimedia objects in XML can be specified as XML elements with spatial and temporal datatypes. These spatial and temporal element datatypes can be formalized within the new W3C XML Schema development [XML Schema Part 1: Structures], particularly the datatype part [XML Schema Part 2: Datatypes]. The spatial and temporal relationships are derived from element datatypes and their associated operations rather than from hierarchical relationships between elements. Examples of datatype operations are spatial distance operations, spatial direction operations, and temporal order operations. In this way, multimedia object queries can be specified based on spatial and temporal relationships. A similar technique for specifying moving objects in relational databases was proposed in [Erwig 99]. In the following, we give an example of a video document along with its XML schema to illustrate our proposed document query language, MMDOC-QL, and the spatial and temporal element datatypes.
 

A structured video document

 A video document can consist of video segments corresponding to scenes in a video. Each scene consists of video objects appearing in a sequence of shots. In this industrial video, each scene highlights certain gas turbine locations, described as video objects, to show maintenance and service operations. Shots mark the video frames in which video objects undergo significant motion changes. This video document can be generated automatically by our video AIU extractor, which is based on advanced scene change detection and video segmentation techniques. The video is shown in the figure "An industrial video" below.
 
<xsd:schema xmlns:xsd="http://www.mymind.com/VideoDocSchema">
  <xsd:element name="videodoc">
    <xsd:complexType>
      <xsd:element name="videoseg" minOccurs="1" maxOccurs="*">
        <xsd:complexType>
          <xsd:element name="videoAIU" minOccurs="1" maxOccurs="*">
            <xsd:complexType>
              <xsd:element name="shot" minOccurs="1" maxOccurs="*">
                <xsd:complexType>
                  <xsd:element name="area" type="region"/>
                  <xsd:element name="frame" type="xsd:integer"/>
                </xsd:complexType>
              </xsd:element>
              <xsd:attribute name="id" type="xsd:ID"/>
            </xsd:complexType>
          </xsd:element>
        </xsd:complexType>
      </xsd:element>
    </xsd:complexType>
  </xsd:element>

  <xsd:complexType name="point">
    <xsd:element name="x" type="xsd:integer"/>
    <xsd:element name="y" type="xsd:integer"/>
  </xsd:complexType>
  <xsd:complexType name="region">
    <xsd:element name="loc" type="point" minOccurs="3" maxOccurs="*"/>
  </xsd:complexType>

  <xsd:element name="mousepos" type="point"/>
  <xsd:element name="focusarea" type="region"/>
</xsd:schema>

<videodoc>
  <videoseg>
    <videoAIU id="object01">
      <shot>
        <area>
          <loc><x>254</x><y>161</y></loc>
          <loc><x>254</x><y>270</y></loc>
          <loc><x>370</x><y>270</y></loc>
          <loc><x>370</x><y>161</y></loc>
        </area>
        <frame>1</frame>
      </shot>
      <shot>
        <area> ...</area>
        <frame>66</frame>
      </shot>
      ...
    </videoAIU>
    <videoAIU id="object02">... </videoAIU>
    ...
  </videoseg>
  <videoseg>
    <videoAIU id= ...> ...  </videoAIU>
    <videoAIU id= ...> ...  </videoAIU>
    ...
  </videoseg>
</videodoc>
 
An industrial video
 (Left: keyframes in video segments found by the AIU extractor, with regions of interest outlined in red. Right: video object locations in each video segment, with a sequence of shots showing changes in video object motion.)
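 To make the document structure concrete, the following is a minimal sketch of how such a video document could be read into plain data structures. It is not part of the AIU extractor: the function name load_shots and the dictionary layout are assumptions made for illustration, and the instance keeps only the first shot of object01 because the remaining content is elided in the example above.

```python
import xml.etree.ElementTree as ET

# A complete, hypothetical instance: only the first shot of object01 is taken
# from the example document; the elided parts are simply left out.
VIDEO_XML = """
<videodoc>
  <videoseg>
    <videoAIU id="object01">
      <shot>
        <area>
          <loc><x>254</x><y>161</y></loc>
          <loc><x>254</x><y>270</y></loc>
          <loc><x>370</x><y>270</y></loc>
          <loc><x>370</x><y>161</y></loc>
        </area>
        <frame>1</frame>
      </shot>
    </videoAIU>
  </videoseg>
</videodoc>
"""


def load_shots(xml_text):
    """Return {object id: [(frame number, [(x, y), ...]), ...]}."""
    root = ET.fromstring(xml_text)
    shots = {}
    for aiu in root.iter("videoAIU"):
        entries = []
        for shot in aiu.findall("shot"):
            # A region is the ordered list of its <loc> points.
            region = [(int(loc.findtext("x")), int(loc.findtext("y")))
                      for loc in shot.find("area").findall("loc")]
            entries.append((int(shot.findtext("frame")), region))
        shots[aiu.get("id")] = entries
    return shots


shots = load_shots(VIDEO_XML)
print(shots["object01"][0])
```

 Each shot becomes a (frame, region) pair, which is the shape the spatial operators discussed later need as input.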
 

A brief introduction to MMDOC-QL

 MMDOC-QL is our proposed multimedia document query language for structured information retrieval. An example query is "find the ids of all video objects that appear at the mouse click position (x0, y0) in a shot".
 
GENERATE: <List>%objectnum</List>
FROM:     video.xml
PATTERN:  {"object"[0-9][0-9]*/%objectnum};
          {<mousepos><x>x0</x><y>y0</y></mousepos>/%mpos};
CONTEXT:  {(<videoAIU> with id=%objectnum) containing
           {<area>/&reg}
           and POINT-INSIDE(&reg %mpos)};
 MMDOC-QL has four clauses. The GENERATE clause describes the final result documents. The FROM clause names the source documents to query. The PATTERN clause describes the domains of logical variables as regular expressions or document elements. The CONTEXT clause describes document element constraints as first-order logical expressions. A logical expression consists of primitive document path expressions and element datatype expressions, including spatial and temporal datatype operators.
 The PATTERN clause describes the domains of logical variables. There are two kinds of logical variables: string and element. By default, the domain of a string variable is the set of allowable strings in tag names, tag attributes or tag values of the documents named in the FROM clause. The domain of an element variable is the set of allowable elements of those documents. The domains bound the values that logical variables may take when satisfying the first-order logical expressions in the CONTEXT clause. Free variables are marked with "%". A free variable carries a "for all" quantifier in the CONTEXT clause. In the above example, "%objectnum" and "%mpos" are free variables. A variable marked with "&" is a bound variable carrying a "there exists" quantifier in the CONTEXT clause. In the above example, the element variable "&reg" denotes the existence of an <area> element satisfying the document path expression "(<videoAIU> with id=%objectnum) containing <area>".
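 As an illustration of how a PATTERN entry restricts the domain of a string variable, the sketch below filters candidate strings with a Python regular expression equivalent to "object"[0-9][0-9]*. The helper string_domain and the candidate list are hypothetical; the paper does not prescribe how domains are computed.

```python
import re

# Python spelling of the pattern "object"[0-9][0-9]* from the PATTERN clause.
OBJECT_ID = re.compile(r"object[0-9][0-9]*")


def string_domain(candidates, pattern):
    """Keep the candidate strings that the pattern matches in full."""
    return [s for s in candidates if pattern.fullmatch(s)]


# Hypothetical strings collected from tag names, attributes and tag values.
candidates = ["object01", "object02", "videoAIU", "object", "66"]
print(string_domain(candidates, OBJECT_ID))
```

 Only strings with at least one digit after "object" remain in the domain of %objectnum, so "object" alone is rejected.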
 A document path expression is a logical statement specifying document element constraints. The constraints are specified by using element path relationships: the parent/child relationship, the sibling relationship, and tag attributes. Parent/child relationship constraints are described by the keywords inside, directly inside, containing, directly containing, etc. Sibling relationship constraints are described by the keywords before, immediately before, after, immediately after, sibling, immediately sibling, etc. Element attribute constraints are described by the keyword with. In the above example, the document path expression "(<videoAIU> with id=%objectnum) containing {<area>/&reg}" constrains the element "<videoAIU>" through its tag attribute id and its child relationship to the element "<area>". "&reg" is a variable denoting the existence of an "<area>" element that satisfies the path expression.
 Element datatype expressions describe arithmetic expressions over datatype operations, including spatial/temporal datatype operations. Note that aggregation functions can be viewed as a special case of these datatype operations since they operate on input data of the real number datatype. The spatial and temporal datatype operations are stereotypical functional computations of temporal and spatial relationships, such as SIZE, DISTANCE, DIRECTION, COVER, or TIME-BEFORE. In the above example, POINT-INSIDE(element1: region element2: point) is a spatial operation that returns a value of boolean datatype: "true" if the point is inside the region, and "false" otherwise. The details of spatial and temporal datatype operations are addressed in the next section.
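 The following sketch shows one way POINT-INSIDE and the example query could be evaluated. It assumes a region is a simple polygon given as an ordered vertex list and uses the standard ray-casting test, which is our choice for illustration and not necessarily the operator implementation in MMDOC-QL. The function names, the dictionary layout and the region of object02 are hypothetical.

```python
def point_inside(region, point):
    """POINT-INSIDE(region, point) by ray casting on a simple polygon.

    region is a list of (x, y) vertices in order; point is an (x, y) pair.
    Points lying exactly on the boundary are not treated specially.
    """
    px, py = point
    inside = False
    n = len(region)
    for i in range(n):
        x1, y1 = region[i]
        x2, y2 = region[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            # x coordinate at which the edge crosses that line
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def objects_at(doc, mpos):
    """Ids of objects with at least one <area> containing mpos.

    The comprehension ranges over every candidate id (free variable
    %objectnum); any() plays the role of the bound variable &reg.
    """
    return [obj_id for obj_id, regions in doc.items()
            if any(point_inside(reg, mpos) for reg in regions)]


doc = {
    "object01": [[(254, 161), (254, 270), (370, 270), (370, 161)]],
    "object02": [[(10, 10), (10, 50), (60, 50), (60, 10)]],  # made-up region
}
print(objects_at(doc, (300, 200)))
```

 With the mouse position (300, 200), only the region of object01 contains the point, so the query result lists object01 alone.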
 

Specifying multimedia objects as spatial and temporal datatypes

 In general, there are three kinds of datatypes for modeling multimedia objects: spatial, temporal and spatio-temporal. All these datatypes can be formalized as XML element datatypes. Stereotypical spatial and temporal operators can then be defined for specifying scheduled relationships of multimedia objects. We believe that this ADT scheme is general enough to specify multidimensional coordinate spaces such as FCS and event schedules in HyTime documents for multimedia object query and processing. For ease of operator composition, all defined spatial and temporal datatype operators are required to produce outputs of legal datatypes.
 
  • The primitive spatial datatypes are point, polyline and region. Composite spatial datatypes can be constructed from these primitives. Spatial datatypes are used to model geometric data, such as cities as points and routes as polylines. Typical spatial operators are SIZE(element: region), DISTANCE(element1: point element2: point), POINT-INSIDE(element1: region element2: point), EAST-DIRECTION(element1: region element2: region), OVERLAP(element1: region element2: region), etc.
  • The primitive temporal datatypes are instant, interval and period. Temporal datatypes are used to specify multimedia objects in dynamic media such as audio, animation and video. The 13 typical temporal operators [Allen 83] can be defined based on temporal XML datatypes. Some examples are MEET(element1: interval element2: interval), TIME-BEFORE(element1: interval element2: interval), etc.
  • The spatio-temporal datatypes are used to model spatial data changing over a period of time. Examples are geometric objects moving in video clips, the path of moving weather, etc. All spatio-temporal datatypes are composite datatypes. For instance, moving points are constructed from the "point" spatial datatype and the "instant" temporal datatype. Moving regions are constructed from the "region" spatial datatype and the "instant" temporal datatype. Examples of spatio-temporal operators are DURATION(element: moving-point), MAX-COVER-AREA(element: moving-region), and so on.

 All spatial and temporal datatypes can be formalized in the XML Schema framework. Some examples are shown as follows.
     
<xsd:complexType name="polyline">
  <xsd:element name="loc" type="point" minOccurs="3" maxOccurs="*"/>
</xsd:complexType>
<xsd:simpleType name="instant" base="xsd:time">
</xsd:simpleType>
<xsd:complexType name="interval">
  <xsd:element name="start" type="instant"/>
  <xsd:element name="end" type="instant"/>
</xsd:complexType>
<xsd:complexType name="moving-point">
  <xsd:element name="loc" type="point"/>
  <xsd:element name="timestamp" type="instant"/>
</xsd:complexType>
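 A few of the operators named in this section can be sketched as ordinary functions over such datatypes. The definitions below are assumptions made for illustration: instants are modelled as plain numbers (seconds), a region as an ordered vertex list, a moving point as a list of (point, instant) samples, and TIME-BEFORE is read in the strict sense of the "before" relation of [Allen 83]. The paper itself does not fix these details.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Interval:
    start: float  # an "instant", modelled here as seconds
    end: float


def distance(p, q):
    """DISTANCE(point, point): Euclidean distance."""
    return hypot(p[0] - q[0], p[1] - q[1])


def size(region):
    """SIZE(region): polygon area computed with the shoelace formula."""
    n = len(region)
    twice_area = sum(
        region[i][0] * region[(i + 1) % n][1]
        - region[(i + 1) % n][0] * region[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


def meet(a, b):
    """MEET(interval, interval): a ends exactly where b starts."""
    return a.end == b.start


def time_before(a, b):
    """TIME-BEFORE(interval, interval): a ends strictly before b starts."""
    return a.end < b.start


def duration(moving_point):
    """DURATION(moving-point): time spanned by the (point, instant) samples."""
    instants = [t for _, t in moving_point]
    return max(instants) - min(instants)


area = [(254, 161), (254, 270), (370, 270), (370, 161)]
print(size(area))                                   # area of the example region
print(meet(Interval(0, 5), Interval(5, 9)))         # adjacent intervals
print(time_before(Interval(0, 5), Interval(5, 9)))  # no gap between them
```

 Every function returns a value of an ordinary datatype (number or boolean), which is what allows operators to be composed inside a CONTEXT expression.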
    
     

Querying multimedia objects from spatial and temporal relationships

 Complex multimedia object queries can be formed by using spatial and temporal operators to specify the scheduled constraints. In the following example, we specify the query "find all video objects, together with their frame numbers, that appear in a focus area of the video display window".
     
GENERATE: <List>%objectnum %frame-number</List>
FROM:     video.xml
PATTERN:  {"object"[0-9][0-9]*/%objectnum};
          {[0-9][0-9]*/%frame-number};
          {<focusarea> ...</focusarea>/%focus};
CONTEXT:  {(<videoAIU> with id=%objectnum) containing
           (<frame> containing "%frame-number")
           sibling {<area> /%reg}
           and OVERLAP(%reg %focus)};
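 A rough sketch of how this query could be evaluated is given below. As a deliberate simplification, OVERLAP is approximated by testing whether the bounding boxes of the two regions intersect; an exact polygon intersection test would be needed for non-rectangular regions. The function names, the focus area and the region of frame 66 are hypothetical (the example document elides that region).

```python
def bounding_box(region):
    """Smallest axis-aligned box (xmin, ymin, xmax, ymax) around a region."""
    xs = [x for x, _ in region]
    ys = [y for _, y in region]
    return min(xs), min(ys), max(xs), max(ys)


def overlap(reg, focus):
    """Approximate OVERLAP(region, region): do the bounding boxes intersect?"""
    ax1, ay1, ax2, ay2 = bounding_box(reg)
    bx1, by1, bx2, by2 = bounding_box(focus)
    return ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2


def objects_in_focus(shots, focus):
    """(object id, frame number) pairs whose shot area overlaps the focus."""
    return [(obj_id, frame)
            for obj_id, entries in shots.items()
            for frame, reg in entries
            if overlap(reg, focus)]


shots = {
    "object01": [
        (1, [(254, 161), (254, 270), (370, 270), (370, 161)]),
        (66, [(400, 300), (400, 380), (480, 380), (480, 300)]),  # made-up region
    ],
}
focus = [(300, 200), (300, 260), (360, 260), (360, 200)]  # hypothetical focus area
print(objects_in_focus(shots, focus))
```

 Here the <frame> and <area> of a shot are kept together as one (frame, region) pair, which mirrors the sibling constraint in the CONTEXT clause.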
    
     

Related work

 Two types of related work are described here: query languages, and standardization work on spatial and temporal specifications. Based on underlying models and media types, query languages can be generally classified as free-form query, relational structured query and document structured query, as shown in the table below.
     
 The spectrum of information retrieval

Media Types / Data Modeling  Free Forms           Relational Tables                                 Structured Documents
Textual                      Free-Text Retrieval  Relational Structured Query (SQL)                 Document Structured Query (MMDOC-QL)
Non-Textual                  Content-Based Query  Multimedia Database Query (SQL/MM, SQL/Temporal)  Multimedia Document Query (MMDOC-QL)

 MMDOC-QL distinguishes itself from other work by handling multimedia object queries based on spatial and temporal relationships in structured documents. In the following, we describe related standardization work on spatial and temporal specifications and queries.
 ISO HyTime, based on SGML, uses a Finite Coordinate Space (FCS) to define scheduled structures and events. These event schedules are intentionally designed for HyTime document presentation. FCS defines an abstract and system-independent method of specifying spatial and temporal information, separated from the content to be presented, as event schedules in a multidimensional coordinate space. The design motivation is presentation abstraction rather than information retrieval. Indexing support in HyTime for querying spatial/temporal media objects and structures is limited.
 W3C SMIL is based on XML and defines spatial and temporal layouts for SMIL document playout. The layout information relates to media display windows on a screen and to media playing time. Thus, the spatial and temporal structures provided in SMIL also serve presentation purposes rather than a storage representation to be accessed. Furthermore, there are structural differences between the two kinds of representation [Rutledge 98]. Often, presentation forms are not sufficient as a storage representation. Spatial and temporal query processing is often less emphasized in presentation-oriented multimedia specifications.
 SQL/MM and SQL3/Temporal are new ISO standardization projects for extending database query language capability to specify and manage multimedia objects and temporal information in the relational data model. Both focus on integrating time- or space-dependent multimedia objects into relational data models for query. However, multimedia document models impose querying requirements quite different from those of the relational table model, since not only document content but also document structures must be available for retrieval. Query specifications based on relational data models would therefore limit the retrieval capability for document models.
     

Concluding remarks

 Proposed XML document query languages have limitations in querying multimedia document objects that are in temporal or spatial relationships. Most of these languages focus on textual and hierarchical structures and offer little support for temporal and spatial relationships of multimedia objects, which are not hierarchical. In this paper, we tackle these limitations by specifying multimedia objects as spatial and temporal datatypes in XML. The multimedia object relationships can then be specified by using spatial and temporal datatypes and their operators for XML document retrieval.
 The main contributions of this paper are twofold. (1) We provide a method to specify spatial and temporal datatypes in XML for querying multimedia objects. We illustrate several flavors of spatial and temporal datatypes defined as element datatypes, which can be formalized within the XML Schema Part 2: Datatypes framework. Multimedia objects and scheduled structures can therefore be smoothly integrated into XML document models for retrieval. (2) We design a multimedia document query language, MMDOC-QL, along with stereotypical spatial and temporal operators to retrieve multimedia objects in XML documents. Many spatial/temporal indexing methods are currently available for supporting and optimizing query processing over these spatial/temporal datatypes.
Bibliography

[SDQL 96] ISO/IEC 10179:1996, Information Technology - Processing Languages - Document Style Semantics and Specification Language (DSSSL).

[HyTime 97] ISO/IEC 10744:1997, Hypermedia/Time-based Structuring Language (HyTime), Second Edition.

[Rutledge 98] L. Rutledge, L. Hardman, J. van Ossenbruggen and D. C. A. Bulterman, "Structural Distinctions Between Hypermedia Storage and Presentation," in Proc. ACM Multimedia 98, September 1998, pp. 145-150.

[Allen 83] J. F. Allen, "Maintaining Knowledge about Temporal Intervals," Comm. ACM 26(11), 1983.

[Vazirgiannis 98] M. Vazirgiannis, Y. Theodoridis and T. Sellis, "Spatio-Temporal Composition and Indexing for Large Multimedia Applications," ACM Multimedia Systems 6(4), 1998, pp. 284-298.

[Erwig 99] M. Erwig, R. H. Güting, M. Schneider and M. Vazirgiannis, "Spatio-Temporal Data Types: An Approach to Modeling and Querying Moving Objects in Databases," GeoInformatica 3(3), 1999.

[Manolopoulos 00] Y. Manolopoulos, Y. Theodoridis and V. J. Tsotras, Advanced Database Indexing, Kluwer Academic Publishers, 2000.

[SMIL 98] Synchronized Multimedia Integration Language (SMIL) 1.0 Specification, W3C Recommendation, 15 June 1998.

[XML 98] Extensible Markup Language (XML) 1.0, W3C Recommendation, 10 February 1998.

[XML Schema Part 1: Structures] XML Schema Part 1: Structures, W3C Working Draft, 25 February 2000.

[XML Schema Part 2: Datatypes] XML Schema Part 2: Datatypes, W3C Working Draft, 25 February 2000.

[SQL Standardization Projects] SQL Standard Reference Page, http://www.jcc.com/SQLPages/jccs_sql.htm

[Chakraborty 99] A. Chakraborty, P. Liu and L. Hsu, "Authoring and Viewing Video Documents using SGML Structure," 1999 IEEE International Conference on Multimedia Computing and Systems, Florence, Italy, pp. 654-660.

[Liu 99] P. Liu, Y. F. Day and L. H. Hsu, "Automatic Generation of DSSSL Specifications for Transforming SGML Documents into Card-Based Presentations," GCA Markup Technologies 99, PA, USA.

[XML-QL 99] A. Deutsch, M. Fernandez, D. Florescu, A. Levy and D. Suciu, "A Query Language for XML," WWW'99.

[YATL 98] "Your Mediators Need Data Conversion," ACM SIGMOD 1998.

[Lorel 00] S. Abiteboul, P. Buneman and D. Suciu, Data on the Web, Morgan Kaufmann, 2000.

[XQL 98] J. Robie and J. Lapp, XML Query Language, QL'98, http://www.w3c.org/TandS/QL/QL98/

[Del Bimbo 99] A. Del Bimbo, Visual Information Retrieval, Morgan Kaufmann, 1999.
