| Spatial/temporal datatypes |
| An approach to specifying and querying multimedia objects and scheduled structures in XML documents |
| Keywords: Abstract Datatype; Multimedia Document; Structured Query; Spatial and Temporal Datatype; XML Schema |
| Peiya Liu |
| Senior Member of Technical Staff |
| Siemens Corporate Research, Inc., 755 College Road East, Princeton, New Jersey 08540, USA |
| Phone: +1 609 7343349; Fax: +1 609 7346565; email: pliu@scr.siemens.com; web site: www.scr.siemens.com |
| Liang H. Hsu |
| Manager and Distinguished Member of Technical Staff |
| Siemens Corporate Research, Inc., 755 College Road East, Princeton, New Jersey 08540, USA |
| Phone: +1 609 7346521; Fax: +1 609 7346565; email: lh@scr.siemens.com |
| Abstract |
Introduction |
| In this paper, we propose a spatial/temporal datatype scheme based on abstract data types (ADT) to specify scheduled structures and to query multimedia objects in XML documents. Examples of spatial datatypes are points, polylines, and areas; examples of temporal datatypes are instants, intervals, and periods. Spatio-temporal datatypes can be defined by combining spatial and temporal datatypes into composite ones, such as "changing area over a period of time". These spatial/temporal datatypes are used to structure (or schedule) multimedia objects in documents. Because the scheme builds on abstract data types, many spatial and temporal operators, such as inside, nearby, before, and after, can be defined for querying multimedia objects in scheduled structures with efficient indexing support. |
The spatial/temporal datatype approach |
A structured video document |
<xsd:schema xmlns:xsd="http://www.mymind.com/VideoDocSchema">
  <xsd:element name="videodoc">
    <xsd:complexType>
      <xsd:element name="videoseg" minOccurs="1" maxOccurs="*">
        <xsd:complexType>
          <xsd:element name="videoAIU" minOccurs="1" maxOccurs="*">
            <xsd:complexType>
              <xsd:element name="shot" minOccurs="1" maxOccurs="*">
                <xsd:complexType>
                  <xsd:element name="area" type="region"/>
                  <xsd:element name="frame" type="integer"/>
                </xsd:complexType>
              </xsd:element>
              <xsd:attribute name="id" type="ID"/>
            </xsd:complexType>
          </xsd:element>
        </xsd:complexType>
      </xsd:element>
    </xsd:complexType>
  </xsd:element>
  <xsd:complexType name="point">
    <xsd:element name="x" type="xsd:integer"/>
    <xsd:element name="y" type="xsd:integer"/>
  </xsd:complexType>
  <xsd:complexType name="region">
    <xsd:element name="loc" type="point" minOccurs="3" maxOccurs="*"/>
  </xsd:complexType>
  <xsd:element name="mousepos" type="point"/>
  <xsd:element name="focusarea" type="region"/>
</xsd:schema>

<videodoc>
  <videoseg>
    <videoAIU id="object01">
      <shot>
        <area>
          <loc><x>254</x><y>161</y></loc>
          <loc><x>254</x><y>270</y></loc>
          <loc><x>370</x><y>270</y></loc>
          <loc><x>370</x><y>161</y></loc>
        </area>
        <frame>1</frame>
      </shot>
      <shot>
        <area> ... </area>
        <frame>66</frame>
      </shot>
      ...
    </videoAIU>
    <videoAIU id="object02"> ... </videoAIU>
    ...
  </videoseg>
  <videoseg>
    <videoAIU id= ...> ... </videoAIU>
    <videoAIU id= ...> ... </videoAIU>
    ...
  </videoseg>
</videodoc>
|
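| To make the structure of the sample document concrete, the following is a minimal Python sketch that reads a well-formed excerpt of it into (object id, frame, region) triples. The helper names `read_region` and `read_shots` and the use of Python's standard `xml.etree.ElementTree` module are illustrative assumptions, not part of the proposed scheme; the elided parts of the sample are simply left out. |

```python
import xml.etree.ElementTree as ET

# Well-formed excerpt of the sample video document (elided parts omitted).
VIDEO_XML = """
<videodoc>
  <videoseg>
    <videoAIU id="object01">
      <shot>
        <area>
          <loc><x>254</x><y>161</y></loc>
          <loc><x>254</x><y>270</y></loc>
          <loc><x>370</x><y>270</y></loc>
          <loc><x>370</x><y>161</y></loc>
        </area>
        <frame>1</frame>
      </shot>
    </videoAIU>
  </videoseg>
</videodoc>
"""


def read_region(area):
    """Return the <loc> children of an <area> as a list of (x, y) tuples."""
    return [(int(loc.findtext("x")), int(loc.findtext("y")))
            for loc in area.findall("loc")]


def read_shots(root):
    """Yield (object id, frame number, region) for every shot in the document."""
    for aiu in root.iter("videoAIU"):
        for shot in aiu.findall("shot"):
            yield (aiu.get("id"),
                   int(shot.findtext("frame")),
                   read_region(shot.find("area")))


shots = list(read_shots(ET.fromstring(VIDEO_XML)))
print(shots)
```

| Each triple pairs a region value with the object and frame it belongs to, which is the information the spatial operators in the queries below work on. |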
A brief introduction to MMDOC-QL |
| MMDOC-QL is our proposed multimedia document query language for structured information retrieval. An example query is "find the ids of all video objects that appear at the mouse-click position (x0, y0) in a shot": |
GENERATE:<List>%objectnum <List>
FROM: video.xml
PATTERN: {"object"[0-9][0-9]*/%objectnum};
{<mousepos><x>x0</><y>y0</></mousepos>/%mpos};
CONTEXT: {(<videoAIU> with id=%objectnum) containing
{<area>/&reg}
and POINT-INSIDE(&reg %mpos)};
|
| MMDOC-QL has four clauses. The GENERATE clause describes the final result documents. The FROM clause names the source documents to query. The PATTERN clause describes the domains of logical variables, either as regular expressions or as document elements. The CONTEXT clause describes constraints on document elements as first-order logical expressions. A logical expression consists of primitive document path expressions and element datatype expressions, including spatial and temporal datatype operators. |
| The PATTERN clause describes the domains of logical variables. There are two kinds of logical variables: string and element. By default, the domain of a string variable is the set of allowable strings in tag names, tag attributes, or tag values of the documents in the FROM clause, and the domain of an element variable is the set of allowable elements in those documents. The domains bound the values that logical variables may take to satisfy the first-order logical expressions in the CONTEXT clause. Free variables are marked with "%"; a free variable carries a "for all" quantifier in the CONTEXT clause. In the above example, "%objectnum" and "%mpos" are free variables. A variable marked with "&" is a bound variable carrying a "there exists" quantifier in the CONTEXT clause. In the above example, the element variable "&reg" denotes the existence of an <area> element satisfying the document path expression "(<videoAIU> with id=%objectnum) containing <area>". |
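| As a rough illustration of how a PATTERN entry restricts the domain of a string variable, the sketch below renders the pattern "object"[0-9][0-9]* as a Python regular expression and filters candidate strings with it. The mapping to Python's `re` syntax, the helper name `domain`, and the candidate list are assumptions made for this example only. |

```python
import re

# Python rendering of the pattern "object"[0-9][0-9]* for %objectnum
# (assumed mapping: quoted literal followed by character classes).
OBJECTNUM = re.compile(r"object[0-9][0-9]*")


def domain(candidates, pattern):
    """Return the candidate strings that lie in the variable's domain."""
    return [s for s in candidates if pattern.fullmatch(s)]


# Hypothetical attribute values collected from a source document.
ids = ["object01", "object02", "object", "scene7"]
print(domain(ids, OBJECTNUM))  # ['object01', 'object02']
```

| Only strings that match the whole pattern enter the domain, so "object" (no digit) and "scene7" are excluded before the CONTEXT clause is evaluated. |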
| A document path expression is a logical statement that specifies document element constraints. The constraints are expressed through element path relationships: parent/child relationships, sibling relationships, and tag attributes. Parent/child constraints use the keywords inside, directly inside, containing, directly containing, etc. Sibling constraints use the keywords before, immediately before, after, immediately after, sibling, immediately sibling, etc. Attribute constraints use the keyword with. In the above example, the document path expression "(<videoAIU> with id=%objectnum) containing {<area>/&reg}" constrains the element "<videoAIU>" through its tag attribute id and its child relationship to the element "<area>". "&reg" is a variable denoting the existence of an "<area>" element that satisfies the path expression. |
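| The path keywords can be pictured as simple tree predicates. The following Python sketch, built on the standard `xml.etree.ElementTree` module, gives one possible reading of with, containing, and directly containing; the function names and the small test document are hypothetical and do not describe how MMDOC-QL is actually implemented. |

```python
import xml.etree.ElementTree as ET

# Small hypothetical document in the shape of the video example.
DOC = ET.fromstring(
    '<videodoc><videoseg>'
    '<videoAIU id="object01"><shot><area/><frame>1</frame></shot></videoAIU>'
    '<videoAIU id="object02"><shot><frame>7</frame></shot></videoAIU>'
    '</videoseg></videodoc>'
)


def with_attribute(root, tag, name, value):
    """'<tag> with name=value': matching elements at or below root."""
    return [e for e in root.iter(tag) if e.get(name) == value]


def containing(element, tag):
    """'element containing <tag>': descendants at any depth."""
    return [e for e in element.iter(tag) if e is not element]


def directly_containing(element, tag):
    """'element directly containing <tag>': children only."""
    return element.findall(tag)


aiu = with_attribute(DOC, "videoAIU", "id", "object01")[0]
print(len(containing(aiu, "area")))           # 1
print(len(directly_containing(aiu, "area")))  # 0
```

| In this reading, <area> satisfies containing but not directly containing for <videoAIU>, because it sits one level down inside <shot>. |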
| Element datatype expressions describe arithmetic expressions over datatype operations, including spatial/temporal datatype operations. Aggregation functions can be viewed as a special case of these datatype operations, since they operate on input data of real number datatype. The spatial and temporal datatype operations are stereotypical functional computations of spatial and temporal relationships, such as SIZE, DISTANCE, DIRECTION, COVER, or TIME-BEFORE. In the above example, POINT-INSIDE(element1: point element2: region) is a spatial operation that returns a Boolean value: "true" if the point is inside the region and "false" otherwise. The details of spatial and temporal datatype operations are addressed in the next section. |
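| One common way to compute a point-in-region test is the even-odd (ray casting) rule. The Python sketch below uses it as a stand-in for POINT-INSIDE; the paper does not say which algorithm the operator uses, so the choice of algorithm, the function name `point_inside`, and the test points are assumptions. The region is the <area> of object01 from the sample document. |

```python
def point_inside(point, region):
    """Return True if point (x, y) lies inside the polygon given by region.

    Uses the even-odd (ray casting) rule. Points exactly on the boundary
    are not treated specially.
    """
    x, y = point
    inside = False
    n = len(region)
    for i in range(n):
        x1, y1 = region[i]
        x2, y2 = region[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


area = [(254, 161), (254, 270), (370, 270), (370, 161)]
print(point_inside((300, 200), area))  # True
print(point_inside((100, 200), area))  # False
```

| The function returns a Boolean, so its result can be combined with other predicates in a logical expression. |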
Specifying multimedia objects as spatial and temporal datatypes |
| In general, three kinds of datatypes model multimedia objects: spatial, temporal, and spatio-temporal. All of them can be formalized as XML element datatypes, and stereotypical spatial and temporal operators can be defined on them to specify scheduled relationships among multimedia objects. We believe that this ADT scheme is general enough to specify multidimensional coordinate spaces such as FCS and event schedules in HyTime documents for querying and processing multimedia objects. For ease of operator composition, all defined spatial and temporal datatype operators are required to produce outputs in legal datatypes. |
| All spatial and temporal datatypes can be formalized in the XML Schema framework. Some examples follow. |
<xsd:complexType name="polyline">
  <xsd:element name="loc" type="point" minOccurs="3" maxOccurs="*"/>
</xsd:complexType>
<xsd:simpleType name="instant" base="xsd:time">
</xsd:simpleType>
<xsd:complexType name="interval">
  <xsd:element name="start" type="instant"/>
  <xsd:element name="end" type="instant"/>
</xsd:complexType>
<xsd:complexType name="moving-point">
  <xsd:element name="loc" type="point"/>
  <xsd:element name="timestamp" type="instant"/>
</xsd:complexType>
|
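| For readers who think in terms of program types, the datatypes above can be mirrored as small value classes with operators defined on them. The Python sketch below does this for point, interval, and moving-point, and adds a TIME-BEFORE operator. The class names, the mapping of instant to `datetime.time`, and the semantics chosen for `time_before` (the first interval ends strictly before the second starts) are assumptions; the paper names the operator but does not define it here. |

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Point:
    x: int
    y: int


@dataclass(frozen=True)
class Interval:
    start: time  # "instant" is modelled as a time of day (xsd:time)
    end: time


@dataclass(frozen=True)
class MovingPoint:
    loc: Point
    timestamp: time


def time_before(a: Interval, b: Interval) -> bool:
    """Assumed TIME-BEFORE: interval a ends strictly before interval b starts."""
    return a.end < b.start


intro = Interval(time(0, 0, 5), time(0, 0, 10))
scene = Interval(time(0, 0, 12), time(0, 0, 20))
print(time_before(intro, scene))  # True
print(time_before(scene, intro))  # False
```

| Because `time_before` returns a legal datatype (a Boolean), it composes with other operators in the same way the text requires of all spatial and temporal operators. |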
Querying multimedia objects from spatial and temporal relationships |
| Complex multimedia object queries can be formed by using spatial and temporal operators to specify the scheduled constraints. The following example specifies the query "find all video objects, and their frame numbers, that appear in a focus area of the video display window". |
GENERATE:<List>%objectnum %frame-number <List>
FROM: video.xml
PATTERN: {"object"[0-9][0-9]*/%objectnum};
{[0-9][0-9]*/%frame-number};
{<focusarea> ...</focusarea>/%focus};
CONTEXT: {(<videoAIU> with id=%objectnum) containing
(<frame> containing "%frame-number")
sibling {<area> /%reg}
and OVERLAP(%reg %focus)};
|
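| The effect of the query above can be approximated in a few lines of Python. This sketch treats OVERLAP as an intersection test on bounding boxes, which is exact only for axis-aligned rectangles such as the sample <area>; the actual definition of OVERLAP is not given in the text. The second object, the focus area coordinates, and all function names are hypothetical. |

```python
def bounding_box(region):
    """Return (min_x, min_y, max_x, max_y) of a list of (x, y) points."""
    xs = [x for x, _ in region]
    ys = [y for _, y in region]
    return min(xs), min(ys), max(xs), max(ys)


def overlap(region_a, region_b):
    """Approximate OVERLAP: True if the two bounding boxes intersect."""
    ax1, ay1, ax2, ay2 = bounding_box(region_a)
    bx1, by1, bx2, by2 = bounding_box(region_b)
    return ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2


def objects_in_focus(shots, focus):
    """Return (object id, frame) pairs whose region overlaps the focus area."""
    return [(obj, frame) for obj, frame, reg in shots if overlap(reg, focus)]


shots = [
    ("object01", 1, [(254, 161), (254, 270), (370, 270), (370, 161)]),
    ("object02", 1, [(10, 10), (10, 40), (40, 40), (40, 10)]),  # hypothetical
]
focus = [(300, 200), (300, 400), (500, 400), (500, 200)]  # hypothetical
print(objects_in_focus(shots, focus))  # [('object01', 1)]
```

| A linear scan like this is only a sketch; with many shots, a spatial index over the regions would avoid testing every pair. |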
Related work |
| Two types of related work are described here. The first is related to query languages. Based on their underlying models and media types, query languages can be broadly classified as free-form query, relational structured query, and document structured query, as shown in the table below. |
| Media Types / Data Modeling | Free Forms | Relational Tables | Structured Documents |
|---|---|---|---|
| Textual | Free-Text Retrieval | Relational Structured Query (SQL) | Document Structured Query (MMDOC-QL) |
| Non-Textual | Content-Based Query | Multimedia Database Query (SQL/MM, SQL/Temporal) | Multimedia Document Query (MMDOC-QL) |
| MMDOC-QL distinguishes itself from other work by handling multimedia object queries based on spatial and temporal relationships in structured documents. In the following, we describe related standardization work on spatial and temporal specifications and queries. |
| ISO HyTime, which is based on SGML, uses finite coordinate spaces (FCS) to define scheduled structures and events. These event schedules are intentionally designed for HyTime document presentation. FCS defines an abstract, system-independent method of specifying spatial and temporal information, separated from the content to be presented, as event schedules in a multidimensional coordinate space. The design is motivated by presentation abstraction rather than information retrieval, and HyTime offers only limited indexing support for querying spatial/temporal media objects and structures. |
| W3C SMIL is based on XML and defines spatial and temporal layouts for SMIL document playout. The layout information relates to media display windows on a screen and to media playing time. Thus, the spatial and temporal structures provided in SMIL also serve presentation rather than a storage representation to be accessed. Furthermore, the two kinds of representation differ structurally, and presentation forms are often not sufficient for storage representation. Spatial and temporal query processing is often less emphasized in presentation-oriented multimedia specifications. |
| SQL/MM and SQL3/Temporal are new ISO standardization projects that extend database query language capabilities to specify and manage multimedia objects and temporal information in the relational data model. Both focus on integrating time- or space-dependent multimedia objects into relational data models for querying. However, multimedia document models impose querying requirements quite different from those of the relational table model, since not only document content but also document structure must be available for retrieval. Query specifications based on relational data models would therefore limit retrieval capability for document models. |
Concluding remarks |
| Proposed XML document query languages have limitations in querying multimedia document objects that stand in temporal or spatial relationships. Most of these languages focus on textual and hierarchical structures and offer limited support for temporal and spatial relationships among multimedia objects, which are not hierarchical. In this paper, we tackle these limitations by specifying multimedia objects as spatial and temporal datatypes in XML. Multimedia object relationships can then be specified with spatial and temporal datatypes and their operators for XML document retrieval. |
| The main contributions of this paper are twofold. (1) We provide a method to specify spatial and temporal datatypes in XML for querying multimedia objects. We illustrate several flavors of spatial and temporal datatypes defined as element datatypes, which can be formalized within the XML Schema Part 2: Datatypes framework, so that multimedia objects and scheduled structures can be smoothly integrated into XML document models for retrieval. (2) We design a multimedia document query language, MMDOC-QL, along with stereotypical spatial and temporal operators, to retrieve multimedia objects in XML documents. Many spatial/temporal indexing methods are currently available to support and optimize query processing over these spatial/temporal datatypes. |