Dimap : Satellite image metadata standard based on XML   Table of contents   Indexes   Using XML in a Software Diagnostic Tool

Dodds, David
Open Text
 Princeton 
 USA 
 
David Dodds
 Senior Analyst
Open Text
 5 Independence Way Princeton (New Jersey)  USA
Email: ddodds@opentext.com
 Biography
 David Dodds is presently at Open Text where he is engaged in bringing technologies to clients, including XML based solutions, XSL, SVG, RDF. He has done research at Northern Telecom (Nortel) in areas of graphical interfaces and text understanding. David is the author of numerous papers dealing with Robotics and AI.
 

Introduction

 Content-aware, intelligent web graphics in web browsers encourage financial and other users to do graphically based "what if" presentations intuitively, instantly, and without external programming, spreadsheets, or OLE. Such graphics may themselves be queried not only "by value" but also "by content" (using NLP-like constructions, for example pattern or value matching, or "inexact matching").
 

Part One

 Almost everyone is aware that the web has had static pictures available for several years. These pictures were most often made available via JPEG and GIF image files. Such images have been supported since the earlier versions of web browsers, particularly since version 3. Later versions of these browsers provided various means of animating once static pictures. The latest versions permit scripting of the "actions" (relocation, etc.) with which images may be imbued. "Dynamic HTML" (HTML 4) has provided many capabilities for web images that people only dreamed of in the era of version 2 and 3 browsers.
 Generally speaking, both the static images and the animated images on the web today are constructed using file formats which contain binary data structures which describe an ordered collection of pixels (picture elements). While there are primitive provisions for including a brief amount of text and numeric information as non-displaying annotation within the pixel data structure, there are very few, if any, applications which make any pragmatic use of this capability.
 Web images, even now, are basically digital paintings. They "do" absolutely nothing, they just are. Applications scan the pixel information and display or print an interpretation of the picture information. There is no "picture", per se, in a web image file. Such a file is simply an ordered description of colored dots.
 In the history of the web this was sufficient, for the intention of these "image files" was only to capture a numeric rendition of the rasterization of a real or computer generated image.
 

Times Change

 As the usage of the internet increasingly moves toward a meta-data based "(self) awareness", it is necessary to provide technology which brings meta-processing to "picture" information as well as text. XML provides a means of packaging text (and numbers) with meta-data containing "element tags". These "tags" can be processed as organized collections via DTDs, DCDs, XML Schema, Business Rules, etc.
 The Securities and Exchange Commission, for example, has a DTD which describes "10Q" and "10K" documents. Companies may submit reports authored with visual editors that use the DTD information to aid in authoring.
 Organizations in the financial domain, such as securities companies which author a myriad of paper-based report types, may use DTDs to produce XML files which conform to a report type. Many of these reports consist of a collection of associated number values and paragraphs of text. This information is presented to the report reader as "tables" and "discussion". Some report types use the static image technology of JPEG and GIF to display "business graphs", and also tables when it is important to prevent unauthorized alteration of content.
 These reports were initially published as paper based documents. The only input device intended for the information contained in these reports was the human eyeball. If the information contained in the paper reports was to be used in a computer, it was generally "transcribed" via keyboard. With the advent of publishing these same report types via the web, certain things became clear. While it was faster to send and receive these reports via the web than in paper based form, it was also discovered that technology savvy recipients could use image editors to alter the JPEGs and GIFs, and that the increased speed of availability led web based report recipients to want the report content "automatically placed" into an information repository of one kind or another.
 Documents constructed solely with HTML were found not to lend themselves to being automatically "scanned" and then placed into the repository of choice. Early adopters used XML to "mark up" or "tag" the reports so that at least the larger grained conceptual sections of these reports could be "detected" by XML software and automatically "reposited". Adherence to newly evolving standards in the use of element ("tag") names increased the likelihood that a given "report type" could actually be read and transferred properly by a piece of software never originally intended to input this item.
 It was again the early adopters of XML on the web who realized that, although a stylesheet technology such as XSL could obviously provide "multiple views" of a single XML based financial document, these views seemed to be limited to textual and / or relatively simple numerical information. Yes, an XSL stylesheet could change the font, color, location on the page, etc., and it could "render" a given sub-tree of numeric information as variously appearing tables, but stylesheets such as XSL did not address some of the items of great interest to the financial community, which will be addressed next.
 The problem of unauthorized alteration of content can be addressed by using a digital signature, computed over a message digest such as MD5, to "add a tamper seal" to the XML document, so that any alteration becomes detectable. This could be done and still leave the document in plain text ASCII. If further security was required then the material between the tags could be encrypted. Some folks looked to PGP for this.
 By "signing" (and possibly encrypting) the number values in the subtrees referred to above it was possible to replace JPEGs and GIFs of tables with a stylesheet based presentation layer.
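 As a rough illustration of such a tamper seal, the following Python sketch seals the canonical form of a small report fragment and then checks it. It is a sketch under stated assumptions, not the method used in any actual reporting system: the element names and the key are hypothetical, and a keyed HMAC with SHA-256 is swapped in for the MD5 digest and PGP signature mentioned above, because it needs only the standard library. An HMAC is a shared-secret seal; a true digital signature would use a public/private key pair.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

# Hypothetical report fragment; the element names are illustrative only.
REPORT = """<report>
  <table id="roe">
    <row period="Q1">11.2</row>
    <row period="Q2">12.5</row>
  </table>
</report>"""


def seal(xml_text: str, key: bytes) -> str:
    """Return a hex seal computed over the canonical (C14N) form of the XML."""
    # Canonicalizing first means harmless whitespace changes do not break the seal.
    canonical = ET.canonicalize(xml_text, strip_text=True)
    return hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(xml_text: str, key: bytes, expected: str) -> bool:
    """Check the document against a previously computed seal."""
    return hmac.compare_digest(seal(xml_text, key), expected)


key = b"shared-secret-for-illustration"   # assumption: key exchanged out of band
tag = seal(REPORT, key)
tampered = REPORT.replace("12.5", "21.5")  # someone edits a number

print(verify(REPORT, key, tag))    # True
print(verify(tampered, key, tag))  # False
```

 The document itself stays in plain text; only the short seal travels alongside it.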
 

SVG and webCGM

 This paper carries the evolution of the presentation aspect of XML data further. By using the W3C Recommendation webCGM, it is possible to display business graphics (and scientific and technical graphics, for that matter). Rather than creating a static JPEG or GIF of business graphics information with a graphics package and then posting these image files to the web, webCGM offers the possibility of posting the data itself and having it drawn/rendered by webCGM. In practice this has two complications. First, webCGM requires data to be in "binary form" (i.e. not ASCII XML). Second, a webCGM aware viewing program must be used to view the webCGM presentations. Presently, there are few, if any, commercial web browsers which provide such a viewer (capability).
 To allow a more flexible arrangement that uses conventional web browsers, we look to the W3C work called SVG, Scalable Vector Graphics. SVG can use ASCII XML unmodified for graphical information input. The same sub-tree of numbers in an XML based report rendered as a table in a web browser by XSL, could be rendered as a business-graph (or scientific or technical graph or diagram) in the same web browser via application of SVG.
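 The idea can be sketched in a few lines of Python: the same sub-tree of numbers that a stylesheet would show as a table is turned into an SVG polyline. This is only an illustration; the `table` and `row` element names, the chart size, and the function names are assumptions made for the example, and a real presentation layer would add axes, labels, and styling.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical numeric sub-tree of an XML based report.
REPORT = """<table id="roe" title="Return on equity">
  <row period="Q1">10</row>
  <row period="Q2">20</row>
  <row period="Q3">15</row>
</table>"""


def table_values(xml_text):
    """Extract the numeric values of the table rows, in document order."""
    root = ET.fromstring(xml_text)
    return [float(row.text) for row in root.iter("row")]


def to_svg_line_chart(values, width=200, height=100):
    """Build an SVG element that draws the values as an xy graph.

    Assumes at least two values and a positive maximum.
    """
    top = max(values)
    step = width / (len(values) - 1)
    # SVG's y axis points down, so larger values get smaller y coordinates.
    points = " ".join(
        f"{i * step:g},{height - v / top * height:g}" for i, v in enumerate(values)
    )
    svg = ET.Element(f"{{{SVG_NS}}}svg", {"width": str(width), "height": str(height)})
    ET.SubElement(
        svg,
        f"{{{SVG_NS}}}polyline",
        {"points": points, "fill": "none", "stroke": "black"},
    )
    return svg


ET.register_namespace("", SVG_NS)
chart = to_svg_line_chart(table_values(REPORT))
print(ET.tostring(chart, encoding="unicode"))
```

 The report data is never converted into a picture file; the graph is derived from the numbers each time it is rendered.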
 A securities report, published onto the web, secured by digital signature, could utilize its presentation layer to not only provide various "views" of the text and "tables" but could also render any of these "tables" (a stylesheet effect) as an actual (quality) business-graph (in real time!), without any programming required.
 "Tables" could be clicked on with a mouse by financial users and they would be instantly redrawn as business graphs. (Remember that there is no picture file in the XML document for these.) Don't like the xy graph that occurred from the instant re-rendering of the stylesheet based table? Click on it and it instantly redraws itself as a bar chart, a pie chart, or whatever you signify. (You could always maintain a "preference" or "profile" for automatic operation of the presentation layer.)
 You did not copy and paste the information from the rendered table into "Excel". Do you have an (XML based) securities report that has two tables of numbers with text flowed around them? Wouldn't it be easier to compare those two tables if you had an xy graph of each table laid one on top of the other? Click the first table, click the second table. SVG instantly redraws the two tables as a single xy chart with both graphs in the same scale. Want to see just the differences, at each point, between the two graphs? One click and the graph is instantly redrawn as a single xy graph which shows the difference between the other two graphs! Want to see that single xy graph which was the rendering of the table, but "seasonally adjusted" or maybe multiplied by some other weighting xy graph? Click. It is done instantly, in your browser: no program writing, no cut and paste, no Excel (no OLE).
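 The arithmetic behind these one-click operations is simple. The following Python sketch, with hypothetical function names and made-up sample values, shows the three operations just described: finding a common scale for two series, taking their pointwise difference, and multiplying a series by a weighting series.

```python
def common_scale(*series):
    """Return the (min, max) y range that fits every series on one chart."""
    flat = [value for one in series for value in one]
    return min(flat), max(flat)


def difference(first, second):
    """Pointwise difference of two series with the same number of points."""
    if len(first) != len(second):
        raise ValueError("series must have the same number of points")
    return [a - b for a, b in zip(first, second)]


def weighted(series, weights):
    """Multiply each point by the matching weight (e.g. a seasonal factor)."""
    if len(series) != len(weights):
        raise ValueError("series and weights must have the same length")
    return [value * weight for value, weight in zip(series, weights)]


# Made-up values standing in for two tables in a report.
table_one = [10.0, 12.0, 9.0, 14.0]
table_two = [8.0, 13.0, 9.0, 11.0]

print(common_scale(table_one, table_two))            # (8.0, 14.0)
print(difference(table_one, table_two))              # [2.0, -1.0, 0.0, 3.0]
print(weighted(table_one, [1.0, 0.5, 2.0, 1.0]))     # [10.0, 6.0, 18.0, 14.0]
```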
 

Graphical Metadata

 SVG uses metadata too. This means that XML Namespaces, RDF, and similar technologies can be used. Because an SVG file can contain (and actively use) metadata just as a textual or numeric XML file can, it is possible to apply all the very powerful advantages that meta-data and meta-meta-data bring. This includes having a machine readable description, such as RDF, contained in the SVG graphic file itself. By virtue of such technologies as Namespaces and RDF, a given SVG graphic "knows" what its content is, down to each line and curve constituting the SVG "picture". JPEG and GIF users can only dream about content-aware processing; users of SVG XML documents can put this technology to real-world use in their company or institution. The SVG definition provides many additional features not mentioned in this paper.
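 A minimal sketch of what this can look like follows. The SVG document below carries an RDF description in its `metadata` element, and a short Python function reads it back. The choice of Dublin Core properties (`dc:title`, `dc:subject`) and the `#roe-chart` identifier are assumptions made for the example, not something the paper prescribes.

```python
import xml.etree.ElementTree as ET

NS = {
    "svg": "http://www.w3.org/2000/svg",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical SVG graphic that describes its own content in RDF.
SVG_DOC = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="100">
  <metadata>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="#roe-chart">
        <dc:title>Return on equity</dc:title>
        <dc:subject>ROE</dc:subject>
      </rdf:Description>
    </rdf:RDF>
  </metadata>
  <polyline id="roe-chart" points="0,50 100,0 200,25" fill="none" stroke="black"/>
</svg>"""


def describe(svg_text):
    """Map each described resource to its metadata properties."""
    root = ET.fromstring(svg_text)
    about_attr = f"{{{NS['rdf']}}}about"
    result = {}
    for description in root.iterfind("svg:metadata/rdf:RDF/rdf:Description", NS):
        # Strip the "{namespace}" prefix ElementTree puts on each tag name.
        properties = {child.tag.split("}")[1]: child.text for child in description}
        result[description.get(about_attr)] = properties
    return result


print(describe(SVG_DOC))
# {'#roe-chart': {'title': 'Return on equity', 'subject': 'ROE'}}
```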
 

Historical Graphics Systems Have Been Unaware

 Historically, graphics systems simply recorded sequences of graphics commands or created raster or bit-mapped fixed files. Even with a capability to store brief amounts of text, these historical graphics systems were completely devoid of any capacity to maintain associated or correlated information about the content itself. SVG has been designed to be not only wonderfully flexible in its graphical capabilities; it also has very important capabilities for functional storage of text, which can be associated with graphical information at various levels of conceptual granularity.
 One obvious usage is as a title of a graphics object. Since the text may contain markup it is possible to output impressive titles via applying stylesheet processing to this text. This is one of the more obvious uses of marked up text. Similarly, the text could be a caption, with sophisticated rendering applied, again via stylesheet.
 By logical and judicious choice of words (perhaps using a corporate dictionary / thesaurus of official terms), titles and captions may be defined and used to convey meaning in a stronger sense than just any arbitrary title or caption. While, for consumption by human eyeball, it is sufficient to employ almost any set of words to make titles and captions, when these are constituted by members of "controlled vocabularies" and employ the phrasal grammars associated with them, we benefit from a powerful additional capability.
 In combination with appropriate markup (in the texts of titles and captions), controlled vocabularies / phrasal grammars bring the prospect of conveying meaning not only to human eyes but also to computer programs which input the same text. Application of XML Namespaces, Schema, and RDF type technologies provides a "mechanical" means of defining the "meaning" of words, individually and in phrasal collections.
 When, additionally, terms from domain standards such as FIXML, SWIFT and so on are used, it is entirely feasible for a program to scan the SVG structure and meaningfully determine the components therein.
 For example, when an SVG data structure is scanned and text is found with markup-based indications that it is a caption of a table, and that the terminology is from the XYZ international standard, then the program performing the scanning has "discovered" (by searching) not only the numerical (and other) values depicted BY THE GRAPHIC but also, importantly, the "meaning" of those numbers. (Generally, the best way to explain the meaning of text to a computer is to "explain HOW IT IS USED". In many cases "how text is used" can be conveyed to a computer by creating a program which is able to determine the grammar of a sequence of words (a phrase or short sentence) and to develop (by simple search) the context it is within.)
 Being able to search pages by meaning is new and powerful, for it permits searches or queries such as "compare the return on equity values for ABC company with those of DEF company".
 Also, since the meaning of a graphic, in addition to the "values" which it graphically depicts, is discoverable both by eye and by program, it is possible for both "discoverers" to thereby know what operations may be (sensibly) performed on those values.
 It allows a program to examine an xy-graph that a user has just clicked on in his browser and to "recognize" that among the operations that can (sensibly) be performed on this picture is computing the same values and displaying them as a bar-chart. This "perception" can be used to dynamically limit the names in the user's drop down menu to those which "make sense" with the meaning of the clicked on graphic.
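 A minimal Python sketch of this menu limiting follows. The mapping from kinds of graphic to operations, and the operation names themselves, are hypothetical; in a real system the kind would be discovered from the graphic's meta-data rather than passed in as a string.

```python
# Hypothetical table of which operations "make sense" for each kind of graphic.
OPERATIONS = {
    "xy-graph": ["show as table", "show as bar chart", "overlay", "difference"],
    "bar-chart": ["show as table", "show as xy graph", "show as pie chart"],
    "pie-chart": ["show as table", "show as bar chart"],
}

# Operations that only make sense when two graphics are selected.
NEEDS_TWO = {"overlay", "difference"}


def menu_for(kind, selected_count=1):
    """Return the drop down menu entries for the clicked graphic."""
    entries = OPERATIONS.get(kind, [])  # unknown kinds get an empty menu
    if selected_count < 2:
        entries = [entry for entry in entries if entry not in NEEDS_TWO]
    return entries


print(menu_for("xy-graph"))
# ['show as table', 'show as bar chart']
print(menu_for("xy-graph", selected_count=2))
# ['show as table', 'show as bar chart', 'overlay', 'difference']
print(menu_for("photograph"))
# []
```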
 By defining the constituent attributes of graphical scenarios as logical models it is possible to automatically validate graphics. For example, a bar chart in a report should have: the bars, colors for the bars, a depiction of the x and y axes with labels and reticules, a title and an optional caption. This is something like a logical DTD for a picture rather than for a body of text.
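 One way to picture such a logical model is as a list of required parts that a program checks against the SVG structure. The Python sketch below does this for a bar chart. The `class` attribute values used to mark bars, axes, and labels are a convention invented for this example, and the model covers only some of the parts listed above.

```python
import xml.etree.ElementTree as ET

SVG = "{http://www.w3.org/2000/svg}"

# Hypothetical logical model: (element, class) -> minimum number required.
BAR_CHART_MODEL = {
    ("rect", "bar"): 1,
    ("line", "axis"): 2,
    ("text", "axis-label"): 2,
    ("title", None): 1,
}

# Sample chart; the second bar deliberately lacks a color.
CHART = """<svg xmlns="http://www.w3.org/2000/svg" width="120" height="120">
  <title>Return on equity by quarter</title>
  <desc>Optional caption text.</desc>
  <line class="axis" x1="10" y1="110" x2="110" y2="110" stroke="black"/>
  <line class="axis" x1="10" y1="10" x2="10" y2="110" stroke="black"/>
  <text class="axis-label" x="60" y="120">Quarter</text>
  <text class="axis-label" x="0" y="60">ROE</text>
  <rect class="bar" x="20" y="50" width="20" height="60" fill="steelblue"/>
  <rect class="bar" x="50" y="30" width="20" height="80"/>
</svg>"""


def validate(svg_text, model):
    """Return a list of problems; an empty list means the chart is valid."""
    root = ET.fromstring(svg_text)
    problems = []
    for (tag, cls), minimum in model.items():
        found = [
            elem for elem in root.iter(SVG + tag)
            if cls is None or elem.get("class") == cls
        ]
        if len(found) < minimum:
            label = tag if cls is None else f"{tag}.{cls}"
            problems.append(f"expected at least {minimum} {label}, found {len(found)}")
    # Every bar must declare a color.
    for bar in root.iter(SVG + "rect"):
        if bar.get("class") == "bar" and bar.get("fill") is None:
            problems.append(f"bar at x={bar.get('x')} has no fill color")
    return problems


print(validate(CHART, BAR_CHART_MODEL))
# ['bar at x=50 has no fill color']
```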
 

Content Awareness brings Capability to the User

 Any graphics system which has a built-in collection of routines that perform such operations as drawing bars, lines, reticules, titles, captions, etc. with minimum bother to the user might be called "intelligent" (at least by the marketing people). The intelligence provided is the ability of the system to visually construct major (and minor) components of the displayed graphic without the user having to define each line, circle, letter, color, and so on. Such systems have been around for a few years and are now largely taken for granted: "this is what a graphics system does."
 None of these systems have meta-data. Because SVG is in XML, all the mechanisms in XML technology that provide processing for meta-data, such as RDF, can easily be brought to bear on visual data just as they were originally for textual data.
 Namespaces and controlled vocabularies bring the possibility of rigorous definition of words (terms) and the concomitant usage of these same terms by programs, not just eyeballs.
 The term Content Awareness is used in the following way. SVG graphical objects are used within a collection. The collection itself has a title and caption, both made from controlled vocabularies; sub-collections also have titles and captions (and other markup). As needed, progressively finer grained chunks of the collection have their own markup. This means that at a number of different levels of graphical granularity there is meta-data pertinent to each level.
 This meta-data defines or relates the content. Awareness, of a sort, occurs when a program matches up meta-data in an actual SVG structure with possible meta-data (descriptions) in a discovery program. This content awareness, combined with models of graphics objects in a datastore, allows the system to perform high-level construct validation and also to "know" what transformations and other operations are possible and meaningful.
 It is simple, therefore, for such a system to depict an XML subtree of a sequence of numbers as a table and to display the same XML data as an xy-graph, a bar-chart, etc.
 

Querying Graphics

 Everyone knows that databases containing numeric values and / or text can be queried. These queries are done by string matching and number value comparisons.
 SVG graphics are visual presentations on a screen or some other visual medium. The visual aspect of these graphics has an underlying description which is executed to make the presentation. This underlying description consists of number values and words (character strings).
 In this respect the contents of an SVG structure are pretty much the same as the contents of a database system: numeric values and character strings.
 A query of an SVG graphic "by value" simply searches the XML structure for the number value (or set member) which underlies the lines, circles, colors, etc. of the graphic object. Because there is a grammar to SVG and there are tags, it is possible to navigate the SVG structure and locate XML content "by value".
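 A "by value" query of this kind can be sketched in Python as a walk over the SVG elements that tests one numeric attribute against a condition. The sample document and the function name are made up for the example.

```python
import xml.etree.ElementTree as ET

SVG = "{http://www.w3.org/2000/svg}"

# Made-up SVG fragment: two bars and a circle.
DOC = """<svg xmlns="http://www.w3.org/2000/svg">
  <rect class="bar" x="0" y="60" width="20" height="40"/>
  <rect class="bar" x="30" y="20" width="20" height="80"/>
  <circle cx="80" cy="40" r="5"/>
</svg>"""


def query_by_value(svg_text, attribute, predicate):
    """Return (element name, value) for every element whose numeric
    attribute satisfies the predicate."""
    root = ET.fromstring(svg_text)
    hits = []
    for elem in root.iter():
        raw = elem.get(attribute)
        if raw is None:
            continue
        try:
            value = float(raw)
        except ValueError:
            continue  # attribute is present but not a plain number
        if predicate(value):
            hits.append((elem.tag.replace(SVG, ""), value))
    return hits


print(query_by_value(DOC, "height", lambda h: h > 50))  # [('rect', 80.0)]
print(query_by_value(DOC, "r", lambda r: r == 5))       # [('circle', 5.0)]
```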
 A query of an SVG graphic "by content" institutes a search of the SVG XML structure (or its meta-data index, if one is maintained) for meta-data which is relevant to the query. There is much meta-data throughout the SVG XML structure and not all of it is relevant to any given query.
 What distinguishes a "value" from a "content" is the identity of the XML element ("tag") which "contains" it.
 A content query would look something like, "What are the values of the ROE (return on equity) table?" As previously described, the marked up text in the comment area of the SVG sub-structure that holds the graphical description of the ROE table contains the sought-after meta-data to match against.
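 The following Python sketch shows one possible shape for such a content query. It assumes, purely for illustration, that each table's graphic is an SVG `g` group whose `desc` element holds a marked up caption, and that each bar carries its source number in a `value` attribute; both the `rpt` namespace (`http://example.org/report`) and these names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"svg": "http://www.w3.org/2000/svg", "rpt": "http://example.org/report"}

# Hypothetical SVG with two table graphics, each described by a marked up caption.
DOC = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:rpt="http://example.org/report">
  <g id="t1">
    <desc><rpt:caption term="ROE">Return on equity</rpt:caption></desc>
    <rect rpt:value="11.2" x="0" y="44" width="20" height="56"/>
    <rect rpt:value="12.5" x="30" y="37.5" width="20" height="62.5"/>
  </g>
  <g id="t2">
    <desc><rpt:caption term="EPS">Earnings per share</rpt:caption></desc>
    <rect rpt:value="1.4" x="0" y="93" width="20" height="7"/>
  </g>
</svg>"""


def query_by_content(svg_text, term):
    """Answer "what are the values of the <term> table?" from the meta-data."""
    root = ET.fromstring(svg_text)
    value_attr = f"{{{NS['rpt']}}}value"
    for group in root.iterfind("svg:g", NS):
        caption = group.find("svg:desc/rpt:caption", NS)
        # Match on the controlled-vocabulary term, not on the display text.
        if caption is not None and caption.get("term") == term:
            return [
                float(elem.get(value_attr))
                for elem in group.iter()
                if elem.get(value_attr) is not None
            ]
    return []


print(query_by_content(DOC, "ROE"))  # [11.2, 12.5]
print(query_by_content(DOC, "EPS"))  # [1.4]
```

 The query is resolved by the element that contains the caption, not by any particular number, which is what separates it from a "by value" search.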
 As many of these "by content" queries are done using corporate vocabularies, an interface called Natural Language Menu (NLM) is used to aid the user in entering the query. NLM builds a natural language statement for the user by incrementally constructing it from a series of mouse clicks, rather than relying on (error prone) typing.
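 The incremental construction can be illustrated with a small Python sketch. It is not the NLM interface itself: the three-step phrase grammar, the vocabulary, and the class name are all hypothetical, chosen only to show how each click narrows the next set of choices so that only well-formed queries can be built.

```python
# Hypothetical phrase grammar: state -> (choices offered, next state).
GRAMMAR = {
    "start": (["compare", "show"], "measure"),
    "measure": (["return on equity", "earnings per share"], "company"),
    "company": (["for ABC", "for DEF"], None),
}


class QueryBuilder:
    """Builds a query one menu click at a time."""

    def __init__(self, grammar):
        self.grammar = grammar
        self.state = "start"
        self.words = []

    def choices(self):
        """Menu entries that are valid at this point in the sentence."""
        return self.grammar[self.state][0] if self.state else []

    def click(self, choice):
        """Accept one menu selection and advance to the next state."""
        if choice not in self.choices():
            raise ValueError(f"{choice!r} is not offered here")
        self.words.append(choice)
        self.state = self.grammar[self.state][1]
        return self

    def sentence(self):
        return " ".join(self.words)


builder = QueryBuilder(GRAMMAR)
print(builder.choices())  # ['compare', 'show']
builder.click("show").click("return on equity").click("for ABC")
print(builder.sentence())  # show return on equity for ABC
print(builder.choices())   # []
```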
