| XML'99: the dreams and the reality | Table of contents | Indexes | The Marriage of XML and Databases | |||
The Interchange of Mathematics in XML: MathML, OpenMath and their Application |
| Stephen Buswell |
| Director, Research and Development |
| Stilo Technology Ltd
Empire House, Mount Stuart Square, Cardiff CF1 6DN United Kingdom Phone: +44 (0) 1222 483 530 Fax: +44 (0) 1222 488 498 Email: sb@stilo.com Web: http://www.stilo.com |
Biographical notice: |
ABSTRACT: |
| XML Maths |
Introduction |
Introduction - Why Write Maths in XML? |
With these issues in mind, we will now consider two XML encodings for mathematical data: Mathematical Markup Language (MathML) and OpenMath. |
Mathematical Markup Language (MathML) |
Introduction to MathML |
MathML Features |
A very important point to note is that the uses of Presentation and Content element subsets are not mutually exclusive: a Content construct can have embedded presentation information (overriding the default presentation specified for that element). Similarly presentation constructs can have embedded content structures. |
MathML Presentation Elements |
The MathML presentation model offers a hierarchical, "boxes within boxes" mechanism that allows the user to control the size, positioning and spacing of every component of the expression to be displayed. It aims to achieve the same expressive power as TeX. |
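As an illustration of the "boxes within boxes" model, the following minimal sketch parses a small piece of presentation markup for x^2 + 1 and reports how its layout boxes nest. The element names (mrow, msup, mi, mn, mo) are MathML presentation elements; the sample expression, the helper names box_depth and outline, and the use of Python's standard XML parser are assumptions made for this example only and are not part of the recommendation.

```python
import xml.etree.ElementTree as ET

# Presentation markup for "x^2 + 1" (illustrative sample, namespace omitted).
PRESENTATION = """
<math>
  <mrow>
    <msup><mi>x</mi><mn>2</mn></msup>
    <mo>+</mo>
    <mn>1</mn>
  </mrow>
</math>
""".strip()


def box_depth(element):
    """Return the number of boxes nested at the deepest point of the tree."""
    return 1 + max((box_depth(child) for child in element), default=0)


def outline(element, indent=0):
    """List every box, indented by nesting level, with its token text if any."""
    text = (element.text or "").strip()
    lines = [" " * indent + element.tag + (f": {text}" if text else "")]
    for child in element:
        lines.extend(outline(child, indent + 2))
    return lines


root = ET.fromstring(PRESENTATION)
print("\n".join(outline(root)))
print("deepest nesting:", box_depth(root))
```

The superscript box (msup) sits inside the row box (mrow), which in turn sits inside the outermost math box; a renderer can size and position each box relative to the one that contains it.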
Definitions
|
MathML Content elements |
Clearly, providing a semantic model for the whole of mathematics would be a huge if not unachievable task. Not only this, but the day it was finished, it would be rendered out of date by the development of some new branch of mathematics. The MathML WG therefore set itself the task of providing basic support for "K-12" mathematics, that is for mathematics from kindergarten up to the end of high school or the first year of university education. The areas covered in the current MathML recommendation (1.0) are listed below. This list is under revision in the ongoing MathML review process. |
MathML Content elements have well-defined default semantics. That is to say that, for example, the symbol "sin" is not only the name of an element in the DTD, but also has the mathematical semantics which a user (or a mathematical application) would expect. Two communicating applications can therefore rely on a consistent interpretation of the symbol. These default semantics can be overridden by the user if desired. |
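To make the idea of default semantics concrete, here is a minimal sketch of a receiving application that evaluates a content expression for sin(x) + 1. The elements apply, plus, times, sin, ci and cn are MathML Content elements; the evaluator itself, the OPERATORS table and the function name evaluate are hypothetical and cover only the handful of elements needed for the example.

```python
import math
import xml.etree.ElementTree as ET

# Content markup for "sin(x) + 1" (illustrative sample, namespace omitted).
CONTENT = "<apply><plus/><apply><sin/><ci>x</ci></apply><cn>1</cn></apply>"

# Assumed mapping from element names to their conventional meaning.
OPERATORS = {
    "plus": lambda *args: sum(args),
    "times": lambda *args: math.prod(args),
    "sin": math.sin,
}


def evaluate(node, bindings):
    """Evaluate a tiny subset of Content MathML given values for variables."""
    if node.tag == "cn":
        return float(node.text)
    if node.tag == "ci":
        return bindings[node.text.strip()]
    if node.tag == "apply":
        # The first child names the operator, the rest are its operands.
        operator, *operands = list(node)
        function = OPERATORS[operator.tag]
        return function(*(evaluate(arg, bindings) for arg in operands))
    raise ValueError(f"unsupported element: {node.tag}")


value = evaluate(ET.fromstring(CONTENT), {"x": 0.0})
print(value)  # prints 1.0
```

Because the meaning of sin and plus is fixed by the recommendation rather than by the sender, any application holding such a table would arrive at the same value for the same input.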
MathML has an extension mechanism which allows a user to create a new symbol, for example to represent a function not in the predefined K-12 element set. This mechanism can also provide a reference to an external definition of the semantics of such a user-defined symbol (effectively extending the semantic scope of MathML). Such an external definition could take many forms: a reference to a standard text or a function in a well-known computer algebra system, for example. One possible method for the formal, machine-readable, definition of these semantics is supplied by OpenMath. |
Definitions
|
Tools and Products Supporting MathML |
One of the most interesting and encouraging aspects of the definition process of MathML has been the parallel emergence of tools and products supporting the recommendation as it took shape. Indeed, the experience of the implementers of these tools has provided significant input to the development and refinement of MathML itself. Some MathML tools, available or under development, are listed here: |
Definitions
|
MathML and Other Standards and Recommendations |
As one of the first application-specific recommendations issued by the W3C, MathML has become a motivating example for many of the W3C working groups, for example those concerned with schemas, flow objects and formats, and query languages. The MathML working group has therefore been able to provide input to the various groups occupied with these areas. The working group is also in discussion with the two major browser manufacturers concerning the implementation of native MathML support within the browsers. |
The ESPRIT OpenMath Project |
Overview of the OpenMath Project |
Closely related to MathML is the European ESPRIT project OpenMath. OpenMath came into existence as an informal group in the academic and industrial Computer Algebra community interested in inter-application communication of mathematical objects. In 1997 ESPRIT, the Information Technology programme of the European Community, commenced support for a project involving nine European members of this group to define the standard in detail and develop supporting technology. The wider grouping continues in the global OpenMath Society, which co-ordinates the efforts of various OpenMath-related projects and activities around the world. In particular, there is close co-operation with the North American OpenMath Initiative (NAOMI). |
OpenMath aims to develop a standard for the interchange of semantically rich mathematical objects between communicating applications. |
|
Fields of investigation include mathematical databases, specialist processing systems, interactive and distance learning and distributed mathematical software. The OpenMath project is also developing tools and prototype industrial applications in areas of particular interest. Public seminars are held to disseminate information and generate user feedback. |
The OpenMath Technical Approach |
OpenMath takes a slightly different approach from MathML to the encoding of a mathematical expression. The core OpenMath language is very small, but closely associated (although outside the formal syntax) is a vocabulary of symbols which are defined in a (freely extensible) set of Content Dictionaries. OpenMath 'symbols' represent functions, operators, variables and so on. |
An OpenMath object itself has an abstract syntax which is a recursively extensible tree of symbols and objects. An object can have a representation in one or more formats: XML or binary, for example. This differs from the view in MathML, where an object is defined by its XML representation. Nodes of the tree can be attributed. The attribute is itself an OpenMath object and encodes contextual information not derivable from the data. |
A Content Dictionary (often referred to as a CD) specifies the mathematical semantics of an OpenMath symbol: type signatures, formal mathematical properties (e.g. associativity) and so on. OpenMath objects can have more rigorous semantics than MathML constructs. For example, there is support for formal type systems and type-inference mechanisms. |
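The separation between abstract object and concrete encoding can be sketched as follows. The tree classes and the functions encode and to_xml are hypothetical names introduced for this example. The XML element names (OMOBJ, OMA, OMS, OMV, OMI, OMATTR, OMATP) are those of the OpenMath XML encoding as later standardised, and the CD name "arith1" is assumed from the published standard CDs; both may differ from the project drafts current at the time of writing. The attribution key is deliberately made up.

```python
from dataclasses import dataclass
from typing import Tuple, Union
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class Symbol:
    cd: str      # name of the Content Dictionary defining the symbol
    name: str


@dataclass(frozen=True)
class Variable:
    name: str


@dataclass(frozen=True)
class Integer:
    value: int


@dataclass(frozen=True)
class Application:
    head: "Node"
    arguments: Tuple["Node", ...]


@dataclass(frozen=True)
class Attribution:
    key: Symbol      # what kind of contextual information is attached
    value: "Node"    # the attribute is itself an OpenMath object
    target: "Node"   # the object being attributed


Node = Union[Symbol, Variable, Integer, Application, Attribution]


def encode(node):
    """Produce one possible encoding (XML) of the abstract tree."""
    if isinstance(node, Symbol):
        return ET.Element("OMS", cd=node.cd, name=node.name)
    if isinstance(node, Variable):
        return ET.Element("OMV", name=node.name)
    if isinstance(node, Integer):
        element = ET.Element("OMI")
        element.text = str(node.value)
        return element
    if isinstance(node, Application):
        element = ET.Element("OMA")
        for part in (node.head, *node.arguments):
            element.append(encode(part))
        return element
    if isinstance(node, Attribution):
        element = ET.Element("OMATTR")
        pairs = ET.SubElement(element, "OMATP")
        pairs.append(encode(node.key))
        pairs.append(encode(node.value))
        element.append(encode(node.target))
        return element
    raise TypeError(f"not an OpenMath node: {node!r}")


def to_xml(node):
    """Wrap the encoded tree in the outermost object element."""
    wrapper = ET.Element("OMOBJ")
    wrapper.append(encode(node))
    return ET.tostring(wrapper, encoding="unicode")


expr = Application(Symbol("arith1", "plus"), (Variable("x"), Integer(1)))
# "hypothetical_cd" and "source_line" are invented purely for illustration.
attributed = Attribution(Symbol("hypothetical_cd", "source_line"), Integer(7), expr)
print(to_xml(expr))
print(to_xml(attributed))
```

A binary encoder could be written against the same classes without touching them, which is the sense in which the object is independent of any one representation.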
A given CD contains symbols from a particular area of mathematics such as Linear Algebra: two OpenMath-aware applications can communicate if they process the same CD. This allows specialist systems to be OpenMath-compliant by implementing a minimum interface, possibly even only for one CD. |
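The rule that two applications can communicate when they process the same CDs suggests a simple compatibility check, sketched below under stated assumptions. The function names required_cds and can_process are hypothetical, and the CD names "arith1" and "transc1" are taken from the later-published standard CDs rather than from this paper. The check only compares CD names; it says nothing about whether an application implements every symbol in a CD.

```python
import xml.etree.ElementTree as ET

# An OpenMath object for "sin(x) + 1" in the XML encoding (illustrative).
OBJECT = (
    '<OMOBJ><OMA><OMS cd="arith1" name="plus"/>'
    '<OMA><OMS cd="transc1" name="sin"/><OMV name="x"/></OMA>'
    '<OMI>1</OMI></OMA></OMOBJ>'
)


def required_cds(openmath_xml):
    """Collect the names of all CDs whose symbols occur in the object."""
    root = ET.fromstring(openmath_xml)
    return {symbol.get("cd") for symbol in root.iter("OMS")}


def can_process(openmath_xml, supported_cds):
    """True if every CD used by the object is supported by the receiver."""
    return required_cds(openmath_xml) <= set(supported_cds)


print(sorted(required_cds(OBJECT)))
print(can_process(OBJECT, {"arith1"}))
print(can_process(OBJECT, {"arith1", "transc1", "linalg1"}))
```

A specialist system that supports a single CD can run this kind of check before accepting an object, and decline anything that falls outside its scope.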
The project is developing a core set of CDs: others can be defined at will by OpenMath users. Developing a new CD does not require any change to the core OpenMath language, or affect the compatibility of previously developed applications. It is this clean separation between the core language and the semantics of symbols in newly-created CDs which gives OpenMath its extensibility. |
The core set of CDs numbers (at the time of writing) some 28. The reason that there are so many small CDs in the core is to allow easy selection of tightly defined, application-specific subsets, called "CD Groups". For example, a subset can be chosen so that the semantic scope covered by the symbols in these CDs is equivalent to the scope of Content MathML. This equivalence allows a precise definition of OpenMath-MathML compatibility and gives a firm basis for the development of conversion tools. |
Alongside the 20 CDs in the "MathML equivalence" CD Group, there are another 8 providing some basic symbols not in MathML, and a basis for a formal type system. In addition to the core, the project is developing further some areas of special interest, including Algebra, Polynomials, Group Theory and Theorem Proving Systems. |
The Synergy between MathML and OpenMath |
Although OpenMath started out as an activity completely independent of MathML, it was quickly realised once the ESPRIT project was running that MathML had become an important part of the mathematical universe. The project has therefore chosen to avoid duplicating the MathML work, or creating any possibility of 'competing' standards, and to concentrate instead on the synergy between the two and on the development of interchange technology. |
MathML and OpenMath are in many respects complementary. OpenMath can provide an extension and formalisation of the mathematical semantics in MathML. In turn MathML offers a visualisation and publication mechanism for OpenMath. In addition, many tools are likely to provide interoperability. (This is particularly related to the fact that both notations have XML encodings). The relationship between the various parts of MathML and OpenMath can be visualised as below: |
|
We can expect to see OpenMath in use in areas where the formal mathematical properties of an object are paramount - in specialist engineering applications, research and so on. MathML will become widespread for example in education and publishing where the visual aspects of a mathematical expression may be as important as (or more important than) the underlying semantic content. The balance between MathML presentation and content will depend on this relative importance. |
As there is a clear common interest between the scopes of OpenMath and MathML, it is also inevitable that many expressions will start life in one form and at some point be re-represented in the other. One interesting example of this occurs in a prototype being developed by the ESPRIT project. Here MathML mechanisms are used to provide a browser interface with a visual rendering of an expression - this is then transformed into OpenMath for processing by a commercial numerical library before re-transformation of the processing results for display. The intention is to develop a model for an intelligent interface to the user documentation of the numerical library. |
Conclusions |
Both the MathML and OpenMath encodings preserve the internal substructure of an expression, both from the semantic and, in presentation MathML, layout viewpoints. The information encoded is therefore recoverable and reusable, in whole or in parts. This offers the prospect of interchange, both vertically and horizontally. The same encoding used for a piece of mathematics in a print-publishing system can be re-used in a web page or electronic journal. The mathematical semantics of an expression can be passed between web pages, interactive textbook CDs, and specialist mathematics servers for processing, display and user interaction. This approach should find many useful applications in education and research, publishing, engineering and allied fields. |
Acknowledgments |
The author explicitly acknowledges the contributions of all members of the W3C Math Working Group to the MathML recommendation. Parts of this paper are derived significantly from this work. The work of the OpenMath community, and in particular the valuable contributions of the team members of the ESPRIT OpenMath project, are also acknowledged with thanks. |
|