| Guidelines for using XML for Electronic Data Interchange | Table of contents | Indexes | Regulations Worldwide Online at the
Siemens
|
|||
Implementing a Link Editor |
|
Eduardo Gutentag |
| Staff Engineer |
| Sun Microsystems, Inc. 17 Network Circle, MPK17–102 Menlo Park California USA 94025 Phone: (650) 786-5498 Fax: (650) 786-5727 Email: eduardo@eng.Sun.COM |
Biographical notice: |
Eduardo Gutentag |
ABSTRACT: |
The implementation of a link editor as a customized layer on top of the authoring tool was recognized very early by SunSoft as having the highest priority. The link editor allows writers to point and click in order to create internal and external links without having to know the first thing about ID/IDREFs or about FPIs. Many writers and implementors will recognize the need for this tool in their environment. |
The Background |
|
In 1991 SunSoft (an operating company of Sun Microsystems, Inc.) introduced the first AnswerBook technology as part of the Solaris environment. AnswerBooks were one of the first attempts in the computer industry to deliver online documentation with a reasonable navigation method and a powerful full-text search engine. |
At the time we thought that AnswerBooks would meet our needs for many years to come. |
Unfortunately, AnswerBooks were based on a proprietary manipulation of PostScript. That was bad enough, but on top of that it soon became clear that the one-to-one correspondence between the online and the printed page presented more problems than it solved. Among them were the inability to cut and paste and the inability to generate other formats from the source material. |
So in 1994 we decided to revamp the whole system from the ground up, using SGML (Standard Generalized Markup Language). |
Of course, if we were going to use SGML, we needed either a reliable conversion system to go from the WYSIWYG (What You See Is What You Get) editor in use then to SGML every time we needed to produce SGML, or we needed a good SGML authoring system. And, since there really is no method to convert cheaply from unstructured to structured markup with 100% reliability and no human intervention, we decided to migrate to an SGML authoring environment. |
This kind of migration is not easy, as some of you may know. |
The Problem |
|
So there we were, sometime in the middle of 1995: we had already chosen Adept*Editor from ArborText, the most robust editing and printing tool that the market offered at that time, and we had already chosen the DocBook DTD as the basis for the DTD that our writers were going to use. |
And I had already started having nightmares. |
In my nightmares I was being confronted by a group of confused writers, and a Question and Answer session developed between us: |
Q: How do I insert a link to a chapter in my book? |
A: Do you want a gentext link, or a link with authored text? |
Q: What is gentext? |
A: That is where the FOSI (Format Output Specification Instance) inserts the text of the title of the element that you are targeting as the linkend of your link. |
Q: What is a FOSI? |
A: It's a Format Output Specification Instance. It's used for formatting. |
Q: But I heard we were using style sheets... |
Q: What is “the linkend of my link”? |
A: That's the target of your link, the thing you are cross-referencing to. |
Q: So how do I do it? How do I tell my link to go somewhere? |
Q: Say what? |
Indeed. Say what. |
The solution |
|
One early answer to this quandary was to rely on good and solid training. But training only goes so far, is very expensive, and still doesn't solve the problem of the writer who's hired to fill a position that's been open for a couple of months and is told by the manager “Look, we have a crisis on our hands; you'll get your training after we finish with it. For now, just go and do your job.” |
The idea of hiring only SGML-trained writers was considered, discussed, examined, and rejected in about one minute. Those writers simply do not exist. |
It was at about this point that I said that of course this problem wouldn't even arise if the application had a good link editor. |
The words “link editor” became, at that point, the magic mantra that would save our day. And I was assigned to design and implement it. |
The design |
The Link Editor's design has not changed in the past two years, as it went from a couple of words tossed carelessly in a brainstorming session to a concrete reality. |
One of its first elements is that whoever uses it does not have to know the first thing about SGML or about the DTD in use. |
This basic design principle established the basis for all the rest: |
|
As you can see from the above, some of the design principles, when condensed in one or two sentences, seem pretty basic, not to say lame, obvious and not worth the time it takes to enunciate them. |
However, the implementation strategies forced by these “brain-dead” principles can be anything but. |
The implementation |
One of my first programming teachers once said that a good programmer never writes a line of original code but instead “borrows” code from existing applications. |
![]() |
and changed it into: |
![]() |
The main visible difference is what appears at the top of the panels, and the fact that the “Find” and “View” menu items are replaced by “Switch Modes”, which, when selected, shows: |
![]() |
This allows users to select between the four possible types of links they can author. Note that the aesthetics of a well proportioned menu have given way to ease-of-understanding requirements. |
The appearance of the “Links” mode of the link editor is exactly the same as that of the “Xrefs” mode; the “Ulinks” mode is actually just a little help window with a “click here” notice that loads the mouse, as it were: |
![]() |
And, once the mouse is loaded, a dialog appears, prompting for the desired URL. |
![]() |
After entering the URL, the writer is prompted for “hot text”: |
![]() |
If the writer enters anything, that is what will become the “hot text” in the output; otherwise the URL itself becomes the hot text. |
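The hot-text rule above can be illustrated with a minimal sketch. It assumes the link is written as a DocBook `ulink` element; the function name `build_ulink` is hypothetical and is not part of the Link Editor itself.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ulink(url: str, hot_text: str = "") -> str:
    """Return a DocBook ulink element for the given URL (illustrative only)."""
    # Fall back to the URL itself when the writer leaves the hot text empty.
    text = hot_text.strip() or url
    # quoteattr() adds the surrounding quotes and escapes markup characters.
    return f"<ulink url={quoteattr(url)}>{escape(text)}</ulink>"
```

Calling `build_ulink("http://www.sun.com")` produces a link whose visible text is the URL, while passing a second argument makes that argument the hot text.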
The appearance of the “Olinks” mode is very similar to that of the “Xrefs” and “Links” ones, but the behavior is somewhat different. Where “Xrefs” and “Links” present the hierarchy of a single document, the “Olinks” mode first presents a list of all the AnswerBooks and Collections that there ever were. Each one of them is expandable into the list of books that it includes, and each of the books, if written in SGML, is in turn expandable into the legitimate link targets that it contains. |
The internals |
Internal links |
|
In order for the above to be accomplished, each line in the Link Editor, when in “Xrefs” or “Links” mode, carries PI (Processing Instruction) information about the location of the target in the document. For example: |
<?Pub Lcl Command="exec xtoc::linktop_oid('56','(13,1,56)')"> |
A double-click on the above target in the Link Editor loads that information into memory, and maps mouse double-clicks to the execution of that command. When the user next double-clicks in the insertion area, the executed function obtains the id value of the location that was put in memory, writes the link into the document with this new information, and re-maps mouse double-clicks back to the default. |
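The “loaded mouse” behavior can be sketched as a small state machine. The following Python is only an illustration of the idea, not the Adept command code: the `Editor` class, its method names, and the list-of-strings document model are all hypothetical.

```python
class Editor:
    """Toy stand-in for the authoring tool; every name here is hypothetical."""

    def __init__(self):
        self.document = []          # document modeled as a list of text chunks
        self.pending_target = None  # id of the target chosen in the Link Editor
        self.on_double_click = self.default_double_click

    def default_double_click(self, position):
        # Default mapping: a double-click does nothing link-related.
        pass

    def load_mouse(self, target_id):
        # Remember the target and re-map double-clicks to the insert action.
        self.pending_target = target_id
        self.on_double_click = self.insert_link

    def insert_link(self, position):
        # Write the link at the clicked position, then restore the default mapping.
        self.document.insert(position, f'<xref linkend="{self.pending_target}">')
        self.pending_target = None
        self.on_double_click = self.default_double_click
```

After `load_mouse("some-id")`, the next call to `on_double_click(position)` inserts exactly one link; any further double-click falls back to the default handler.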
External links |
|
When the Link Editor is in “Olinks” mode, however, the internals are quite different and, of course, a bit more complicated. |
Initially the Link Editor shows all AnswerBooks and Collections ever registered in the database, whether actually published or not: |
![]() |
Hidden in each of the lines and completely invisible to the user, there is a PI containing the following information: |
<?Pub Lcl Command="xtoc::display_books(47,5,'Solaris 2.7 System Administrator Collection')"> |
When the line is double-clicked, the display_books() function is executed, and it retrieves from the database the list of books contained in Collection 47.5, which is how it is known to the database: |
![]() |
![]() |
And, once again, clicking the guillemet quotes contracts the book; clicking the book name loads the mouse to insert an olink: |
<olink targetdocent="BINARY" localinfo="INTRO-23217" type="V-ONLY"><quote>To Be or Not to Be 64–bit</quote> in <citetitle>64–bit Solaris Application Development</citetitle></olink> |
At the same time, a declaration is introduced in the document, if it doesn't exist already, of the form: |
<!ENTITY BINARY PUBLIC "-//Sun::SunSoft//DOCUMENT BINARY Version 2.0//EN" NDATA sgml> |
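The “if it doesn't exist already” check can be sketched as follows. This is a minimal illustration that treats the document's internal declaration subset as a plain string; the function name `ensure_entity` is an assumption and does not come from the Link Editor.

```python
import re


def ensure_entity(internal_subset: str, name: str, fpi: str) -> str:
    """Return the subset with an NDATA entity for `name`, adding it only if absent."""
    # A declaration for this entity name is already present: change nothing.
    if re.search(rf"<!ENTITY\s+{re.escape(name)}\s", internal_subset):
        return internal_subset
    declaration = f'<!ENTITY {name} PUBLIC "{fpi}" NDATA sgml>'
    if not internal_subset.strip():
        return declaration
    return internal_subset.rstrip("\n") + "\n" + declaration
```

Because the function is idempotent, inserting several olinks to the same book leaves a single declaration in the document.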
Once the document is published as part of an AnswerBook2 collection, when the user double-clicks the hot text, the AnswerBook2 server translates all this information into the URL needed to locate the targeted book and the internal location of the target. |
The Database |
Our database, although it could be called “primitive” in that it is a straightforward relational database at this point, contains all the information we need about all the books and collections, and then some more. |
Each book is assigned, before anyone actually starts writing it, a part number. Each book also has a short name associated with it and with the family of books to which it belongs. For instance, the book on Solaris 2.5 System Administration belongs in the same family as the book on Solaris 2.6 System Administration, and they both therefore share the same short name. However, they each have a different version number and a different part number. |
Each book can also be associated with more than one AnswerBook collection. |
Each book's canonical title is registered in the database, and so is each AnswerBook's title. |
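The relationships just described (part number, shared short name, version, and membership in more than one collection) could be modeled roughly as below. This is a sketch using Python's `sqlite3` module; the table names, column names, and the `books_in_collection` helper are hypothetical and are not the actual schema of our database.

```python
import sqlite3

# Hypothetical schema: one row per book, one per collection,
# and a join table because a book may appear in several collections.
SCHEMA = """
CREATE TABLE book (
    part_number TEXT PRIMARY KEY,
    short_name  TEXT NOT NULL,
    version     TEXT NOT NULL,
    title       TEXT NOT NULL
);
CREATE TABLE collection (
    collection_id TEXT PRIMARY KEY,
    title         TEXT NOT NULL
);
CREATE TABLE collection_book (
    collection_id TEXT REFERENCES collection(collection_id),
    part_number   TEXT REFERENCES book(part_number),
    PRIMARY KEY (collection_id, part_number)
);
"""


def books_in_collection(db: sqlite3.Connection, collection_id: str) -> list:
    """Return the canonical titles of the books registered in one collection."""
    rows = db.execute(
        "SELECT b.title FROM book b "
        "JOIN collection_book cb ON cb.part_number = b.part_number "
        "WHERE cb.collection_id = ? ORDER BY b.title",
        (collection_id,),
    )
    return [title for (title,) in rows]
```

A query of this kind is the sort of lookup a function such as display_books() would need when a collection is expanded in “Olinks” mode.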
When a book is checked in to the repository, the integrity of the data is verified against the database, and the book is accepted or rejected by virtue of its compliance (among other things). At the same time, the book is scanned, and a list of targetable contents and their IDs is obtained. This is the list presented to the writer when a book is expanded in the Link Editor. |
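The scanning step can be approximated with a short sketch. It assumes a regular-expression pass over the SGML source is good enough for illustration (a real implementation would use an SGML parser), and the set of targetable element types in `TARGET_ELEMENTS` is a guess, not the list our check-in tool uses.

```python
import re
from typing import List, Tuple

# Hypothetical list of element types that may serve as link targets.
TARGET_ELEMENTS = ("chapter", "sect1", "sect2", "figure", "table")

# Matches a start tag that carries a double-quoted id attribute.
START_TAG_WITH_ID = re.compile(r'<(\w+)\b[^>]*?\bid\s*=\s*"([^"]+)"', re.IGNORECASE)


def scan_targets(sgml: str) -> List[Tuple[str, str]]:
    """Return (element, id) pairs for every targetable element that carries an ID."""
    return [
        (element.lower(), ident)
        for element, ident in START_TAG_WITH_ID.findall(sgml)
        if element.lower() in TARGET_ELEMENTS
    ]
```

Elements that carry an ID but are not in the targetable list, such as a paragraph, are skipped, so the writer only sees legitimate link targets.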
But why go to such lengths? |
If in the example above there was a link going to part number 805–1123 (the Solaris 2.6 System Administration book), how would one be able to determine that, in its absence, it is OK to link to part number 802–1432 (the Solaris 2.5 System Administration book)? There is no relationship between the two part numbers. |
However, because the link actually goes to the EN version of Version 2.0 of SYSADMIN, it is possible to degrade or upgrade to the JA version or to Version 1.9 of the same. In this sense, the use of FPIs to indicate link targets can be compared to the use of URNs to locate URLs (the difference being, of course, that URNs are still not there). |
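To make the degrade/upgrade idea concrete, here is a minimal sketch. It assumes FPIs follow the pattern of the entity declaration shown earlier (short name, “Version”, version number, language code), and the ranking in `resolve` (exact match first, then same language, then highest version) is a hypothetical policy chosen for illustration, not the rule the AnswerBook2 server applies.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class BookRef(NamedTuple):
    short_name: str
    version: Tuple[int, ...]
    language: str


# Assumed FPI layout, modeled on the BINARY declaration shown earlier.
FPI_PATTERN = re.compile(
    r"^-//Sun::SunSoft//DOCUMENT (?P<name>\S+) "
    r"Version (?P<version>\d+(?:\.\d+)*)//(?P<lang>[A-Z]{2})$"
)


def parse_fpi(fpi: str) -> BookRef:
    """Split an FPI into short name, numeric version tuple, and language."""
    match = FPI_PATTERN.match(fpi)
    if match is None:
        raise ValueError(f"unrecognized FPI: {fpi!r}")
    version = tuple(int(part) for part in match["version"].split("."))
    return BookRef(match["name"], version, match["lang"])


def resolve(wanted: BookRef, available: List[BookRef]) -> Optional[BookRef]:
    """Pick a substitute from the same family of books, or None if there is none."""
    family = [book for book in available if book.short_name == wanted.short_name]
    if not family:
        return None
    # Hypothetical policy: exact match, then same language, then highest version.
    return max(
        family,
        key=lambda book: (book == wanted, book.language == wanted.language, book.version),
    )
```

The point of the sketch is that the short name carried by the FPI ties the family together, which is exactly what the two unrelated part numbers cannot do.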
The auxiliary functions |
|
In order for the link editor to function correctly, auxiliary functionality must also be in place: automatic ID generation, a paste callback, and link updates.
|
Automatic ID generation |
The automatic ID generation functionality that we have in place currently is far from ideal, although it performs properly and does everything that's needed. |
I mention ideal because, at least theoretically, keeping track of the last assigned ID number should be done through a text entity; however, there are practical reasons why we cannot do that yet. |
Similarly, a writer can run a specialized function to insert IDs throughout the book at any time. This is especially useful after doing a copy-and-paste as indicated below. |
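A function that inserts IDs throughout a book might look roughly like the sketch below. The ID format (`PREFIX-number`), the list of element types, and the explicit `last_id` counter passed in by the caller are all assumptions made for illustration; they are not a description of how our function stores or formats IDs.

```python
import re
from typing import Tuple

# Hypothetical set of element types that should always carry an ID.
TARGET_TAG = re.compile(r"<(chapter|sect1|sect2|figure|table)\b([^>]*)>", re.IGNORECASE)


def insert_ids(sgml: str, prefix: str, last_id: int) -> Tuple[str, int]:
    """Give every targetable start tag without an ID a fresh one.

    Returns the updated text and the last ID number that was assigned.
    """
    counter = last_id

    def add_id(match):
        nonlocal counter
        element, attributes = match.group(1), match.group(2)
        if re.search(r"\bid\s*=", attributes, re.IGNORECASE):
            return match.group(0)  # already has an ID: leave it untouched
        counter += 1
        return f'<{element} id="{prefix}-{counter}"{attributes}>'

    return TARGET_TAG.sub(add_id, sgml), counter
```

Returning the updated counter lets the caller persist the last assigned number wherever it is kept, so later runs never reuse an ID.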
Paste callback |
One of the most bothersome issues when trying to keep one's document clean of duplicated IDs is what to do when pasting text that contains IDs. |
This is generally the source of most duplicated IDs. |
If the user answers in the affirmative, the buffered text is traversed, and any ID attributes in it are deleted before pasting the buffer in. |
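The traversal of the buffered text can be sketched as follows. This is a simplified illustration that works on the buffer as a string with regular expressions; the function name `strip_ids` is hypothetical, and the real callback operates on the editor's own buffer structures.

```python
import re

# An id attribute with a double-quoted, single-quoted, or unquoted value.
ID_ATTRIBUTE = re.compile(r"""\s+id\s*=\s*(?:"[^"]*"|'[^']*'|[\w.-]+)""", re.IGNORECASE)
START_TAG = re.compile(r"<[A-Za-z][^>]*>")


def strip_ids(buffer: str) -> str:
    """Delete every ID attribute found inside a start tag of the paste buffer."""
    return START_TAG.sub(lambda tag: ID_ATTRIBUTE.sub("", tag.group(0)), buffer)
```

Only the ID attributes themselves are removed; references such as `linkend` are left alone, so pasted cross-references still point at the original targets.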
Link Updates |
Whenever a document contains a link to another document, there is a risk that the text referring to the target will get out of sync with the text that actually exists in the target. |
Without going into the most intimate details of how our link updating tool works, I can say that its operation is relatively simple. |
The author is first presented with the list of all the books the document links to: |
![]() |
After selecting one of them, the writer is presented with the list of all the AnswerBook collections in which that book appears: |
![]() |
After selecting an AnswerBook or Collection, the writer is presented with a confirmation panel. |
![]() |
After confirming the choice, the writer goes to the next book, and so on until all the books are done. |
At the end of the process all entity declarations are updated to the right version and language, and all the link text is updated to reflect the latest available version of that particular book. |
The moral |
The question of whether something as conceptually simple as linking merits this kind of development effort is a legitimate question to pose at this point. |
Wouldn't training be a better way to go? |
Why can't writers just insert the IDs by hand? |
Why can't writers just access the database through other methods when they want to link to external books? |
Why can't writers accept that writing in SGML is hard, and just get over it, and if they don't like it, tough luck? |
In short: is it worth it? |
The answer, from my perspective, is a resounding yes. There are a variety of reasons why. |
And finally, because who among you would have shown any interest in a talk about link training? |