Datasheets and Databooks at Fairchild Semiconductor |
| Cory Kuhl
Marketing Technical Documentation Manager, Fairchild Semiconductor
South Portland, Maine
Biographical notice Cory Kuhl has been with Fairchild Semiconductor and National Semiconductor for 6 years. Fairchild Semiconductor became a separate company in March of 1997. Ms. Kuhl is the project lead for the Fairchild Technical Documentation system now in place. Her involvement with SGML started in December 1996 when the project was identified. Prior experience is with computer operations, business systems, technical marketing and engineering. |
| Lynne Price
President, Text Structure Consulting, Inc.
Castro Valley, California
Biographical notice Lynne A. Price is an independent consultant who has been involved in SGML since 1985. She is an active participant in the international SGML standards committee (ISO/IEC JTC1/WG4) and its U.S. counterpart (NCITS V1) and is the editor of ISO/IEC 13673, the standard on Conformance Testing of SGML Systems. For several years, she was a member of the development team for Adobe's FrameMaker+SGML and left Adobe in February, 1996 to start Text Structure Consulting, Inc. Her interests in structured documentation extend to her graduate school work. In 1978, she completed a Ph.D. in computer sciences at the University of Wisconsin-Madison, writing a dissertation titled “Representing Text Structure for Automatic Processing.” |
Publishing Requirements |
| Technical documentation at a semiconductor company is one of the most important selling tools the company has. When Fairchild Semiconductor Corporation separated from National Semiconductor Corporation in 1997, each division was faced with the task of duplicating or recreating the publishing environment that had been provided by the combined organization. After analyzing the division's business model, the Logic Division chose to bring the publishing process in-house and built a WYSIWYG SGML environment for creating datasheets, documents, and databooks. Since Fairchild's Logic Division's publishing requirements were not as complex as those of all of National, the new environment was greatly simplified. |
| The division produces documents typical of the semiconductor industry, such as datasheets and application notes. Fairchild publishes these, along with other miscellaneous documents, primarily to the World Wide Web and in bound collections called databooks. The documents consist of black-and-white technical information that contains large tables and graphics as well as text. |
| The first page of a typical datasheet is shown below. |
|
| The Logic Division provides information on the components manufactured by Fairchild Semiconductor to customers and third parties. While it is a long-term goal to make this information available electronically, in early 1998 there was an immediate need to preserve current formats in a changing corporate environment. A system was needed that, without forcing the data through a conversion, could produce individual datasheets 6–12 pages long as well as complete databooks. The datasheets are distributed in Adobe® Acrobat® PDF format on Fairchild's web site, and databooks are sent in Adobe® PostScript® to a printer's web press for printing in quantities of five thousand or more. An additional requirement was to be able to address the mid-term goal of cost-effectively producing smaller and smaller quantities of customized books. |
| This paper discusses how a system to meet the immediate need was designed, developed, and implemented. |
Prior Production Environment |
| Although some groups used other tools, as part of National, the Logic Division produced documents mainly using the systems and standards defined by the corporate technical documentation group. This group was responsible for 14 different product lines and over 30,000 pages of documentation. To maintain this volume of material, it used a highly integrated document management system and composition system. The mainstream production sequence included the following steps: |
| This process of editing and reviewing was usually done many times per document due to formatting problems, data changes, and missing edits that needed to be completed. |
| When the document was complete, metadata stored with the files in the database was updated to indicate that the document had moved from work-in-progress to a released state. Once released, the document could be transferred to the Web site or used in the publication of a databook. To produce a databook, the vendor's production services staff extracted the individual documents from the database with some assistance in layout from the Technical Documentation Group. |
A New Approach for a New Company |
| When Fairchild separated from National, Fairchild's technical documentation groups lost access to the tools and services of National's corporate technical documentation group. As an independent company, Fairchild runs on a divisional business model. There is no corporate documentation group. Instead, each division of the company is responsible for its own documents and can produce them as it sees fit, provided there is company-wide consistency in the following areas: |
| The business model and climate of the company also demanded that a new system be cost-effective for the types of parts and documents produced by a single division, and that the new system be installed and productive in a matter of months. |
Different Paths for Different Divisions |
| The parameters for composition of documents were limited to the HTML file for Web publication and the companion PDF file. As long as these documents were produced and entered into the document management system with a specific set of attributes, documents could be published to the Web site. There were no corporate restrictions on how to produce documents with these attributes. |
| Thus, the new company consistency guidelines addressed finished documents but not the methods for producing them. Each division wanted to continue using existing successful methods as much as possible. One division, responsible for a small number of documents, used Adobe® PageMaker®. Another division, which joined the company during the transition period, produced documents using Adobe® FrameMaker®. Divisions that were not dependent on the old National tools wanted to keep the processes they already had in place for working with their existing documents. |
Separate Tools for Document Management and Composition |
| There was a strict requirement for a company-wide document repository, based on a corporate document management system, with the possibility of expanding the system in the future. This requirement, combined with the philosophy that allowed each division to choose its own composition tools, meant that the document repository was not tightly integrated with the composition system. This separation, resulting from the division business model, the financial business model, and experience to date, was a significant change from previous uses of SGML. Although automatic links between the composition and document management systems may be investigated at a later time, for now there is no automatic population of metadata attributes of objects in the database. |
| All Fairchild divisions had to use this system since it was the repository for publication of documents to the company's Web site. Its development included: |
Preserving Existing Data |
| In addition to designing processing for Fairchild's future documents, there were numerous questions about existing data. Existing files needed to move from the old National document management system into the new Fairchild one. Would each division transfer its data independently? Would the transfer be made at once for all divisions? Did each division work independently? The divisions' document repositories were dramatically different. The Logic Division had over 1500 documents, each consisting of 6 to over 40 files. Some of these documents existed in multiple versions. One of the other divisions that needed to migrate data had 350 documents (each with only two files) and the other had fewer than 100 documents (containing 6 to 40 files each). Since it had a much greater volume of data to process and a bigger documentation staff, the Logic Division led the migration effort for the company. In this process, many of the metadata attributes and procedures for releasing information to the Web site were retained. |
Business Model for the Logic Division's Composition System |
| Although the other two divisions planned to continue use of their existing PageMaker and FrameMaker composition software, the Logic Division needed to replace its existing tools. Replacement was needed both to compensate for the lost access to the National facility and to improve on the process. The division had produced over 10,000 pages of technical documentation in SGML using three DTDs for three different types of documents: datasheets, application notes, and miscellaneous documents. Although this mass of existing documentation had to be preserved, the batch composition model was not working well. There were two major problems: |
| In light of this experience, the replacement system needed to use SGML so it could process the existing data, but it also needed to be easy to use, easy to change, and interactive. Replacing the old batch system made users more productive and gave them instant feedback on how their documents were formatted. |
Composition System Requirements |
| In addition to reducing dependency on a batch system and preserving existing documents, the new composition system had to meet several additional requirements. These requirements are itemized in the following subsections. |
Using Existing DTDs |
| All existing SGML documents used the same three DTDs. Over the course of several years, many individuals participated in DTD design, spending countless hours ripping apart and categorizing their documents. The resulting document definitions were used to create the files that resided in the database. Not only was it not necessary to recreate this structure, it was a requirement to avoid doing so. |
| These DTDs were not designed as authoring DTDs, but were used in converting data from an earlier non-SGML system to SGML. While there were no resources to take yet another look at the DTDs, there was no requirement to support the DTDs in their entirety. In particular, while the DTDs defined over 250 elements, the Logic Division's documents used only about 60 of these. As a simple step toward refocusing the DTDs to an authoring model, the unused elements were deleted from the authoring model. When the new system displays a context-dependent list of available elements to the user, it only includes those elements actually used by the Logic Division, even if the DTD permits others. Similarly, attribute support was limited to those attributes in actual use. |
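| The narrowing of the context-dependent element list can be pictured as a set intersection: the DTD decides what is valid in a given context, and the set of elements in actual use decides what is offered. The following Python sketch is only an illustration of that idea, not the FrameMaker+SGML implementation. The element names, the `PERMITTED_BY_DTD` mapping, and the `available_elements` function are all hypothetical. |

```python
# Hypothetical data: these element names are illustrative and are not taken
# from the Fairchild DTDs.

# Elements the DTD permits as children of each parent element.
PERMITTED_BY_DTD = {
    "section": {"para", "table", "figure", "note", "warning", "sidebar"},
    "table": {"row", "caption", "colspec"},
}

# Elements actually used in the division's documents (the authoring subset).
ELEMENTS_IN_USE = {"section", "para", "table", "figure", "row", "caption"}


def available_elements(parent, permitted=PERMITTED_BY_DTD, in_use=ELEMENTS_IN_USE):
    """Return the sorted list of elements to offer an author inside `parent`.

    The DTD determines validity; the in-use set narrows what is offered.
    An unknown parent yields an empty list.
    """
    return sorted(permitted.get(parent, set()) & in_use)


section_menu = available_elements("section")
```

| Documents that happen to contain one of the hidden elements would still be valid against the DTD in this sketch; only the list offered to the author is shortened. |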
Using the Semiconductor Industry-Standard DTD |
| The DTDs are based on a particular version of the ECIX (Electronic Component Information Exchange, formerly the Pinnacles Electronic Databook Project) DTD used throughout the semiconductor industry. (Information on ECIX is available at http://www.si2.org/ecix.) |
One notable feature of this DTD is the
|
Supporting a Single Version of the DTDs |
| The DTDs evolved over time and the repository contained documents using different versions. In the old system, documents based on an out-of-date DTD were either updated manually when they were needed or converted with a Perl script. Since there was no requirement to extend the processing capability of the old system, the same techniques were sufficient for the new system. |
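| The conversion script itself is not reproduced in this paper. As a rough illustration of what a scripted update between DTD versions can look like, the Python sketch below renames elements in start and end tags. It assumes that the only differences between versions are renamed elements, which may not match the actual changes; the names in `RENAMED_ELEMENTS` and the function `update_element_names` are hypothetical. |

```python
import re

# Hypothetical mapping from element names in an older DTD version to the
# names used by the current version.
RENAMED_ELEMENTS = {"oldpara": "para", "tbl": "table"}

# Matches the element name in a start tag (<name ...>) or an end tag (</name>).
# Declarations such as <!DOCTYPE ...> do not match because "!" is not a letter.
TAG_NAME = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.-]*)")


def update_element_names(sgml_text, renames=RENAMED_ELEMENTS):
    """Rename elements in start and end tags, leaving all other text untouched.

    Names are looked up in lower case. This simple pattern does not protect
    CDATA sections or attribute values that contain "<".
    """

    def rename(match):
        slash, name = match.group(1), match.group(2)
        return "<" + slash + renames.get(name.lower(), name)

    return TAG_NAME.sub(rename, sgml_text)


updated = update_element_names('<oldpara id="p1">Text in a <tbl>cell</tbl></oldpara>')
```

| A text-level substitution like this is only adequate for simple, mechanical changes; structural differences between versions would still call for manual updates. |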
Special Character Support |
| The system had to support the special characters that are required for the semiconductor industry. The font that supported these special characters was transported effortlessly from the old system. |
Required Output Formats |
| While the focus of the new system was creation of SGML documents to store in the database for later use, other outputs were required as well. In particular, the new system had to be able to produce PDF and HTML files for the Web environment. XML versions of the documents may be needed in the future. |
Databook Generation |
| There was a requirement to generate a databook from a collection of documents of the three types. Formatting of a stand-alone document and a document within a databook is similar, but there are a few differences. Certain sections of datasheets for related products are identical and this material is printed only once in a databook, although it does appear in the SGML representation of each datasheet, and is included when a datasheet is produced in stand-alone form. Other changes include the addition of cross-references to other documents in a databook when documents are collected together and a different page number style. There was a requirement to preserve the general graphic design of databooks when the system changed. |
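| The rule that identical material is printed only once in a databook can be sketched as follows. This Python fragment is an illustration under simplifying assumptions, not the production code: it treats any section whose title and body both match an earlier section as a repeat, whereas the real system applies the rule only to certain sections. The `Section` class and the `sections_for_databook` function are hypothetical names. |

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """A datasheet section; frozen so that instances can be stored in a set."""
    title: str
    body: str


def sections_for_databook(datasheets):
    """Yield (datasheet_name, section) pairs, skipping sections already printed.

    `datasheets` is an ordered list of (name, [Section, ...]) pairs. The SGML
    source of each datasheet is left complete; only the book output omits
    repeats.
    """
    printed = set()
    for name, sections in datasheets:
        for section in sections:
            if section in printed:
                continue  # identical material already appears earlier in the book
            printed.add(section)
            yield name, section


# Two hypothetical datasheets that share one identical section.
shared = Section("Absolute Maximum Ratings", "same text in both datasheets")
sheets = [
    ("part-a", [Section("Features", "features of A"), shared]),
    ("part-b", [Section("Features", "features of B"), shared]),
]
book_sections = [(name, s.title) for name, s in sections_for_databook(sheets)]
```

| Producing a stand-alone datasheet would bypass this filter, so the shared section would appear in full there. |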
In-House Format Design |
| A crucial requirement for the new system was to reduce dependence on outside vendors. One aspect of this requirement was the ability to control the formatting of the documents internally. Someone on the documentation staff at Fairchild needed to understand how the system worked well enough to change the “look and feel” of the documents. |
Ability to Expand |
| The new system had to be expandable. Although there was an immediate need to preserve the capabilities being lost with the old system, further ways to automate document production had to be possible after initial implementation. One important possibility is to allow engineers to input data directly into the system, but still have the formatting done automatically and consistently. |
Integrating Composition and Document Management |
| The process of editing documents needed to be simplified by tying the document management system to the composition system. Software support for this integration is important. |
Responsiveness to Fairchild Requirements |
| The new system had to be an application customized for Fairchild's uses. At the same time, it had to be created to a predetermined schedule and budget. Fairchild wanted to work with a developer who knew the underlying software thoroughly, maintained contacts throughout the field (including the organization that produces the software being used), and had a reputation for doing extremely high-quality work. |
Composition System Development |
| Based on the criteria outlined above, the Logic Division selected Adobe® FrameMaker®+SGML for its composition system. It was the only product on the market that allowed for a fast-lane approach to implementation and had the ability to display the formatted document on the fly. Text Structure Consulting, Inc. was chosen as the system integrator to develop the application. |
| The development of the composition system was on an extremely tight schedule. The existing production environment tools were on loan from National for a limited time, and the date of termination of services approached rapidly. A production-ready prototype, sufficient to keep the Logic Division in the business of producing datasheets, was needed a mere four months after the detailed project scope was defined. |
WYSIWYG, Almost |
| Although FrameMaker+SGML is a WYSIWYG document preparation tool, this application is based on authors seeing a slightly different form of the document than is published. There are three types of differences: |
| The solution was to define three forms of the document instead of just two: |
| This approach allows users to see how individual paragraphs, tables, and graphics are formatted as they work. Final page layout cannot be seen until the print form is created, but since it only takes a short time to prepare the print form (15 seconds for a typical document on a medium-capability Microsoft Windows system), the real impact to the users of the two-form approach is minimal. |
Processing Databooks |
| The user lists the documents that appear in a databook with a fourth type of FrameMaker+SGML structured document (a structured document is essentially the WYSIWYG form of an SGML document) called a book directory. The book directory uses a very simple DTD. It allows users to group the listed documents into sections and specify titles for the sections. Users can also specify front matter and indicate where to place the table of contents and index. They enter documents into the book directory by choosing them from a file browser. |
| When editing of the book directory is complete, the user issues a command to process it. This command instructs FrameMaker+SGML to create a FrameMaker+SGML book which contains the book forms of the individual documents and the populated table of contents and index. A PostScript or PDF file is generated from the book and sent to the printer for production. |
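| The book directory and the command that processes it can be pictured with a small data model. The Python sketch below is a simplified illustration, not the FDK client: the class and field names are hypothetical, the file names are invented, and the placement of the table of contents and index is reduced to two flags even though the real book directory lets users indicate where they go. |

```python
from dataclasses import dataclass, field


@dataclass
class BookSection:
    """A titled group of documents listed in the book directory."""
    title: str
    documents: list  # file paths chosen by the user from a file browser


@dataclass
class BookDirectory:
    front_matter: list = field(default_factory=list)
    sections: list = field(default_factory=list)
    toc_after_front_matter: bool = True  # simplification of user-chosen placement
    index_at_end: bool = True            # simplification of user-chosen placement


def book_components(directory):
    """Return the ordered (kind, value) components of the generated book."""
    components = [("front", path) for path in directory.front_matter]
    if directory.toc_after_front_matter:
        components.append(("generated", "table of contents"))
    for section in directory.sections:
        components.append(("section-title", section.title))
        components.extend(("document", path) for path in section.documents)
    if directory.index_at_end:
        components.append(("generated", "index"))
    return components


# Hypothetical book directory with one section of two datasheets.
directory = BookDirectory(
    front_matter=["title-page.sgm"],
    sections=[BookSection("Gates", ["part-a.sgm", "part-b.sgm"])],
)
components = book_components(directory)
```

| In this picture, each `document` entry would be replaced by the book form of that document, and the generated entries would be populated once all documents are in place. |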
Oh, What a Tangled Web We Wove |
| One lesson that quickly became clear was that some of the SGML markup in the existing data had been motivated by the resulting formatting rather than the element structure it specified. Some of these “hacks” did not translate well into the new system and had to be undone so that the incorrect coding was not propagated. Some of these corrections were automated, but time and budget constraints forced others to be manually corrected as individual documents were moved into the new environment. The time spent in this editing was the price documentation staff belatedly paid for shortcuts that had been taken to fool the formatting engine of the old system. |
Development Tasks |
| Developing the FrameMaker+SGML application included the following major tasks: |
Moving Data Between SGML and FrameMaker+SGML |
| In general, a document's element structure is identical in its SGML and FrameMaker+SGML forms. There are a few exceptions, however. |
FrameMaker+SGML Plug-Ins |
| FrameMaker+SGML is deliberately designed to be extensible through the Frame Developer's Kit (FDK). About half the development work involved implementing FDK clients. These clients allowed the Logic Division to work with its existing data without manual modification, to correct some errors in that data automatically, and to provide users with a simple user interface to new functionality. Some of the features implemented with FDK clients are described below. |
Problems and Work-arounds |
| As is to be expected in a project of this type, some unexpected problems arose during implementation and ways to avoid them had to be found. The problems included: |
Future Enhancements |
| The new tools were implemented and installed within initial budget estimates in time to prevent any down time between use of the old system and its replacement. Documents are being produced and published successfully. Now that the system is in place, the Logic Division is considering enhancements. Possible additional functionality includes: |
| Minor improvements to the document formatting and to some of the implementation strategies are also being planned. One technique in the FrameMaker+SGML application was to maintain all page layouts manually in order to minimize the amount of FDK code that needed to be written. Since the different page layouts are similar and manual maintenance is quite subject to human error, a better strategy would be to automatically combine layout components, but to maintain the individual pieces manually. |
A Final Word |
| To conclude, while the bulk of this paper has dealt with technical issues, some of the political and financial lessons learned throughout this project should be mentioned: |