
 
 

Global CSCW (Computer Supported Cooperative Work) with SGML


Ola Strandberg
PharmaSoft AB
P.O. Box 1237
S-751 42 Uppsala, Sweden
Phone: +46 18 185459
Fax: +46 18 109200
Web: http://www.pharmasoft.com
Email: Ola.Strandberg@pharmasoft.com
 
Biographical notice:
 
Ola Strandberg
 
Ola Strandberg has a background in mechanical engineering, but as the mechanics required computer control, he enrolled in a Master's program in Computer Science at Uppsala University, Sweden.
 
He has been employed by PharmaSoft for two years and is the father of their workflow system Flow Watch. Ola now works in Core Technologies, evaluating and bringing new technologies into the company.
 

PharmaSoft is a global company focused on improving the overall productivity of the pharmaceutical marketplace by developing information systems built on an in-depth understanding of pharmacology and a high degree of technical competence.
 
ABSTRACT:
 

CSCW has taken many shapes over the years, lately focused on Intranet technology and workflow technology. Most workflow and Intranet solutions are proprietary, building on relational databases and/or electronic mail for dissemination of information.
 
As far as standards are concerned, the WfMC (Workflow Management Coalition) have been working on a set of standard interfaces and APIs (Application Programming Interfaces) that enable WfMC-compliant workflow engines to cooperate and transfer control from one workflow system to another. By using SGML (Standard Generalized Markup Language) to describe workflow-related information, one can capitalize on the existing experience in converting SGML to other document formats, and "convert" workflow information to accommodate different workflow engines.
 

Traditional workflow technology defines a set of actions that are to be performed on some information by some business role, and possibly "embeds" the tools that are required (such as forms-based applications). It either runs or launches all applications within the workflow client software (or an Internet browser for certain kinds of information), or it merely sends instructions to the user and identifies the type of information with a MIME (Multipurpose Internet Mail Extensions) header. In both cases, the actions that are to be performed on the information cannot be reused outside the workflow system.
 

This paper suggests a strategy for marking up information in SGML, packaging related information, and setting up profiles that state which actions can be performed on a specific piece of information, which tools to use, etc. The information logic can then be separated from the process logic and be reused in other contexts, including SOPs (Standard Operating Procedures), instructions, etc.
 
Three points will be made in this paper:
  1. SGML can act as the intermediate format for transferring control between workflow engines.
  2. Pieces of information that are to "flow" in the workflow can be marked up with SGML to capture the actions that are to be performed and any information related to performing them, in a platform- and location-independent manner. The "information logic" defined this way can be reused outside the workflow system.
  3. By using a (sort of) microdocument architecture even for traditionally relational data, an SGML/XML browser capable of data binding and stylesheet utilization, in combination with a business process modeling tool for this architecture, can replace many of today's custom applications.
 
 
 

Introduction

 

The drug industry faces problems that are shared with many other industries: long development times, complex processes and costly delays. The typical development time for a new pharmaceutical is 10-15 years.
 

During the development of a new drug, extensive cooperation is required between specialists in biology and medicine, as well as with authorities. The fields involved include biochemistry, toxicology, pathology, physiology, pharmacology, and clinical documentation. Furthermore, comprehensive documentation continues even after the drug has been introduced on the market, to take advantage of new findings.
 

The typical cost for the development of a new drug exceeds 250 MUSD, and the cost of a delayed market introduction is in the neighborhood of 1 MUSD per day. It is in the interest of all parties, industries and authorities, to reduce the development time.
 
A buzzword in the '90s but certainly not a new concept, BPR (Business Process Reengineering) may aid the drug industry in doing so. Workflow technology has gone hand in hand with BPR over the last few years and is the focus of this paper.
 
 

What is Workflow Technology?

 

Gartner Group defines a workflow as "the sequential collection of all activities used to produce a desired business result". Using this definition, it is clear that workflow is nothing new. Businesses and authorities have always had workflow. What is new is that, thanks to computer technology, traditional workflow paths can be managed in entirely new ways.

 
Figure: Workflow Routing

 
 

A workflow system models rules, routes, and roles. The rules state which conditions must be fulfilled for a task to start, and so on. The routes are the paths that the information flows along (see the figure "Workflow Routing"). Finally, the roles define what qualifications a person needs to complete a task. Explicitly modeling these three concepts is an important part of BPR.
 

Essentially, workflow software supports the sequencing of actions and tasks, allocating sub-tasks to specific persons through role requirements. In addition, workflow software monitors and controls the executing business process with respect to the status of individual sub-tasks or of the whole process. The software also links specific tasks to the various tools needed to support them. Workflow software hence plays a crucial role in information logistics, bringing the right information to the right persons at the right time.
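
The three concepts can be illustrated with a minimal sketch. It is not taken from any workflow product; the class name Task, the helper functions and the sample tasks and persons are all assumptions made for illustration. Each task carries a rule (a start condition), a role (the qualification required) and a route (the tasks that follow).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Task:
    """One step in a hypothetical workflow."""
    name: str
    role: str                                   # role: qualification required
    rule: Callable[[Dict[str, str]], bool]      # rule: condition for starting
    next_tasks: List[str] = field(default_factory=list)  # route: what follows


def ready_tasks(tasks: List[Task], state: Dict[str, str]) -> List[str]:
    """Return the names of the tasks whose start rule holds for this state."""
    return [task.name for task in tasks if task.rule(state)]


def candidates(task: Task, people: Dict[str, Set[str]]) -> List[str]:
    """Return, in sorted order, the persons who hold the required role."""
    return sorted(p for p, roles in people.items() if task.role in roles)


# Hypothetical sample data: a two-step report workflow.
TASKS = [
    Task("Write report", "Author",
         lambda state: state.get("report") is None, ["Review report"]),
    Task("Review report", "Reviewer",
         lambda state: state.get("report") == "draft"),
]
PEOPLE = {"anna": {"Author"}, "bo": {"Reviewer", "Author"}}
```

With this data, only "Write report" is ready while no report exists, and both anna and bo qualify for it; once the report is a draft, "Review report" becomes ready and only bo qualifies.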
 
 

Workflow vs. Document Management

 

Workflow technology is often confused with imaging and document management. Modern workflow technology has its roots in imaging, and imaging is still a large market for workflow software. For workflow technology to be successful, it must be able to control all facets of the workplace, paper based processes included. Today, few workflow system suppliers mention imaging - the images are simply another type of document.

 
Figure: Workflow vs. Document Management

 
 
 

What is Wrong with Today's Workflow Systems?

 

Many workflow systems today are bundled with large software systems like document management or imaging software. These systems focus on routing the information they handle to different users in sequence, with different degrees of sophistication. The means of communication is often e-mail or possibly the system's own client software. These systems are difficult to integrate with other data, such as relational database information. In many cases the information is just sent with a note attached regarding the particular task that is to be performed.
 

At the low end, workflow systems are nothing more than routing-slip builders for e-mail systems. Others, at the high end, are shipped with their own programming language, forms builder, etc. These high-end systems are most often used to build administrative and production workflows that do not change often. Claims processing and insurance underwriting are typical examples in this category. Here, workflow applications are built with "development tools", and "compiled" to contain all data that is necessary for the workflow.
 

By analogy with the latest Microsoft-takes-on-the-world fight, Universal Database vs. Universal Data Access, most workflow systems, high- and low-end, take the Universal Database approach if anything, keeping all needed information within the workflow system.
 

What if we wanted to reuse what we have designed in a "looser" environment? CSCW is a broader field than workflow management and may include processes without structure, or no "process" at all, just collaboration to achieve a certain goal. With a workflow system like the ones described above, this would be difficult. If a workflow is not built, the workflow system cannot be used, regardless of what level of "code" reuse the system allows. The overhead of building the actual workflow often leaves the software on the shelf. This is discussed further later in this paper.
 
 

Workflow Standardization

 

Standardization work is underway for workflow systems. Most notably, the WfMC have defined a reference model for workflow systems and have also agreed on an API that, when supported, allows workflow systems to cooperate. Together with a number of workflow vendors, the WfMC have also defined MAPI-WF (Messaging API WorkFlow Extensions) and a MIME binding to allow workflow systems to communicate asynchronously.
 
 

The Workflow Management Coalition

 

The WfMC was founded in 1993 as a non-profit, international organization of workflow vendors, users and analysts.
 
The WfMC's Reference Model defines a central component, the Workflow Enactment Service, and 5 interfaces to it:
The Workflow Enactment Service
 

This provides the run-time environment in which the business processes execute. It may involve one or more workflow engines.
Interface 1: Process Definition Tools
 
This interface ensures that a process designed with an external modeling tool can be imported by the Workflow Enactment Service. The definition of this interface is not complete, but it should provide semantics rich enough to support several modeling paradigms.
Interface 2: Workflow Client Applications
 
This is the interface that presents the work items a worker is responsible for. This tool may also invoke tools to present tasks and related data with corresponding deadlines, status, etc.
Interface 3: Invoked Applications
 
This is an interface to any external tools, such as word processors, spreadsheets, etc.
Interface 4: Other Workflow Enactment Services
 

Through this interface, interoperability within and across workgroups and organizations should be supported. This also means interoperability between workflow engines from various vendors.
Interface 5: Administration and Monitoring Tools
 
This is an interface for status monitoring tools to provide a complete status of ongoing business processes and the gathering of historical performance statistics that may be useful for process improvement, etc.
 
 

Point 1 - Using SGML as the Intermediary Format

 

Interface 3, Invoked Applications, is the least finished, and is the focus of the next section. As for the other interfaces, SGML can be used to model just about anything, even workflow information, so all interfaces mentioned above can be modeled. As in many other cases, SGML has its definite merits, since experience in "converting" or rendering SGML information is extensive and a lot of tools exist.
 

Projects are underway to define a standard modeling language that captures all the information workflow engines need to exchange with each other. PIF (Process Interchange Format) is one such language, and it could easily be converted into SGML.
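
The idea of a markup-based intermediate format can be sketched as a round trip: one engine exports its routing information as XML, and another imports it. This is only an illustration under stated assumptions. The element names router, task and nexttasks are borrowed from the router sample later in this paper, but the nested task element with an idref attribute is hypothetical, and the code does not follow PIF or any WfMC interface definition.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# (task id, task name, ids of the tasks that follow)
TaskInfo = Tuple[str, str, List[str]]


def export_router(name: str, tasks: List[TaskInfo]) -> str:
    """Serialize routing information from one engine as XML text."""
    router = ET.Element("router", {"name": name})
    for task_id, task_name, next_ids in tasks:
        task = ET.SubElement(router, "task", {"id": task_id, "name": task_name})
        nexttasks = ET.SubElement(task, "nexttasks")
        for next_id in next_ids:
            # Hypothetical: successors are referenced by an idref attribute.
            ET.SubElement(nexttasks, "task", {"idref": next_id})
    return ET.tostring(router, encoding="unicode")


def import_router(xml_text: str) -> Tuple[str, List[TaskInfo]]:
    """Rebuild the routing information in the receiving engine."""
    router = ET.fromstring(xml_text)
    tasks = []
    for task in router.findall("task"):      # direct children only
        next_ids = [t.get("idref") for t in task.findall("nexttasks/task")]
        tasks.append((task.get("id"), task.get("name"), next_ids))
    return router.get("name"), tasks
```

A receiving engine with a different internal model would replace import_router with its own conversion, which is where the existing experience in converting SGML to other formats comes in.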
 

A simple workflow is defined in XML later in this paper.
 
 

Reusing Information Logic - Microworkflows

 
What if we could encapsulate the tools, the information sources and the actions needed to perform the most common operations on a specific piece of information? We could move towards "intelligent" documents, in the sense that they know where to store themselves and how they should be edited, printed and approved. These operations are processes in themselves: microworkflows.
 
An example of a microworkflow for editing a document could be:
  1. Check out document from document management system
  2. Launch editor
  3. Load document
  4. Enable/disable menu items in the editor as appropriate
 
 
When the performer of the task is done editing the document, the following microworkflow is executed:
  1. Save the document
  2. Close the document
  3. Check the document back in to the document management system
 
 

This microworkflow could be called upon from any context, from the workflow system or otherwise. How to actually perform these operations would be platform and location dependent, so that on a PC running Windows, the actions would be ActiveX objects, on a UNIX system it would be a csh script, etc. The strategy is easily extended to handle any type of information, from a physical document to a record in a relational database.
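
A minimal sketch of such a platform-dependent lookup follows. The registry, the platform keys and the step descriptions are assumptions made for illustration; the steps only record what they would do instead of calling real ActiveX objects or csh scripts.

```python
from typing import Callable, Dict, List, Tuple

Step = Callable[[str, List[str]], None]


def step(description: str) -> Step:
    """Build a placeholder step that only records what it would do."""
    def run(document: str, log: List[str]) -> None:
        log.append(f"{description}: {document}")
    return run


# (operation, platform) -> ordered steps. All entries are hypothetical.
MICROWORKFLOWS: Dict[Tuple[str, str], List[Step]] = {
    ("edit", "win32"): [
        step("check out via ActiveX object"),
        step("launch editor"),
        step("load document"),
        step("enable/disable menu items"),
    ],
    ("edit", "unix"): [
        step("check out via csh script"),
        step("launch editor"),
        step("load document"),
        step("enable/disable menu items"),
    ],
}


def run_microworkflow(operation: str, platform: str, document: str) -> List[str]:
    """Run the microworkflow registered for this operation and platform."""
    try:
        steps = MICROWORKFLOWS[(operation, platform)]
    except KeyError:
        raise LookupError(
            f"no microworkflow for {operation!r} on {platform!r}") from None
    log: List[str] = []
    for run in steps:
        run(document, log)
    return log
```

The caller, whether a workflow system or something else, only names the operation; which concrete actions run is decided by the platform entry in the registry.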
 
 

Point 2 - Marking Up Information with Information Logic

 
Consider the following document:
 
<package id="a1" name="Demo package DTD">
  <document id="a2" name="Assessment report">
    <attribute name="TemplateName">Assessment Report.dot</attribute>
    <profile name="create">
      <action name="Word processor new" op="open">
        <precondition type="attribute">path=''</precondition>
        <execute platform="win32">WrdAct32.FileNew</execute>
        <postcondition type="attribute">path!=''</postcondition>
      </action>
      <action name="Check in to document management system" op="signoff">
        <precondition type="attribute">path!=''</precondition>
        <execute platform="win32">DDMAct32.CheckinNew</execute>
      </action>
      <sop><a href="sops/createdoc.html">Standard Operating Procedure for creating a document</a>
      </sop>
    </profile>
  </document>
</package>
 

It defines a package that contains one document (but can contain other related documents). Each document has attributes and one or more profiles. A profile describes a microworkflow for executing a certain operation on the document. Note that the document does not have to be a physical document; it may just as well identify a record in a database or something else. In the example above, a Word document is created with the action WrdAct32.FileNew using the template Assessment Report.dot. When the author is done editing and the task is signed off, the document is checked in to the document management system with the action DDMAct32.CheckinNew. An SOP is linked to the profile in the form of an HTML (HyperText Markup Language) URL (Uniform Resource Locator).
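
How a client might interpret such a package can be sketched as follows. This is an assumption-laden illustration, not the Flow Watch implementation: the function names are hypothetical, the condition syntax is inferred from the two forms that appear in the sample (name='value' and name!='value'), and an attribute that has not been set is treated as the empty string.

```python
import re
import xml.etree.ElementTree as ET
from typing import Dict, List

# Shortened copy of the package above (the sop element is left out).
PACKAGE = """\
<package id="a1" name="Demo package DTD">
  <document id="a2" name="Assessment report">
    <attribute name="TemplateName">Assessment Report.dot</attribute>
    <profile name="create">
      <action name="Word processor new" op="open">
        <precondition type="attribute">path=''</precondition>
        <execute platform="win32">WrdAct32.FileNew</execute>
        <postcondition type="attribute">path!=''</postcondition>
      </action>
      <action name="Check in to document management system" op="signoff">
        <precondition type="attribute">path!=''</precondition>
        <execute platform="win32">DDMAct32.CheckinNew</execute>
      </action>
    </profile>
  </document>
</package>
"""

# Assumed condition syntax: name='value' or name!='value'.
CONDITION = re.compile(r"^\s*(\w+)\s*(!=|=)\s*'([^']*)'\s*$")


def holds(condition: str, attributes: Dict[str, str]) -> bool:
    """Evaluate one condition; unset attributes count as empty strings."""
    match = CONDITION.match(condition)
    if match is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    name, operator, expected = match.groups()
    actual = attributes.get(name, "")
    return actual == expected if operator == "=" else actual != expected


def runnable_actions(package_xml: str, profile: str, op: str,
                     platform: str, attributes: Dict[str, str]) -> List[str]:
    """List the commands whose preconditions hold for the given platform."""
    root = ET.fromstring(package_xml)
    commands: List[str] = []
    path = f".//profile[@name='{profile}']/action[@op='{op}']"
    for action in root.findall(path):
        conditions = [p.text or "" for p in action.findall("precondition")]
        if all(holds(c, attributes) for c in conditions):
            commands += [e.text for e in
                         action.findall(f"execute[@platform='{platform}']")]
    return commands
```

Opening a document that has no path yet selects WrdAct32.FileNew; signing off once a path has been assigned selects DDMAct32.CheckinNew. A platform without a matching execute element simply yields no commands.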
 
The beginning of the (simplified) DTD (Document Type Definition) looks as follows:

 
Figure: Flow Watch Package DTD

 
 
 

Taking it Further

 
 

Point 3 - Getting Rid of Applications

 
Something has happened very recently: browsers are now considered the environment for most computing needs. Microsoft have implemented DHTML (Dynamic HTML) in the current version of their browser, which allows more client-side processing of data, and have also introduced a new way of data binding, DSOs (Data Source Objects).
 
Data binding combined with XSL (eXtensible Stylesheet Language) stylesheets can be used to build dynamic applications that communicate with XML (eXtensible Markup Language) or SGML. An application is built by using a DTD and creating an empty instance of a "document" (or generating XML on the fly from a relational database), together with stylesheets for each use case that the "document" may be involved in: a "form" stylesheet to fill in data, different "forms" to edit the data, and finally a number of stylesheets for viewing the information in different circumstances.
 

The DTD and the stylesheets would together define the whole "application".
 

If we take another look at a package like the one defined above, but with the stylesheet approach, it could look something like this:
 
<package id="p1" name="E2B Adverse Drug Reaction Report">
  <document id="d1" name="ADR Report">
    <DTD>"-//PS//DTD ADR.DTD 19980101 Vers 1.0//EN"</DTD>
    <profile name="create">
      <stylesheet>EnterADR.xsl</stylesheet>
      <sop><a href="sops/E2BGuidelines.html">Guidelines for E2B ADR Reports</a>
      </sop>
    </profile>
    <profile name="review">
      <stylesheet>ReviewADR.xsl</stylesheet>
      <sop><a href="sops/E2BGuidelines.html">Guidelines for E2B ADR Reports</a>
      </sop>
    </profile>
  </document>
</package>
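
A client could resolve the stylesheet for a requested operation with a lookup like the one sketched below. The function name stylesheet_for is hypothetical, and the embedded package is a well-formed, shortened copy of the sample (without the sop elements).

```python
import xml.etree.ElementTree as ET

# Well-formed, shortened copy of the package above.
PACKAGE = """\
<package id="p1" name="E2B Adverse Drug Reaction Report">
  <document id="d1" name="ADR Report">
    <DTD>"-//PS//DTD ADR.DTD 19980101 Vers 1.0//EN"</DTD>
    <profile name="create">
      <stylesheet>EnterADR.xsl</stylesheet>
    </profile>
    <profile name="review">
      <stylesheet>ReviewADR.xsl</stylesheet>
    </profile>
  </document>
</package>
"""


def stylesheet_for(package_xml: str, document_id: str, profile: str) -> str:
    """Return the stylesheet that implements a profile for a document."""
    root = ET.fromstring(package_xml)
    node = root.find(
        f".//document[@id='{document_id}']"
        f"/profile[@name='{profile}']/stylesheet")
    if node is None or not node.text:
        raise LookupError(
            f"no stylesheet for profile {profile!r} on {document_id!r}")
    return node.text.strip()
```

Asking for the "create" profile of document d1 yields EnterADR.xsl and asking for "review" yields ReviewADR.xsl; a profile that the package does not define raises LookupError.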
 

A simplified excerpt from the EnterADR stylesheet (which would not quite work) follows:
 
<!-- Rule for "REACTION" element -->
<rule>
  <target-element type="REACTION"/>
  <TABLE border="1px solid black"
            datasrc="#ADRDB" datapagesize="12"
            onreadystatechange="set_page_button_state()">
  <TBODY>
  <TR>
    <TD><INPUT TYPE="select" DATAFLD="ReactionTerm"></INPUT></TD>
    <TD><INPUT TYPE="text" DATAFLD="StartDate"></INPUT></TD>
    <TD><INPUT TYPE="select" DATAFLD="Outcome"></INPUT></TD>
  </TR>
  </TBODY>
  </TABLE>
</rule>
 

The ReviewADR stylesheet on the other hand would look like:
 
<!-- Rule for "REACTION" element -->
<rule>
  <target-element type="REACTION"/>
  <TABLE border="1px solid black"
            datasrc="#ADRDB" datapagesize="12"
            onreadystatechange="set_page_button_state()">
  <TBODY>
  <TR>
    <TD><SPAN DATAFLD="ReactionTerm"></SPAN></TD>
    <TD><SPAN DATAFLD="StartDate"></SPAN></TD>
    <TD><SPAN DATAFLD="Outcome"></SPAN></TD>
  </TR>
  </TBODY>
  </TABLE>
</rule>
 

The difference between the two stylesheets is subtle, but the careful reader will notice that the table cells are defined as INPUT fields in the EnterADR stylesheet and defined as SPANs in the ReviewADR stylesheet.
 
All that is missing to build a workflow or collaborate is the actual assignment of these different operations to users/roles and setting deadlines.
 

An example of a simple workflow (a router) that connects with the previous sample could be described as:
 
<router id="r1" name="New Adverse Drug Reaction">
  <references>
    <package id="p1"/>
    <role id="r1" name="Case Receiver"/>
    <role id="r2" name="Data Manager"/>
  </references>
  <task id="t1" name="Enter New Case">
    <performer authority="performer" type="role" name="Case Receiver">
      <profile name="create"/>
    </performer>
    <performer authority="manager" type="role" name="Data Manager">
      <profile name="create"/>
      <profile name="review"/>
    </performer>
    <document id="d1"/>
    <instructions op="open">
      Fill in the form...
    </instructions>
    <event type="deadline" value="1998-05-26 11:00(GMT+01:00)"/>
    <nexttasks>
        :
    </nexttasks>
  </task>
        :
</router>
 

This router defines a global section of references to be resolved at the receiving system. Among these is the package discussed above. It then goes on to define the tasks of the router, which people are authorized to perform each task, which information to work with, etc. The information to work with is the interesting part. The package includes all information related to the router, and the document(s) in each task element define which information to work with. The profile elements in each performer element define which operations may be performed on the information. The associated stylesheet/microworkflow defines what will actually happen.
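
The authorization check described here can be sketched as follows. The sketch assumes that the profile elements are nested inside their performer element, as the text describes, and it uses a shortened copy of the router; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# Shortened router with profile elements nested inside performer elements.
ROUTER = """\
<router id="r1" name="New Adverse Drug Reaction">
  <task id="t1" name="Enter New Case">
    <performer authority="performer" type="role" name="Case Receiver">
      <profile name="create"/>
    </performer>
    <performer authority="manager" type="role" name="Data Manager">
      <profile name="create"/>
      <profile name="review"/>
    </performer>
    <document id="d1"/>
  </task>
</router>
"""


def allowed_profiles(router_xml: str, task_id: str, role: str) -> List[str]:
    """List the profiles (operations) a role may use within a task."""
    root = ET.fromstring(router_xml)
    task = root.find(f"task[@id='{task_id}']")
    if task is None:
        raise LookupError(f"unknown task: {task_id!r}")
    names: List[str] = []
    for performer in task.findall("performer"):
        if performer.get("type") == "role" and performer.get("name") == role:
            names += [p.get("name") for p in performer.findall("profile")]
    return names


def may_perform(router_xml: str, task_id: str, role: str, profile: str) -> bool:
    """Check whether a role may perform the given operation in a task."""
    return profile in allowed_profiles(router_xml, task_id, role)
```

Once the check succeeds, the client would resolve the profile against the package to find the stylesheet or microworkflow that actually carries out the operation.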
 
 

Conclusion

 
SGML/XML can be used to model most kinds of information. It has its definite merits in providing the "base" information type.
 
By marking up information with "information logic", reusable microworkflows can be created, thus creating intelligent documents. Collaboration in a setting like this may be as simple as asking user A to perform operation B on information C.
 

In this day and age of application deployment on the Internet/intranet, SGML/XML could very well replace custom applications. C implements the operation B, and A merely points his browser to the address of C, specifying operation B as a parameter. The beauty of it is that applications developed this way will be compile-free, and determining customizations will be as easy as running a "diff" on the original directory and the installation directory.
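
The "diff" idea can be sketched with Python's standard filecmp module. The function name customizations and the labels "added", "removed" and "changed" are assumptions made for illustration; filecmp.dircmp compares files shallowly by default, which is sufficient for a sketch.

```python
import filecmp
import os
from typing import List, Tuple


def customizations(original: str, installed: str,
                   prefix: str = "") -> List[Tuple[str, str]]:
    """List files added, removed or changed in an installation directory."""
    comparison = filecmp.dircmp(original, installed)
    changes = [("added", prefix + name) for name in comparison.right_only]
    changes += [("removed", prefix + name) for name in comparison.left_only]
    changes += [("changed", prefix + name) for name in comparison.diff_files]
    # Recurse into the directories that exist on both sides.
    for name in comparison.common_dirs:
        changes += customizations(os.path.join(original, name),
                                  os.path.join(installed, name),
                                  prefix + name + "/")
    return sorted(changes)
```

Because the whole "application" consists of DTDs, stylesheets and packages stored as plain files, the returned list is in effect a complete description of a site's customizations.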
 
Acknowledgments
    Thanks to Christian Wallgren, PharmaSoft AB, for SGML coaching, and Per Manell, PharmaSoft Inc., for encouragement.
 
Bibliography
XSL
http://www.microsoft.com/xml/
Dynamic HTML
http://www.microsoft.com/workshop/author/dhtml/
Internet Explorer Data Binding
http://www.microsoft.com/gallery/files/datasrc/default.htm
Workflow Management Coalition
http://www.aiim.org/wfmc/
The PIF Working Group
http://soa.cba.hawaii.edu/pif/
