Interoperability of DOE Visualization Centers:
Technical Issues and Approaches

Nancy Johnston, Editor

Introduction and Purpose

This report is the product of the "Workshop on Interoperability of DOE Visualization Centers," held from March 30 to April 1, 1998, in Berkeley, California, and sponsored by the DOE Mathematical, Information, and Computational Sciences Office (MICS). The purpose of the workshop was to address technical issues relevant to the interchange and sharing of visualization research and software. The goal of the workshop was to produce this report, which identifies feasible solutions to the challenges posed by using tools developed in unfamiliar and sometimes incompatible environments.

Visualization experts from the national laboratories, other federally funded institutions, and academia were invited to and attended this workshop. They are referred to as "the attendees" throughout this document. Appendix A lists the attendees.

WHAT IS INTEROPERABILITY, AND WHY DO WE WANT IT?

Interoperability in the broadest sense is a continuum of capability that ranges from the informal discussions of ideas and algorithms to the sharing of software modules. We agreed on the following definition of interoperability in a stricter sense:

Interoperability is the idea that different developers, working almost entirely independently, can contribute software components to a common, quality-assured collection (e.g., a repository) AND that components can be easily obtained from this collection and easily combined into larger assemblies using a variety of interconnection mechanisms.

This definition is derived from a number of factors that influence the successful sharing of software between developers. For example, the idea that developers can work "almost entirely independently" is crucial. The more human interaction is needed to share and exchange software, the less likely sharing is to happen across a community of developers as large as DOE's. The idea of a "quality-assured collection (e.g., a repository)" is also critical. What good is it to a developer if the software s/he obtains through interoperating with other developers does not work or performs poorly? The idea that "components can be easily obtained from this collection" is another very important factor. In order for a developer to use someone else's software, s/he must first know that it exists; it is just as important to be able to easily locate and obtain the actual software. There are other issues as well. One is the granularity of the software components that are shared: are they code fragments, libraries, or whole executable programs? Another is the ease with which a component can be integrated into new software settings.

None of the attendees believed that we have a framework in place today to achieve the kind of interoperability defined above. More research and development is needed. What we can do now is start taking the steps needed to reach this goal. For example, at a basic level, we need the ability to share data. Users of today's visualization systems usually have to convert their data to a different format to try out a new or different visualization package. Often, this conversion is a painstaking process.

Why do we need or want interoperability? Some of the benefits of interoperability are easy to enumerate. If we can share code or code fragments, development is faster, duplication is avoided, and the cost of development is reduced. New functionality is easier to add to codes, and users will have a wider choice of capabilities. Researchers will be better able to ask "what if" questions. As more people use various pieces of visualization software, there is a better chance of catching and fixing bugs and of generalizing the software for different computing environments.

Users of visualization software often have a different view of interoperability than software developers. A user may be more interested in ease of use, consistency of the user interface, and user interface terminology that reflects the terminology used in his or her scientific field. The developer may be more interested in portability issues, flexibility, and access to the source code. Both user and developer are interested in robustness, performance, functionality, and portability.

COSTS / BARRIERS / RISKS OF INTEROPERABILITY

The attendees identified three categories of barriers to interoperability: cost, technical, and psychological.

A major concern of the attendees was that interoperability is commonly perceived as being either free or easy to add to prototype research software. Interoperability is not free, and additional work is needed to make any software production-quality. For example, industry experts in software engineering estimate that the cost of developing production-quality software is easily 5-10 times that of the original research prototype. Visualization research activities are not well funded, and research will suffer if interoperability requirements are added without additional funding.

Some of the technical barriers the attendees identified include:

Psychological barriers include a lack of motivation or a "why bother" attitude. For example, a developer might ask, "What will I get in return?" Frequently, developers are pushed by the day-to-day, short-term needs of their scientific programs, and producing a software module that can be used by others is not highly regarded by their management.

Another barrier is each laboratory's desire to hold onto its intellectual property rights. If we exchange software and changes are made to it at another site, who owns the result? The consensus of the workshop was that DOE headquarters needs to deal with this issue.

WHERE ARE WE TODAY?

The attendees agreed that today most interaction is done on an informal basis at conferences or meetings. Members of the visualization community can meet and exchange information at three regularly scheduled conferences: the ACM Siggraph Conference, the IEEE Visualization Conference, and the DOE Computer Graphics Forum (DOECGF). The first two are conferences where one goes to learn about the latest research in the field of computer graphics and visualization.

The DOECGF is an informal meeting of representatives from DOE national laboratories and other invited federally funded organizations. DOECGF has met annually since 1975. Originally founded to discuss topics in computer graphics, the DOECGF has continually updated the topics of interest, and now deals predominantly with the many facets of scientific visualization.

All of these forums provide an important place for exchange of information, ideas, and algorithms. But if your algorithm is still in the developmental stages and you don't happen to attend one of these meetings or talk with the right person, you may be reinventing the wheel.

Today, there is no formal mechanism in place for the exchange of software except for joint projects (e.g., ASCI). The attendees' consensus is that without a formal mechanism, interoperability will be difficult or impossible.

Most instances of interoperability that have been achieved to date have evolved out of interpersonal relationships of the people working on the projects. Only a few minutes of discussion at the workshop generated the following list of examples of this kind of interoperation between DOE sites:

HOW DID WE APPROACH THIS WORKSHOP?

The workshop attendees addressed three major topic areas. The first group, "Major Challenges," discussed the fundamental issues in research that are a barrier to, facilitate, or require interoperability. The second group, "Software Interoperability," considered the engineering of software and data interoperability. The third group, "Communications," dealt with the non-technical barriers to interoperability and how we can break them down.

Major Challenges in Research

One of the questions that was asked during this workshop was: Are there fundamental issues in research that are a barrier to, or facilitate, or require interoperability? The working group that addressed this question first identified a set of specific research challenges, then considered them in the context of the above question.

RESEARCH CHALLENGES

There are a number of research challenges facing the DOE visualization community in meeting strategic goals. They include (but are not limited to):

RESEARCH AND BARRIERS TO INTEROPERABILITY

The circumstances in which research is done often discourage interoperability. When researchers face limited budgets and the pressure of delivering state-of-the-art work at a furious pace, software engineering issues such as interoperability inevitably take on a low priority -- unless they are a funded part of the work.

While the research process itself can be a fundamental barrier to interoperability, our goal was to address barriers to interoperability in the context of the specific research challenges we had previously identified. Setting the research process aside, we first identified certain technical barriers to interoperability, which are listed below in no specific order.

In Table 1, we capture some of the relationships between our research challenges and the barriers to interoperability that they expose. Each barrier appears with the research challenges related to it listed beneath it. Many of the research challenges are related to multiple barriers, and not all relationships are shown.

A few thoughts deserve special mention:

Table 1: Relationships between barriers to interoperability and research challenges

Large/complex data
  • Large data & large data collections
  • Better discovery and filters for large data/collections

Performance
  • Large data & large data collections
  • Human-computer interface research
  • Interaction metaphors

Heterogeneity (data sources, data formats, users, equipment, languages and programming models)
  • Multi-source data and data fusion
  • Human-computer interface research
  • Interaction metaphors
  • Effective representation/display
  • Visual metaphors
  • Parallel software components

Rate of change (technology, both hardware and software; users' expectations; scale; local vs. global change)
  • Parallel software components
  • Scalability
  • Human-computer interface research
  • Interaction metaphors
  • Visual metaphors

Diverse application domains; semantics/terminology
  • Applications
  • Non-spatial data, information
  • Fundamental changes to visualization process/systems
  • Adaptive level of detail, multi-resolution
  • Out-of-core techniques

RESEARCH AREAS THAT FACILITATE INTEROPERABILITY

A few research areas would facilitate interoperability. In general, these areas involve the development of general frameworks for visualization and data analysis, and/or the development of special tools designed for multiple uses. More specifically, such work might include:

Different application domains have differing requirements. For example, one domain may emphasize quantification while another emphasizes qualitative discovery, or certain domains may rely on fundamentally different underlying data types. These differing requirements can result in domain-dependent frameworks and/or tools. It may still be possible to develop general-purpose frameworks and/or tools that provide enough flexibility and breadth to enable customized use by relatively broad communities.

RESEARCH AREAS THAT REQUIRE INTEROPERABILITY

There are certain research areas that mandate a certain level of interoperability. While it is possible to encourage interoperability as a design goal of all code development, some situations naturally require minimal levels of interoperability, at least within the closed system that is being implemented. With a modest additional emphasis on generalization, it is possible that applications in these areas could provide foundations that would contribute to farther-reaching interoperability.

Such research and/or application areas include:

Software Interoperability

This section presents a broad outline of the software engineering basis for sharing tools within the visualization community. The discussion includes three major topics: data file interoperability, sharing of software modules and code fragments, and remote sharing of visualization resources.

DATA FILE INTEROPERABILITY

A scientist or engineer who wants to use visualization tools often spends a great deal of time converting data from one format to another because of the wide variety of file formats and data models in use today.

Imagine being able to use a variety of visualization tools from different vendors and sources on a data file without having to perform any format conversion. In this scenario, the user benefits from access to new categories of tools, codes, and applications, as well as new functionality and high performance. Tool developers also benefit from the wider applicability and more widespread use of their tools.

The ASCI-funded Data Models and Formats (DMF) project is an effort to create this kind of data file interoperability. DMF is developing a common abstract data model and format that is portable across architectures, applications, tools, and scientific disciplines. The result of the DMF committee's work is the Vector Bundle Application Programming Interface (VB-API), based on the mathematical foundation of vector bundles. VB-API is designed to be general enough to support the gamut of scientific data model requirements, without the limitations of netCDF (Network Common Data Form), HDF (Hierarchical Data Format), and other data formats.
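
To make concrete what adopting a common abstract data model involves for an application, the sketch below shows a hypothetical C routine that describes a small structured mesh and one node-centered field through a generic, bundle-style interface. The bundle_* names and the on-disk layout are illustrative assumptions only, not the actual VB-API; the point is that an application describes its data once in abstract terms, and any tool that understands the same model can then read it without format conversion.

/* Hypothetical sketch of writing simulation output through an abstract,
 * bundle-style data model.  The bundle_* names and file layout are
 * illustrative only; they are NOT the actual VB-API, whose interface is
 * defined by the DMF committee.  The idea: describe the mesh and its
 * fields once, in abstract terms, instead of emitting a tool-specific
 * file format. */
#include <stdio.h>
#include <stdlib.h>

typedef struct { FILE *fp; } bundle_t;      /* stand-in for an opaque handle */

static bundle_t *bundle_create(const char *path) {
    bundle_t *b = malloc(sizeof *b);
    b->fp = fopen(path, "wb");
    if (!b->fp) { free(b); return NULL; }
    return b;
}

/* Declare the base space: here, a structured 3-D grid of nx*ny*nz points. */
static void bundle_set_mesh(bundle_t *b, int nx, int ny, int nz,
                            const double *coords /* 3*nx*ny*nz values */) {
    fwrite(&nx, sizeof nx, 1, b->fp);
    fwrite(&ny, sizeof ny, 1, b->fp);
    fwrite(&nz, sizeof nz, 1, b->fp);
    fwrite(coords, sizeof *coords, (size_t)3 * nx * ny * nz, b->fp);
}

/* Attach a named, node-centered scalar field to the mesh. */
static void bundle_add_field(bundle_t *b, const char *name,
                             const double *values, size_t n) {
    fprintf(b->fp, "%s\n", name);
    fwrite(values, sizeof *values, n, b->fp);
}

static void bundle_close(bundle_t *b) { fclose(b->fp); free(b); }

int main(void) {
    enum { NX = 4, NY = 4, NZ = 4, N = NX * NY * NZ };
    double coords[3 * N], pressure[N];
    for (int i = 0; i < N; i++) {            /* toy data for illustration */
        coords[3*i + 0] = i % NX;
        coords[3*i + 1] = (i / NX) % NY;
        coords[3*i + 2] = i / (NX * NY);
        pressure[i] = 1.0e5 + i;
    }
    bundle_t *b = bundle_create("run0001.bundle");
    if (!b) return 1;
    bundle_set_mesh(b, NX, NY, NZ, coords);
    bundle_add_field(b, "pressure", pressure, N);
    bundle_close(b);
    return 0;
}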

Prototype testing of a non-parallel VB-API has been ongoing since October 1997. A new parallel version will be released in May 1998. Two parallel ASCI applications will be tested with VB-API by the end of this month (April 1998), one at Sandia and one at Livermore. Summer 1998 will bring two further tests: exchange of vector bundle data between Livermore and Sandia physics codes without format conversion, and exchange of visualization tools, such as MeshTV, between applications that use VB-API. If these tests are successful, the DMF group will encourage the ASCI partners to adopt VB-API as widely as possible.

Non-ASCI partners, such as EnSight (a commercial visualization package) and DX (IBM's Visualization Data Explorer), could begin porting to VB-API in September 1998. The IBM Data Explorer team has already participated in several ASCI DMF meetings and has expressed a desire to collaborate in the DMF work. Other organizations expressing interest in VB-API include NCSA (because of HDF), the French Atomic Energy Commission (CEA), the British Atomic Weapons Establishment (AWE), the University of Illinois Pablo Research Group, and representatives of the petroleum industry. DOD and DOE researchers outside of ASCI will be encouraged to adopt VB-API.

Once the VB-API model has been validated within the ASCI community, it should be adopted at the grassroots level within the ER visualization community. The ER community should become involved with the ASCI DMF effort in order to communicate their technical needs and to keep up with progress in the testing of VB-API. During the transition to support the VB-API format, we recommend that many sample data files be made available to facilitate testing of applications and visualization tools. The best source for specifications for sample data files will be the DMF committee. Vendors should be lobbied to extend commercial products to support the VB-API model.

VB-API is not yet a proven technology. If it is successful, it will be a boon to the DOE research community. But there are potential risks. If performance is sluggish, researchers may have to generate two types of data -- interoperable and non-interoperable data. Poor performance would hinder widespread acceptance.

Even if VB-API performs well, its success as an interoperability tool will depend on a large-scale buy-in among the research community, which will require a significant marketing effort. To use VB-API, existing ER codes and applications will need to be extended to read and write the VB-API format. The cost of developing readers and writers will vary on a case-by-case basis. For example, the estimated cost of adding a VB-API reader to MeshTV (an interactive graphical analysis tool for visualizing and analyzing data on two- and three-dimensional meshes) is about one person-month, with an additional one-time cost of one person-month to learn the vector bundle model. Programmers will feel some resistance to spending so much time learning an unfamiliar abstract data model. And when users are able to experiment and evaluate new tools that were previously unavailable due to incompatible data formats, software developers may feel threatened by the new competition. In short, there will be many excuses for not adopting VB-API. The research community will have to be convinced of its benefits before they will buy in.

CODE SHARING

From a historical perspective, code sharing has been successful when certain guidelines have been followed. Netlib, the repository of mathematical software at Oak Ridge and the University of Tennessee, has been very successful. The code available from Netlib consists primarily of subroutines with well-defined, documented interfaces, ready to be dropped into an application to solve a particular numerical problem. Another success story, at least for a period of time, was the International AVS Center (IAC), formerly in Research Triangle Park and now at the University of Leeds in the UK. The IAC maintains a source code repository of modules for the AVS (Application Visualization System) framework. The modules can be compiled and then plugged into an AVS dataflow network; data-typing issues are handled by the dataflow executive itself.

These two examples illustrate two types of code sharing: modules to be run within an established framework, and code fragments. One distinction between these two models is that a certain amount of development effort is required to integrate code fragments into a visualization tool or application. The application developer must usually write software that converts data from the application's format into a format suitable for the imported code. On the other hand, modules by definition have strongly typed data interfaces, so little, if any, software must be written to run an imported module within an established framework -- it should just drop into the data flow network and execute.
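
The sketch below illustrates, with hypothetical names, the kind of glue code that integrating a shared code fragment typically requires: an imported routine expects triangles as a flat vertex array, the host application stores an indexed mesh, and the integrator must write the converter. With a framework module, by contrast, the data interface is declared to the framework and the dataflow executive performs this marshalling.

/* Sketch of the glue code that integrating a shared code *fragment*
 * typically requires.  All names are hypothetical.  The "imported"
 * routine expects a flat array of 9 floats per triangle; the host
 * application stores an indexed mesh, so the integrator writes the
 * converter below.  A framework *module* would instead declare a typed
 * interface, and the framework executive would do this marshalling. */
#include <stdio.h>
#include <stdlib.h>

/* The imported fragment.  Its real body would come from another lab;
 * a trivial placeholder is supplied so the sketch runs. */
void frag_compute_normals(const float *tri_verts, int ntris, float *normals) {
    (void)tri_verts;
    for (int i = 0; i < 3 * ntris; i++) normals[i] = 0.0f;
}

/* The host application's native representation. */
typedef struct {
    float *xyz;     /* nverts * 3 coordinates    */
    int   *conn;    /* ntris  * 3 vertex indices */
    int    nverts, ntris;
} indexed_mesh;

/* Glue written by the integrator: de-index the mesh into the flat
 * 9-floats-per-triangle layout the fragment expects. */
float *flatten_for_fragment(const indexed_mesh *m) {
    float *flat = malloc((size_t)m->ntris * 9 * sizeof *flat);
    for (int t = 0; t < m->ntris; t++)
        for (int k = 0; k < 3; k++) {
            int v = m->conn[3 * t + k];
            for (int c = 0; c < 3; c++)
                flat[9 * t + 3 * k + c] = m->xyz[3 * v + c];
        }
    return flat;
}

int main(void) {
    float xyz[]  = { 0, 0, 0,   1, 0, 0,   0, 1, 0 };
    int   conn[] = { 0, 1, 2 };
    indexed_mesh m = { xyz, conn, 3, 1 };
    float *flat = flatten_for_fragment(&m);
    float normals[3];
    frag_compute_normals(flat, m.ntris, normals);
    printf("first flattened vertex: %g %g %g\n", flat[0], flat[1], flat[2]);
    free(flat);
    return 0;
}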

Sharing Modules within Established Frameworks

For modules that run within an established framework such as AVS, Khoros, Data Explorer, vtk (Visualization Toolkit), etc., low-level technical interoperability is a given. Typically, these tools do not need modification to perform adequately on serial machines with data of modest size. These are general-purpose tools that work well for a wide class of problems.

Among the established frameworks, there are a number of both commercially supported systems and freeware systems. The freeware systems are available over the Web and in source code form.

Commercially supported frameworks should have a long lifetime, since that is in the vendor's best interest, and generally provide good documentation and support. These frameworks have a large user base and a large knowledge base, so there is a high potential for exchanging tools.

What is needed to make that potential a reality is an effective distribution medium. To encourage use of visualization tool modules, we recommend that MICS sponsor a central visualization code and information repository similar to the DOE2000 ACTS Toolkit.

Sharing Code Fragments

Tools consisting of code fragments written for a specific architecture or a specific task serve valuable purposes in the visualization community. For example, commercial and off-the-shelf tools are not capable of processing today's increasing data sizes and often are not available for specialized high performance computing (HPC) architectures. Code fragment tools can bring new and previously unattainable computing and visualization capacity to users of specialized systems. Having source code for tools also facilitates debugging and quick enhancements. Often the user of a code fragment has a collegial relationship with the tool author and can expect a certain amount of informal support if needed.

Due to the variety of HPC architectures (e.g., vector, symmetric multiprocessing, distributed memory), different memory access strategies are required to achieve adequate performance. In scientific visualization, fast performance is crucial to human understanding because it allows interactivity, which improves the scientist's ability to manipulate, analyze, and understand data.

Despite the heterogeneity of code structure and methods in code fragment tools, a simple and direct software design and engineering philosophy could promote interoperability. Specifically, designing new software to read and write to the VB-API format would improve portability and save time for software integrators, who would no longer have to write data conversion software for each new imported tool. As time permits, existing code fragment tools should also be converted to use the VB-API model. But before this can happen, a significant technical development is needed -- the ability of VB-API to handle both in-core and out-of-core data. The present implementations of VB-API only support out-of-core data.
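
The sketch below illustrates that distinction using illustrative file layouts and function names rather than any existing API: the same reduction (a global maximum) is computed either over a fully resident array (in-core) or over fixed-size chunks streamed from disk (out-of-core), which is what out-of-core support means in practice for data sets larger than memory.

/* Sketch contrasting in-core and out-of-core access to a large scalar
 * field.  File layout and function names are illustrative assumptions,
 * not part of any existing API.  The same reduction (a global maximum)
 * is computed either on a fully resident array or on fixed-size chunks
 * streamed from disk. */
#include <stdio.h>
#include <stdlib.h>

/* In-core: the whole field fits in memory. */
double max_in_core(const double *field, size_t n) {
    double m = field[0];
    for (size_t i = 1; i < n; i++)
        if (field[i] > m) m = field[i];
    return m;
}

/* Out-of-core: stream a raw file of doubles through a small buffer. */
double max_out_of_core(const char *path, size_t chunk_elems) {
    FILE *fp = fopen(path, "rb");
    double *buf = malloc(chunk_elems * sizeof *buf);
    double m = 0.0;
    int first = 1;
    size_t got;
    while ((got = fread(buf, sizeof *buf, chunk_elems, fp)) > 0)
        for (size_t i = 0; i < got; i++)
            if (first || buf[i] > m) { m = buf[i]; first = 0; }
    free(buf);
    fclose(fp);
    return m;
}

int main(void) {
    /* Write a small test file so both paths can be exercised. */
    enum { N = 1000 };
    double sample[N];
    for (int i = 0; i < N; i++) sample[i] = (i * 37) % 1001;
    FILE *fp = fopen("field.bin", "wb");
    fwrite(sample, sizeof *sample, N, fp);
    fclose(fp);
    printf("in-core max:     %g\n", max_in_core(sample, N));
    printf("out-of-core max: %g\n", max_out_of_core("field.bin", 128));
    return 0;
}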

REMOTE SHARING OF VISUALIZATION RESOURCES

As the various computing centers have evolved, each center has tended to focus on a particular specialty. For example, one of the first teraflops machines is on line at Sandia Albuquerque, and specialized virtual reality facilities are available at Berkeley and Argonne. As time goes on, this trend will continue, and these "pockets of capability" will continue to grow in number and in capacity.

To maximize the value and use of these resources, our long-term vision is to establish a framework whereby a user located anywhere in cyberspace can make use of visualization hardware and software resources at any of these facilities, either singly or in combination. For example, imagine being able to use a teraflops machine to compute isosurfaces from a very large data set, then using a visualization server to create images of the isosurface, then having the images delivered to the desktop.
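
A minimal sketch of such a pipeline appears below, with the remote services reduced to local stubs and all names hypothetical; in a real system each call would cross the network through a distributed-object layer such as CORBA, and the surface geometry and images would move between sites.

/* Sketch of the remote visualization pipeline described above, with the
 * remote services reduced to local stubs.  All names are hypothetical;
 * in a real system each call would go through a distributed-object
 * layer such as CORBA, and the surfaces and images would move between
 * sites rather than between functions in one process. */
#include <stdio.h>

typedef struct { int ntris; } surface_t;        /* stand-in for isosurface geometry */
typedef struct { int width, height; } image_t;  /* stand-in for a rendered frame    */

/* Would run on the remote compute engine: extract an isosurface. */
surface_t compute_isosurface(const char *dataset, double isovalue) {
    printf("compute server: isosurface of %s at %g\n", dataset, isovalue);
    surface_t s = { 1000000 };
    return s;
}

/* Would run on a visualization server: render the surface to an image. */
image_t render_surface(surface_t s) {
    printf("visualization server: rendering %d triangles\n", s.ntris);
    image_t img = { 1024, 768 };
    return img;
}

/* Would run on the user's desktop: display the delivered image. */
void display_image(image_t img) {
    printf("desktop: displaying %dx%d image\n", img.width, img.height);
}

int main(void) {
    surface_t s = compute_isosurface("climate_run_42", 0.5);
    image_t img = render_surface(s);
    display_image(img);
    return 0;
}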

A research effort is needed to address the numerous technical challenges to achieving such a goal. Technical issues include scheduling and resource allocation, as well as incompatibility between vendor implementations of the CORBA (Common Object Request Broker Architecture) standard for distributed objects.

Shared resources would benefit many scientific programs by making visualization available to researchers who do not have a strong on-site visualization program, and by making the full range of visualization solutions to data interpretation problems available everywhere.

Communications

Interpersonal communications are fundamental to the effective dissemination of information within any technical society, including the DOE visualization community. For example, during the first ten minutes of the communications working group breakout session, two potential collaborations were discussed that would probably not have occurred without this session.

An effective communication structure to facilitate interoperability between DOE visualization centers would include four elements:

  • a visualization facilitator
  • electronic mail lists
  • a Web site
  • resolution of intellectual property issues

VISUALIZATION FACILITATOR

The visualization facilitator would be a central point of contact for the visualization community -- a gatherer, solicitor, and disseminator of useful information that is not readily available to the visualization community from a single source. The facilitator would also bring together members of the visualization community who have common interests or a possible symbiosis but who are unaware of each other's interests and resources.

The visualization facilitator would be an experienced and respected member of the visualization community who would represent the entire community rather than a particular lab or program. The facilitator would not be a policy shaper. To maintain close contact with the scientific and visualization communities, we recommend that the visualization facilitator be stationed at one of the DOE laboratories rather than in Washington, D.C.

The duties of the visualization facilitator would be:

Accomplishing these tasks will require substantial interaction between the facilitator and the visualization community across the country. Much of this contact can be done electronically, but significant travel to visualization sites and meetings will also be necessary, so an appropriate travel budget will be required. The facilitator should be a United States citizen to avoid security-related access issues.

A successful facilitator would have the following capabilities:

To assess the usefulness of the visualization facilitator's efforts, success metrics could include:

EMAIL

The existing DOECGF mail list, doecgf@inel.gov, could be leveraged to promote interoperability. We recommend the following email activities:

  1. Q&A. Any question related to DOE visualization activities could be posted on the unmoderated mail list. The traffic on this list would be monitored, but not moderated, by the visualization facilitator so that important communication threads could be passed on to the group as a whole and inappropriate uses of the list could be eliminated.
  2. Unmoderated visualization news. Visualization activities, conferences, and accomplishments worldwide could be posted to the unmoderated list. The facilitator would include relevant news items from this source in the moderated mailings.
  3. Moderated visualization news. The facilitator would establish and maintain a moderated electronic mail list and regularly (monthly) compile and send a summary of information of interest to the entire community, including references to additional information.
  4. Archives. Searchable archives would be established and maintained for both the moderated and unmoderated email lists.

The first two items can be implemented immediately without additional funding as a simple expansion of the existing DOECGF email reflector. However, without a facilitator, the effectiveness of these activities would be somewhat limited, since there is currently little incentive to provide information to the list. A facilitator could generate enthusiasm for the first two activities and take on the additional workload required by the other two.

WEB SITE

The visualization community uses the World Wide Web daily to examine status information, identify contacts, locate software, and ask questions within each site's "intranet," or local electronic community. We don't realize the same level of value from the broader federal visualization community's Web sites. The reasons for this include uncertainty about the electronic location, timeliness, and accuracy of data. A source of comprehensive and reliable information would spur the development and use of interoperable visualization software tools.

We propose the creation of a federal visualization web site for visualization tool and application developers as well as managers and users of visualization programs. The contents of the site would include:

The software pointers would include software in various stages of development in order to provide advance access to software for inspection, testing, and suggestions that may improve interoperability.

A baseline Web site for federal visualization already exists and is recommended as a starting point: www.persephone.inel.gov/DOEGCF, established in 1992 by participants in the annual DOE Computer Graphics Forum. The value of this site includes its contact list, which has been maintained sporadically, and its links to software of relevance and interest over the past several years. Its use has been moderate to low for several reasons:

One of the principal roles of the visualization facilitator would be to promote and maintain this web site with current, relevant information. Ongoing activities would include:

INTELLECTUAL PROPERTY

A potential barrier to the effective exchange of information, whether paradigms, algorithms, or software, is the unnecessary encumbrance of licensing issues. Code sharing for research must become an established practice while protecting the authors' licensing rights. A standard software license should be actively developed, based on models such as the GNU or VTK licenses, and instructions for its use should be given to the labs and the site offices.

Recommendations

Workshop participants made four recommendations for improving visualization interoperability, listed here in rank order. Additionally, many attendees expressed interest in holding another meeting in six months to a year to assess progress and results. This meeting might be coordinated with the 1999 meeting of the Computer Graphics Forum.

1. Observers to DMF

We recommend that several non-ASCI DOE sites participate in the next ASCI Data Model and Formats meeting, to be held May 7-8, 1998. (As a result of this workshop, one representative from each of ANL, BNL, INEL, LBNL/NERSC, ORNL, and PNNL has been invited to this meeting.)

2. Common abstract data model

A common abstract data model is fundamental to interoperability. Once VB-API is proven, we recommend that the national laboratories incorporate it into their visualization codes. To accelerate the adoption of this technology and demonstrate its usefulness to the scientific community, we recommend that DOE fund the following:

3. Facilitator

We recommend that DOE fund a senior visualization expert to be the full-time Visualization Facilitator to actively promote interoperability.

4. Interoperability research

We recommend that the DOE fund some visualization research efforts that promote interoperability by encouraging proposals that focus explicitly on interoperability or that emphasize interoperability in support of other research goals. Research focused explicitly on interoperability might include:

Research that addresses certain aspects of interoperability as part of a larger effort might involve:


Appendix A

Workshop on Interoperability of DOE Visualization Centers
Attendees

The following attendees contributed to this report:

DOE National Labs and Related Research Facilities:

Argonne National Laboratory

Terry Disz
Mike Papka

Brookhaven National Laboratory

Arnie Peskin
Gordon Smith

Idaho National Engineering and Environmental Laboratory

Eric Greenwade

Lawrence Berkeley National Lab

Wes Bethel
Nancy Johnston

Lawrence Livermore National Lab

Mark C. Miller
Dan Schikore
James Reus

Los Alamos National Laboratory

Jim Ahrens
James Painter

NASA-Ames

Sam Uselton

Oak Ridge National Laboratory

Nancy Grady
James Arthur Kohl
Ross Toedte

Pacific Northwest National Lab

Don Jones

Sandia National Laboratories

Jeff Jortner - Livermore
Constantine Pavlakos

Stanford Linear Accelerator Center

Joseph Perl

National Center for Supercomputing Applications

Polly Baker

ACTS Toolkit

Jim McGraw, Lawrence Livermore National Laboratory

Academia

Bill Ribarsky, Georgia Tech

Technical Editor

John Hules, Lawrence Berkeley National Laboratory