Alternative Interfaces to ANAG Data for Visualization

Thesis statement: One component of visualization is access to data. Extrapolating from a number of current trends (below), it will be useful to consider alternative interfaces to data that provide a growth path for future needs. The proposed model is not a replacement for current (Boxlib-based) efforts but a supplement designed to accommodate needs unforeseen by the current design. The focus of this discussion is on data access, presumably for the purpose of visualization, as opposed to a framework for setting up computations.

Motivation/Background

Data sizes are growing. This growth will have a detrimental impact upon visualization tools that assume that "all" the data is resident on local primary storage.

With increasing volumes of data, there will be a corresponding increase in the importance of data "cataloguing": the presentation of metadata and thumbnails for browsing. Stated differently, it should be possible to view data collections from a "high level", then transition into a more detailed view of a single data set. A reasonable analogy (modulo the single-computer vs. distributed computing issue) is the Encyclopedia Britannica CD. A few years ago the whole CD would not fit onto your hard drive (it will today). You could browse the CD looking at high level information, then, when you found something interesting, drill down to get more details. It is straightforward to draw parallels between this model and one that could be implemented as a user interface to a set of visualization and data access tools.

Security is an issue of growing importance. How will data files be shared with others? Will you export your local filesystem to a system across the country? Put all shared data into an anonymous FTP location? A class library provides a set of services to manipulate, read and write data on a local machine, but doesn't address security or networked access.

Chombo is a complex system. Its growth shows the same type of growing pains experienced by many others trying to transition from vector-based codes to distributed memory programming models. It is in flux, and likely will be for quite some time to come. A simplified, network-centric interface that provides access to AMR data through Chombo components would therefore be useful.

Architecture

There are a number of potential ways to implement such an interface. For example, what protocol should be used? TCP/IP? HTTP? Reasonable approaches will probably involve an unpublished TCP/IP interface on a private port. Alternatively, RPC services could be used to provide the framework for a client/server model. The term "web based interface" is highly volatile; however, it should not be immediately dismissed. The security features built into browsers do have their advantages, and plug-in technology can be used to deploy (graphical) data browsers.

How would such an interface be used?

The vision is one of having a simple interface to a large collection of data that is scattered about. A "Chombo server" could, for example, be (reasonably) co-located with the remote data vault. A remote client could connect with that server, and first be presented with a high level view, or catalogue, of resident data. As the client "drills down", more detailed information is presented. The presented information might be thumbnail images of canned visualizations of the data, or it might be the raw data itself that is then visualized on the client side (note: this kind of strategy is central to the approach outlined in our NGI proposals).
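The browse-then-drill-down interaction described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a proposed design: the class names (CatalogueEntry, Catalogue), the field names, and the two example data sets are all invented for this sketch, and a real catalogue would live on the server next to the data rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """One data set as seen in the catalogue (all fields are hypothetical)."""
    name: str
    summary: str                                  # shown while browsing
    details: dict = field(default_factory=dict)   # shown only after drill-down


class Catalogue:
    """In-memory stand-in for the catalogue a server might keep."""

    def __init__(self, entries):
        self._entries = {entry.name: entry for entry in entries}

    def browse(self):
        """High level view: names and summaries only, no bulk data."""
        return [(e.name, e.summary) for e in self._entries.values()]

    def drill_down(self, name):
        """Detailed view of a single data set."""
        entry = self._entries[name]
        return {"name": entry.name, "summary": entry.summary, **entry.details}


# Made-up example entries, for illustration only.
catalogue = Catalogue([
    CatalogueEntry("run_a", "2D test run",
                   {"levels": 3, "variables": ["density", "pressure"]}),
    CatalogueEntry("run_b", "3D test run",
                   {"levels": 2, "variables": ["density"]}),
])

for name, summary in catalogue.browse():
    print(name, "-", summary)
print(catalogue.drill_down("run_a"))
```

The point of the split is that browse() never touches the detailed information, so the high level view stays cheap no matter how large the underlying data sets are.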

In support of the server side (i.e., on the "back end" of the protocol) is Chombo or some equivalent. The idea is that the complexity of the data access parts of Chombo is hidden from the casual data browsing user. On the client side could be just about anything; the visualization client doesn't need to know much about Chombo, only that there are a bunch of grids and EB data.


Approach

Mon Jul 19 08:08:10 PDT 1999
 

Writing visualization tools that interface with Chombo is a difficult, if not impossible, task. The problem is that the Chombo build and code environments are often incompatible with the build environments for visualization tools. For example, when we attempt to build an AVS module that includes calls into the Chombo class libraries, we encounter numerous problems.

Note that several visualization tools based upon vtk were developed. These tools include both Chombo and vtk class libraries. (Terry needs to insert a link to his vtk pages here.)

In pursuing a strategy that disconnects AMRv data readers from AMRv data visualization tools, we identify a staged development plan:

  1. AMRv to files.
    Standalone tools with command line interfaces read native AMRv and EB data files, writing package-neutral formatted files for use by postprocessing visualization tools.
  2. AMRv to network transport.
    A Chombo "data server" tool will listen on a network interface, such as a socket, and allow a remote client to query and retrieve AMRv data. This server provides information about a single data file, but may give the client access to subsets of that file's data.
  3. Database browser.
    As the number of AMRv datasets grows, a natural growth path for this approach will be to provide access to more than a single data file.
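
As a rough sketch of the second stage, the Python fragment below answers queries about a single data file over a socket. Everything specific in it is an assumption made for illustration: the INFO and SUBSET commands, the JSON replies, and the tiny DATASET dictionary standing in for a real AMRv file are not part of Chombo or of any agreed protocol. A connected socket pair replaces a real listening port so the example stays self-contained.

```python
import json
import socket

# Hypothetical stand-in for one AMRv data file: level number -> data values.
DATASET = {
    "name": "example.amr",
    "levels": {0: [1.0, 2.0, 3.0, 4.0], 1: [10.0, 20.0]},
}


def handle_request(line):
    """Answer one text request. INFO and SUBSET are invented command names."""
    parts = line.split()
    if parts == ["INFO"]:
        return json.dumps({"name": DATASET["name"],
                           "levels": sorted(DATASET["levels"])})
    if len(parts) == 4 and parts[0] == "SUBSET":
        try:
            level, start, stop = (int(p) for p in parts[1:])
        except ValueError:
            return json.dumps({"error": "bad arguments"})
        if level not in DATASET["levels"]:
            return json.dumps({"error": "no such level"})
        # Only the requested slice of the level is returned to the client.
        return json.dumps({"level": level,
                           "values": DATASET["levels"][level][start:stop]})
    return json.dumps({"error": "unknown request"})


def serve_one(conn):
    """Read one newline-terminated request from conn and send the reply."""
    with conn.makefile("r", encoding="utf-8") as reader:
        request = reader.readline().strip()
    conn.sendall((handle_request(request) + "\n").encode("utf-8"))


def demo(request):
    """Round-trip one request through a connected socket pair."""
    server_end, client_end = socket.socketpair()
    with server_end, client_end:
        client_end.sendall((request + "\n").encode("utf-8"))
        serve_one(server_end)
        with client_end.makefile("r", encoding="utf-8") as reader:
            return json.loads(reader.readline())


print(demo("INFO"))
print(demo("SUBSET 0 1 3"))
```

Keeping handle_request separate from the socket code means the same query logic could later sit behind a different transport (RPC or HTTP, for instance) without change.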

Characteristics of AMRv Data

The Box is the fundamental unit of AMRv data sets. Boxes may exist in either 2D or 3D. Data, such as density, pressure, and so forth, live at the box level.

Collections of boxes are contained in a FAB. A FAB is a collection of boxes defined by an origin, a box step size (the origin and step size define the size of all boxes in each FAB), and the number of boxes in a 2D or 3D grid. Note that boxes within a FAB are homogeneous in size.

A MultiFAB is a collection of FAB objects at a specified grid refinement level.

An AMRv dataset is a collection of MultiFAB objects.
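
The containment hierarchy described in this section can be written down as a small set of Python data classes. This is a sketch of the description given here only; it does not reproduce the actual Boxlib or Chombo class definitions, and the attribute and method names (origin, step, counts, box_bounds, and so on) are chosen for illustration. Per-box data such as density or pressure are left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FAB:
    """Regular grid of equally sized boxes, following the text above."""
    origin: Tuple[float, ...]   # one coordinate per dimension (2D or 3D)
    step: Tuple[float, ...]     # box size along each dimension
    counts: Tuple[int, ...]     # number of boxes along each dimension

    def __post_init__(self):
        dims = {len(self.origin), len(self.step), len(self.counts)}
        if len(dims) != 1 or dims.pop() not in (2, 3):
            raise ValueError("origin, step and counts must all be 2D or all 3D")

    def num_boxes(self):
        total = 1
        for n in self.counts:
            total *= n
        return total

    def box_bounds(self, index):
        """Lower and upper corner of the box at an integer grid index."""
        lo = tuple(o + i * s for o, i, s in zip(self.origin, index, self.step))
        hi = tuple(c + s for c, s in zip(lo, self.step))
        return lo, hi


@dataclass
class MultiFAB:
    """Collection of FABs at one grid refinement level."""
    level: int
    fabs: List[FAB] = field(default_factory=list)


@dataclass
class AMRvDataset:
    """Collection of MultiFABs, one per refinement level."""
    multifabs: List[MultiFAB] = field(default_factory=list)

    def total_boxes(self):
        return sum(f.num_boxes() for m in self.multifabs for f in m.fabs)


# Made-up two-level example: a coarse 4x4 grid and a finer 2x2 patch.
coarse = FAB(origin=(0.0, 0.0), step=(1.0, 1.0), counts=(4, 4))
fine = FAB(origin=(1.0, 1.0), step=(0.5, 0.5), counts=(2, 2))
dataset = AMRvDataset([MultiFAB(0, [coarse]), MultiFAB(1, [fine])])

print(dataset.total_boxes())
print(fine.box_bounds((1, 0)))
```

Because boxes within a FAB are homogeneous in size, the bounds of any box follow from the origin, step size, and grid index alone, which is what box_bounds computes.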

Visualization Fundamentals

Based upon discussions with ANAG, we can identify a number of fundamental visualization tasks that can be applied to "boxes." These include:

These are simply visualization techniques. User interface, implementation environment, deployment environments and so forth are not specified.