Update proposal 2008 08 17

From XdmfWeb
Revision as of 11:59, 21 August 2009 by Dave.demarle (talk | contribs) (update with specifics now that I've actually started working on it)

In early 2007, XDMF underwent a substantial revision. The original library is now called XDMF1, and the new version is called XDMF2. The move was enabled by a switch to the libxml2 XML parsing library. Changes to XDMF included:

 1. Replace expat + Xdmf functions with libxml2
 2. Support XPath
 3. XdmfStructure + XdmfTransform have been merged into XdmfDataItem
 4. DataItems support hierarchical structure
 5. Grids support hierarchical structure
 6. Added support for Quadratic elements
 7. Added support for mixed topologies
 8. Added XdmfInformation - Application specific data
 9. Added writing capabilities to the Xdmf library. The state of the DOM is
      accurately reflected in the XML
 10. All XDMF XML Elements now have an XdmfObject representation
 11. Added DSM
 12. Updated from HDF5 1.6.2 to HDF5 1.6.5

At the same time, VTK's (and thus ParaView's) interface to XDMF was partially updated to use the new library. That is, vtkXDMFReader was upgraded to recognize XDMF2's new syntax (see 4 and 5 above) and to directly produce composite data sets instead of multiple outputs. Meanwhile, vtkXDMFWriter was upgraded only enough to use XDMF2 in a backwards compatibility mode. The writer does not understand vtkCompositeDataSet types and still expects to be given complex data in the form of multiple inputs, a legacy of the old ParaView multi-part architecture. Additionally, the writer is restricted to a few input dataset structures because it does not make use of XDMF2's support for mixed topologies. Work is now underway to upgrade the VTK writer to bring it up to date with the library. See below.

After the initial upgrade to XDMF2, important new features were added to XDMF. Although these features exist in the library, not all of them are usable in the vtkXDMFReader, and even fewer are usable in the vtkXDMFWriter. The new features include:

temporal support - Jerry and Biddiscombe (Early 2008)
additional cell types - CSimsoft (Early 2008)
ghost levels - Chris Kees (May 2008)
SQL heavy data - Jerry (?)
compression - Ian Curlington (May 2008)
parallel "strategies" - Will Dicharry (Jun 2008)
in-place row-major/column-major reordering - Dominik Szczerba (Jun 2008)
better HDF5 version independence - Jerry (Jun 2008)

A survey of traffic on the XDMF mailing list shows that there is demand for the following additional features:

 unit annotations
 tabular data (info vis)
 out-of-core row-major/column-major reordering
 SIL interface to composite data
 optimizations for time-varying data with static geometry and topology
 wildcard specification

Finally, I think that a full suite of web resources for XDMF would benefit the community at large. These would include a bug tracker, doxygen documentation, and regression testing. The most urgent need is comprehensive regression testing. The tests should cover at least the important configurations of XDMF and the fundamental data types. They would ensure that the library's important features continue to work as developers improve the library, and would also provide a set of working examples for new users. The configurations of the library that could be tested are the combinations of the following options:

 mpi
 vtk
 system hdf5
 system libxml2
 system zlib
 python
 run from install/run from build directory

However, only a handful of the combinations need to be tested. For example, it appears that the combination python=ON and MPI=OFF does not compile today.
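
To give a sense of how large the full configuration space is, here is a minimal sketch that enumerates it. The option names only mirror the list above and are not the actual build variable names; the `known_broken` rule encodes nothing more than the single failure mentioned in the text.

```python
from itertools import product

# Option names mirror the list above; they are illustrative labels,
# not the real build-system variable names used by XDMF.
OPTIONS = ["mpi", "vtk", "system_hdf5", "system_libxml2",
           "system_zlib", "python", "run_from_install"]


def all_configurations():
    """Yield every ON/OFF combination as a dict of option -> bool."""
    for values in product([False, True], repeat=len(OPTIONS)):
        yield dict(zip(OPTIONS, values))


def known_broken(config):
    """The one combination the text reports as failing to compile."""
    return config["python"] and not config["mpi"]


configs = list(all_configurations())
buildable = [c for c in configs if not known_broken(c)]

print(len(configs))    # total number of combinations
print(len(buildable))  # combinations left after excluding the known failure
```

Seven independent switches already give 128 combinations, which is why choosing a small representative subset is more practical than testing them all.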


Writer

Objectives:

produce hierarchical data items when given composite data object inputs
produce full set of data types that reader recognizes
add support for time
parallel efficiency

Architecture:

The plan for upgrading the writer is to introduce a new writer class and to keep the existing writer functional until it can be deprecated and removed. The new writer will be written with two objectives. First, it is desirable to minimize in-memory copies of raw data, using pointer sharing as much as possible to reduce memory requirements for large data processing. Second, it is desirable to leave as much work as possible to the XDMF library itself. The existing writer was written before XDMF2 and thus manually writes out strings to produce XML elements, because XDMF1 did not have support for writing (see 9 above).
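
The difference between sharing a buffer and copying it can be illustrated with a small sketch. This is only an analogy in Python, using the standard `array` and `memoryview` types; the real writer would hand raw array addresses to the XDMF library in C++, and the `points` data here is made up.

```python
from array import array

# Stand-in for a large coordinate array owned by the pipeline (made-up data).
points = array("d", [0.0, 0.0, 0.0, 1.0, 0.0, 0.0])

# A memoryview exposes the same underlying buffer without copying it.
shared = memoryview(points)

# list() duplicates every value into a second block of memory.
copied = list(points)

# Modify the original array in place.
points[3] = 2.0

print(shared[3])  # the shared view sees the change
print(copied[3])  # the copy still holds the old value
```

For large data, the copied variant doubles the memory footprint, which is the cost the first objective aims to avoid.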

The writer, then, will be responsible for examining its input vtkDataObject (note the singular: we will not support writing multiple objects simultaneously), traversing the tree structure if that object is a composite one, and, at each node in the tree, making calls to the XDMF library to convert the in-memory VTK data structures into XDMF elements. Each element is configured appropriately and given access to the addresses of the in-memory data arrays.
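
The traversal can be sketched as a simple recursion. The `Leaf` and `Composite` classes below are hypothetical stand-ins for VTK data objects, and the XML is built with Python's `xml.etree.ElementTree` purely to show the shape of the output; the actual writer would call the XDMF library instead. The `GridType` values `Tree` and `Uniform` are assumed to follow the XDMF2 XML syntax.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Leaf:
    """Hypothetical stand-in for a non-composite dataset."""
    name: str


@dataclass
class Composite:
    """Hypothetical stand-in for a composite data object."""
    name: str
    children: List[object] = field(default_factory=list)


def to_grid(node):
    """Recursively convert a node into a nested Grid element."""
    if isinstance(node, Composite):
        grid = ET.Element("Grid", Name=node.name, GridType="Tree")
        for child in node.children:
            grid.append(to_grid(child))
    else:
        grid = ET.Element("Grid", Name=node.name, GridType="Uniform")
    return grid


def write_xdmf(root_object):
    """Wrap the converted tree in Xdmf/Domain and return it as a string."""
    xdmf = ET.Element("Xdmf", Version="2.0")
    domain = ET.SubElement(xdmf, "Domain")
    domain.append(to_grid(root_object))
    return ET.tostring(xdmf, encoding="unicode")


tree = Composite("root", [Leaf("block0"),
                          Composite("level1", [Leaf("block1")])])
print(write_xdmf(tree))
```

Because a single root object is converted, the output always has exactly one top-level Grid under the Domain, matching the single-input restriction described above.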

The initial design will support neither parallelism nor time. Once the static, serial code is proven, these features will be added. Temporal support will simply involve obtaining the time domain from the input and then, if the input is time varying, beginning the XDMF structure with a temporal collection.
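
A minimal sketch of that decision follows, again using `xml.etree.ElementTree` only for illustration. The function names and grid names are hypothetical, and the element and attribute names (`Grid`, `GridType="Collection"`, `CollectionType="Temporal"`, `Time`) are assumed to follow the XDMF2 XML syntax.

```python
import xml.etree.ElementTree as ET


def temporal_collection(time_values, name="TimeSeries"):
    """Wrap one placeholder grid per time step in a temporal collection."""
    collection = ET.Element("Grid", Name=name, GridType="Collection",
                            CollectionType="Temporal")
    for step, value in enumerate(time_values):
        grid = ET.SubElement(collection, "Grid",
                             Name=f"step{step}", GridType="Uniform")
        ET.SubElement(grid, "Time", Value=str(value))
    return collection


def build_structure(time_values):
    """Begin with a temporal collection only if the input is time varying."""
    if len(time_values) > 1:
        return temporal_collection(time_values)
    return ET.Element("Grid", Name="static", GridType="Uniform")


print(ET.tostring(build_structure([0.0, 0.5, 1.0]), encoding="unicode"))
print(ET.tostring(build_structure([0.0]), encoding="unicode"))
```

Static input produces the same structure as before, so adding time support in this way should not disturb the serial, static code path.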

Parallel support can mean several things. In parallel, each processor may produce its own file, or all processors might produce a single file, either synchronously or in a round-robin fashion. Additionally, the data or the XML elements for the file could be shared or transferred between processors. The optimal strategy depends upon the data and the platform, most importantly on whether the filesystem is independent, networked, or parallel. Thus we may support one or more of these parallel options, and the choice will be made after the writer is functioning properly.
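
Two of these options can be contrasted in a small sketch. Nothing here performs real parallel I/O: the functions only describe which processor would write what, and the function names and the file naming scheme are hypothetical.

```python
def file_per_processor(base, num_procs):
    """Each processor writes its own file; returns rank -> file name."""
    return {rank: f"{base}.{rank}.xmf" for rank in range(num_procs)}


def round_robin_turns(num_procs):
    """All processors share one file and write one after another.

    Returns (rank, predecessor) pairs: each rank waits until its
    predecessor has finished writing. Rank 0 has no predecessor.
    """
    return [(rank, rank - 1 if rank > 0 else None)
            for rank in range(num_procs)]


print(file_per_processor("output", 3))
print(round_robin_turns(3))
```

The first plan needs no coordination but produces many files; the second produces one file at the cost of serializing the writes, which is why the better choice depends on the filesystem.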