APIFinder
   The essential directory of application programming interfaces
Manipulate XML Content the Ximple Way
For many common use cases, you can improve your XML-processing performance by taking advantage of VTD-XML's document-centric processing model.
February 11, 2008

The latest Java version of the Virtual Token Descriptor for XML (VTD-XML) can function as a slicer, an editor, and an incremental modifier to intelligently manipulate XML document content. This article will show you how to use it, introduce you to the concept of "document-centric" XML processing and discuss its implications for service-oriented architecture (SOA) and the future of enterprise IT.

A previous article on DevX presented VTD-XML as a general-purpose, ultra high-performance XML parser well-suited for processing large XML documents using XPath. In parsing mode, VTD-XML derives its memory efficiency and high performance from non-extractive parsing. Internally, VTD-XML retains the XML document intact in memory and un-decoded, using offsets and lengths to describe tokens in the XML document. By resorting entirely to primitive data types (such as 64-bit integers), VTD-XML achieves unrivaled performance and memory efficiency by eliminating unnecessary object creation and garbage collection costs (which are largely responsible for the poor performance of DOM and SAX parsing).
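
To make the offset-and-length idea concrete, here is a minimal plain-Java sketch of a non-extractive token. It is an illustration only, not VTD-XML's actual record format: the class name TokenSketch, its methods, and the bit layout (length in the upper 32 bits, offset in the lower 32) are assumptions chosen for readability.

```java
import java.nio.charset.StandardCharsets;

// Simplified illustration of non-extractive tokenization.
// The bit layout below is hypothetical, not VTD-XML's real record format.
class TokenSketch {
    // Pack a token as one primitive long: upper 32 bits = length, lower 32 bits = offset.
    static long pack(int offset, int length) {
        return ((long) length << 32) | (offset & 0xFFFFFFFFL);
    }

    static int offsetOf(long token) {
        return (int) token;
    }

    static int lengthOf(long token) {
        return (int) (token >>> 32);
    }

    // A String object is created only when the application asks for the token text;
    // until then the document bytes stay untouched and undecoded.
    static String text(byte[] doc, long token) {
        return new String(doc, offsetOf(token), lengthOf(token), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] doc = "<root attr=\"old value 123\"/>".getBytes(StandardCharsets.UTF_8);
        long attrName = pack(6, 4);     // covers the bytes of: attr
        long attrValue = pack(12, 13);  // covers the bytes of: old value 123
        System.out.println(text(doc, attrName) + " = " + text(doc, attrValue));
    }
}
```

Because each token is a single primitive value, an array of such tokens can describe a whole document without allocating one object per node.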

Nevertheless, memory usage and CPU efficiency may be only a small part of the inherent benefits that non-extractive parsing offers. An arguably more significant implication—one that sets it apart from other XML parsing techniques—lies in its unique ability to manipulate XML document content at the byte level. Below are three distinct, yet related, sets of capabilities available in version 2.2 of VTD-XML.

  • XML slicer—You can use a pair of integers (offset and length) to address a segment of XML content so your application can slice the segment from the original document and move it to another location in the same or a different document. The VTDNav class exposes two methods that allow you to address an element fragment: getElementFragment(), which returns a 64-bit integer representing the offset and length value of the current element, and getElementFragmentNs() (in the latest version), which returns an ElementFragmentNs object representing a "namespace-compensated" element fragment (more detail on this later).
  • Incremental XML modifier—You can modify an XML document incrementally through the XMLModifier, which defines three types of "modify" operations: inserting new content into any location (at any offset) in the document, deleting content (by specifying the offset and length), and replacing old content with new content—which effectively is a deletion and insertion at the same location. To compose a new document containing all the changes, you need to call the XMLModifier's output(...) method.
  • XML editor—You can directly edit the in-memory copy of the XML text using VTDNav's overWrite(...) method, provided that the original tokens you're overwriting are wide enough to hold the new byte content.
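
The slicing capability in the first bullet can be pictured with a short plain-Java sketch. It does not use the VTD-XML API: SlicerSketch and splice are hypothetical names, the offsets are hard-coded where a real application would obtain them from the parser (for example from getElementFragment()), and namespace compensation is ignored.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustration of byte-level slicing: a fragment is addressed by (offset, length)
// and copied into another document without decoding or re-serializing it.
class SlicerSketch {
    // Returns a new document equal to target with source[offset, offset + length)
    // inserted at position insertAt. Neither input array is modified.
    static byte[] splice(byte[] source, int offset, int length, byte[] target, int insertAt) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(target.length + length);
        out.write(target, 0, insertAt);                         // target bytes before the insert point
        out.write(source, offset, length);                      // the sliced fragment, byte for byte
        out.write(target, insertAt, target.length - insertAt);  // the rest of the target
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] source = "<a><b>1</b><c/></a>".getBytes(StandardCharsets.UTF_8);
        byte[] target = "<x></x>".getBytes(StandardCharsets.UTF_8);
        // The element <b>1</b> starts at byte 3 and is 8 bytes long.
        byte[] result = splice(source, 3, 8, target, 3);
        System.out.println(new String(result, StandardCharsets.UTF_8));
    }
}
```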
Editor vs. Incremental Modifier
While non-extractive parsing enables both the editing mode and the incremental modifier mode of VTD-XML, there are subtle differences between the two. Using VTD-XML as an incremental modifier (by calling various XMLModifier methods) doesn't modify the in-memory copy of the XML document; instead, you compose a new document based on the original document and the operations you specify. To generate the new document, you must call the XMLModifier's output(...) method.

In contrast, when using VTD-XML as an editor, you directly modify the in-memory XML text. In other words, if the modification is successful, your application logic can immediately access the new data—there's no need to reparse.
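
The incremental-modifier behavior described above (record the operations, leave the original bytes alone, compose a new document on output) can be sketched in plain Java as follows. This is a simplified illustration, not XMLModifier's implementation: ModifierSketch and its methods are hypothetical, and the sketch assumes edits do not overlap and that at most one edit is registered per offset.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

// Illustration of incremental modification: edits are recorded against byte
// offsets of the original document and applied only when output() is called.
class ModifierSketch {
    private static final class Edit {
        final int removeLength;  // number of original bytes to skip at this offset
        final byte[] insert;     // bytes to emit in their place

        Edit(int removeLength, byte[] insert) {
            this.removeLength = removeLength;
            this.insert = insert;
        }
    }

    private final byte[] original;
    private final TreeMap<Integer, Edit> edits = new TreeMap<>();  // sorted by offset

    ModifierSketch(byte[] original) {
        this.original = original;
    }

    // Replacement is a deletion plus an insertion at the same location.
    void replace(int offset, int length, String content) {
        edits.put(offset, new Edit(length, content.getBytes(StandardCharsets.UTF_8)));
    }

    void insert(int offset, String content) {
        replace(offset, 0, content);
    }

    void remove(int offset, int length) {
        replace(offset, length, "");
    }

    // Composes a new document; the original byte array is never written to.
    byte[] output() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        for (Map.Entry<Integer, Edit> e : edits.entrySet()) {
            int offset = e.getKey();
            Edit edit = e.getValue();
            out.write(original, pos, offset - pos);      // untouched bytes before the edit
            out.write(edit.insert, 0, edit.insert.length);
            pos = offset + edit.removeLength;            // skip the deleted bytes
        }
        out.write(original, pos, original.length - pos); // remaining tail
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] doc = "<root attr=\"old value 123\"/>".getBytes(StandardCharsets.UTF_8);
        ModifierSketch m = new ModifierSketch(doc);
        m.replace(12, 13, "new value");  // the attribute value occupies bytes 12..24
        System.out.println(new String(m.output(), StandardCharsets.UTF_8));
        System.out.println(new String(doc, StandardCharsets.UTF_8));  // original is unchanged
    }
}
```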

Consider the following XML document named test.xml:

 <root attr="old value 123"/>
To change the value of the attribute "attr" to "new value", you can use the following Java code:

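import com.ximpleware.*;

public class changeAttrVal {
    public static void main(String[] args) throws Exception {
        VTDGen vg = new VTDGen();
        XMLModifier xm = new XMLModifier();
        // Parse without namespace awareness (second argument is false).
        if (vg.parseFile("test.xml", false)) {
            VTDNav vn = vg.getNav();
            xm.bind(vn);
            // getAttrVal() returns the token index of the attribute value, or -1 if absent.
            int i = vn.getAttrVal("attr");
            if (i != -1)
                xm.updateToken(i, "new value");
            // Compose the new document; the in-memory original is left untouched.
            xm.output("new_test.xml");
        }
    }
}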
The call to xm.output(...) in the preceding code writes the modified XML document, with the changed attribute value, to the file new_test.xml, as shown below:

 <root attr="new value"/>
You could achieve the same result with VTD-XML's editing mode, using this Java code:

import com.ximpleware.*;
import java.io.*;

public class changeAttrVal2 {
    public static void main(String[] args) throws Exception {
        VTDGen vg = new VTDGen();
        if (vg.parseFile("test.xml", false)) {
            VTDNav vn = vg.getNav();
            int i = vn.getAttrVal("attr");
            if (i != -1) {
                // Overwrite the token bytes in place; the new content must fit the old token.
                vn.overWrite(i, "new value".getBytes());
                // The new value is readable immediately, with no reparse.
                System.out.println("print the new attr value ===> " + vn.toString(i));
            }
            // Write the edited in-memory document to disk.
            FileOutputStream fos = new FileOutputStream("new_test2.xml");
            fos.write(vn.getXML().getBytes());
            fos.close();
        }
    }
}
In contrast to the output from XMLModifier, this version retains a few extra white spaces as part of the attribute value. This is because VTDNav's overWrite() method first fills the "window" (the space occupied by the content) of the attribute value with the new byte content, then pads the remaining part of the window with white spaces, guaranteeing that the new token has the same length as the old token in the new XML file. Here the old value "old value 123" occupies 13 bytes and "new value" only 9, so four spaces of padding follow the new value. Note, however, that the example can print the new attribute value immediately after calling overWrite(), without generating a new copy of the document. The file new_test2.xml contains:

 <root attr="new value    "/>
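
The padding behavior just described can be reproduced with a few lines of plain Java. This is a sketch of the idea only, not VTDNav's implementation: OverwriteSketch is a hypothetical class, and its overWrite takes an explicit byte offset and length where the real method takes a token index.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration of in-place editing: the new content must fit inside the
// "window" of the old token, and any leftover bytes are filled with spaces.
class OverwriteSketch {
    // Overwrites doc[offset, offset + length) in place.
    // Returns false and leaves doc unchanged if content is wider than the window.
    static boolean overWrite(byte[] doc, int offset, int length, byte[] content) {
        if (content.length > length) {
            return false;
        }
        System.arraycopy(content, 0, doc, offset, content.length);
        // Pad the rest of the window so the document length never changes.
        Arrays.fill(doc, offset + content.length, offset + length, (byte) ' ');
        return true;
    }

    public static void main(String[] args) {
        byte[] doc = "<root attr=\"old value 123\"/>".getBytes(StandardCharsets.UTF_8);
        boolean ok = overWrite(doc, 12, 13, "new value".getBytes(StandardCharsets.UTF_8));
        System.out.println(ok + " -> " + new String(doc, StandardCharsets.UTF_8));
    }
}
```

Because the document length is unchanged, every other offset recorded for the document remains valid, which is why no reparse is needed after the edit.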

Continue reading article on DevX.com...

Jimmy Zhang is CEO of XimpleWare. He is also a well-known author and is active in the open source movement. He can be reached at jzhang@ximpleware.com.

