(1) The first aspect is how to parse and process XML documents. The following is a list of currently popular XML processing interfaces.
Generally, a DOM implementation builds a tree structure in memory. This means the parser can build the DOM tree only after it has read the whole document. The obvious drawback is performance on large XML documents. Moreover, sometimes the user wants only a small part of a large XML document. In that case, building an in-memory tree is not a good solution.
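To make the DOM drawback concrete, here is a minimal sketch using the JAXP DOM API that ships with the JDK. The class name DomTitles, the method titleAt, and the sample catalog document are my own illustrations, not from any library. Note that the whole tree is built before a single title can be read.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class DomTitles {
    // Builds the full in-memory tree, then reads the <title> at the given index.
    static String titleAt(String xml, int index) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // parse() returns only after the complete document has been read.
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // Once the tree exists, random access to any node is possible.
            NodeList titles = doc.getElementsByTagName("title");
            return titles.item(index).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<catalog><book><title>A</title></book>"
                   + "<book><title>B</title></book></catalog>";
        System.out.println(titleAt(xml, 1));
    }
}
```

Even though only one title is wanted, every element of the catalog ends up in memory first.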
"SAX is a streaming interface - applications receive information from XML documents in a continuous stream, with no backtracking or navigation allowed. This approach makes SAX extremely efficient, handling XML documents of nearly any size in linear time and near-constant memory, but it also places greater demands on the software developer's skills."
SAX is the best-known example of an event-based API. "An event-based API, on the other hand, reports parsing events (such as the start and end of elements) directly to the application through callbacks, and does not usually build an internal tree. The application implements handlers to deal with the different events, much like handling events in a graphical user interface."
This mechanism requires programmers to think in terms of something like a state machine. The application has to keep track of its current state itself, and the transitions between states are driven by the events SAX generates.
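The state-machine idea can be sketched with the SAX API in the JDK. This is only an illustration under my own assumptions: the class SaxTitles, the two-state enum, and the element name "title" are made up for the example. The handler has just two states, and each callback either switches state or acts on the current one.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class SaxTitles {
    // The two states of this tiny state machine.
    private enum State { OUTSIDE_TITLE, INSIDE_TITLE }

    // Collects the text of every <title> element in document order.
    static List<String> titles(String xml) {
        final List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private State state = State.OUTSIDE_TITLE;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("title".equals(qName)) {
                    state = State.INSIDE_TITLE;   // transition on a start event
                    text.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text may arrive in several chunks, so it is accumulated.
                if (state == State.INSIDE_TITLE) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName)) {
                    result.add(text.toString());
                    state = State.OUTSIDE_TITLE;  // transition on an end event
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(titles("<c><title>A</title><x>skip</x><title>B</title></c>"));
    }
}
```

With deeper nesting the number of states grows quickly, which is where the extra demand on the developer comes from.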
"These two access metaphors can be thought of as polar opposites. A tree based API allows unlimited, random, access and manipulation, while an event based API is a 'one shot' pass through the source document.
StAX was designed as a median between these two opposites. In the StAX metaphor, the programmatic entry point is a cursor that represents a point within the document. The application moves the cursor forward - 'pulling' the information from the parser as it needs. This is different from an event based API - such as SAX - which 'pushes' data to the application - requiring the application to maintain state between events as necessary to keep track of location within the document."
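The cursor metaphor in the quote can be sketched with the StAX cursor API (javax.xml.stream) in the JDK. The class StaxFirstTitle and the element name "title" are assumptions made for this example. The point is that the application decides when to advance and can simply stop pulling once it has what it needs.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, for illustration only.
class StaxFirstTitle {
    // Returns the text of the first <title> element, or null if there is none.
    static String firstTitle(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            // The application drives the loop: each next() moves the cursor forward.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // Read the element's text and stop pulling.
                    return reader.getElementText();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstTitle("<c><title>A</title><title>B</title></c>"));
    }
}
```

Compared with the SAX style, the position in the document is implicit in where the loop currently is, so no separate state field is needed for this simple case.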
Here (http://www.xml.com/pub/a/2001/11/14/dom-sax.html?page=1) is a good high-level discussion of DOM and SAX that does not go into detailed processing.
(2) Validation of XML document.
APIs/interfaces define an implementation-independent specification of XML manipulation. They are just sets of interfaces, not implementations. Implementations of these interfaces are generally based on parsers, which do the actual work. The implementation is a slim layer sitting on top of the parser: it wraps the functionality of the underlying parser (perhaps Xerces or Crimson) to provide the unified interface defined by the corresponding specification (JAXP, dom4j, ...). So when you download a library from the dom4j, JDOM, or JAXP website, it generally also bundles the underlying libraries (for example Xerces). You can see this by browsing the directory layout of the dom4j/JDOM/JAXP distribution.
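A small way to see this layering, assuming the JAXP classes bundled with the JDK: application code asks the abstract factory for an instance, and the concrete class that comes back belongs to whichever parser is plugged in. The class name JaxpProvider is made up for this sketch.

```java
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper class, for illustration only.
class JaxpProvider {
    // Returns the name of the concrete class behind the JAXP factory interface.
    static String factoryImplementation() {
        // The code only names the JAXP type; the provider is looked up at run time.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(factoryImplementation());
    }
}
```

On a stock JDK this typically prints a class from the bundled Xerces-derived parser, but the exact name depends on the runtime and on what is on the classpath.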
One advanced idea in XML processing is to establish a correspondence between Java classes and an XML Schema (or a DTD), and between Java objects and XML documents. Programmers then only need to manipulate the Java objects instead of the elements and attributes of the XML documents.
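To show what this correspondence means, here is a hand-written sketch of one direction of the mapping (XML to object). It is not JAXB or any other binding library. The Book class, its fields, and the sample document are assumptions of mine, and real binding tools generate or infer this kind of code from the schema.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical hand-written binding, for illustration only.
class BookBinding {
    // Java class corresponding to: <book isbn="..."><title>...</title></book>
    static final class Book {
        final String isbn;   // bound to the isbn attribute
        final String title;  // bound to the <title> child element

        Book(String isbn, String title) {
            this.isbn = isbn;
            this.title = title;
        }
    }

    // Turns an XML document into a Book object.
    static Book fromXml(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            String isbn = root.getAttribute("isbn");
            String title = root.getElementsByTagName("title").item(0).getTextContent();
            return new Book(isbn, title);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not bind XML to Book", e);
        }
    }

    public static void main(String[] args) {
        Book book = fromXml("<book isbn=\"123\"><title>XML Basics</title></book>");
        System.out.println(book.isbn + " " + book.title);
    }
}
```

After the conversion, the rest of the program works with book.isbn and book.title and never touches elements or attributes.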
There are many libraries that implement binding between XML and Java, e.g. JiBX, XMLBeans, ADB. The JCP created a standard called JAXB (Java Architecture for XML Binding). GlassFish provides a reference implementation. JaxMe is also an implementation of JAXB. However, JaxMe has not published a new version since 2006, so I don't know whether it is still actively developed.
However, because the correspondence between the two sides is sometimes not so natural, binding may increase the burden on programmers. Sometimes the generated mapping does not match what programmers expect, and human intervention is necessary. Programmers must then understand the details of the rules used during conversion. There may be so many rules that grasping them all is not a trivial task. Sometimes you don't need to understand them all, but figuring out which rule to customize is still not easy. As a result, some programmers prefer to manipulate XML using DOM/SAX instead of XML-Java binding.
Several useful projects have been created to ease web service development in Java. Here (http://wiki.apache.org/ws/StackComparison) is a list of frameworks together with a stack comparison. One additional library not mentioned in that article is XINS (http://xins.sourceforge.net/). For web service clients, WSIF is a client framework that makes it easy to compose a web service client. It was donated by IBM to the Apache foundation. However, it does not seem to be actively developed now, though I am not sure.
Whenever variations of a Java technology exist, the JCP is willing to standardize it, and the web service area is no exception. First it proposed the JAX-RPC specification, which standardized web service development based on WSDL/SOAP. One part of the specification is the binding between Java classes/objects and WSDL, for which JAX-RPC contains detailed binding rules. Later, JAX-RPC evolved into version 2.0 and was renamed JAX-WS 2.0. The reason may be that WS is a buzzword in the industry and better captures the intent of the specification. In JAX-WS 2.0, WSDL-Java binding is delegated to JAXB. I remember that a JAX-WS implementation is included in JDK 6. The JAX-RPC/JAX-WS specifications describe interfaces with which programmers can easily build web service client and server programs. In addition, programmers can customize the handling of transmitted messages (for example SOAP) by plugging in handlers.