
WS-Enumeration


WS-Enumeration 1.2 Developers Guide



caGrid provides mechanisms for integrating the WS-Enumeration specification, both on its own and as part of the Bulk Data Transfer support. This integration is accomplished with a variety of tools for both client and server side use, as well as a specialized service extension to the Introduce toolkit.



Overview


Enumeration Service Context

The WS-Enumeration spec assumes that the service context which provides a method to begin an enumeration is the same context which implements the other enumeration methods (Pull, Renew, Status, etc.). In caGrid, the data to be enumerated is instead stored in a server side WSRF resource managed by a dedicated enumeration service context. This means the method to begin an enumeration must return both the spec-required Enumeration Context and an Endpoint Reference (EPR) indicating the service context which will be responsible for handling the enumeration, as well as the resource key for the data. In caGrid, this combined response is known as an Enumeration Response Container.

When WS-Enumeration is utilized from within a Bulk Data Transfer resource, the BDT resource creates the enumeration resource and returns the Enumeration Response Container to encapsulate both the EPR of the new service context and resource, and the enumeration context object.
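
To make the shape of this combined response concrete, the following is a minimal sketch in plain Java. It is not the caGrid class: the names EndpointRef and ResponseContainerSketch, the field names, and the use of a random UUID as the context token are all assumptions made for illustration.

```java
import java.util.Objects;
import java.util.UUID;

/** Hypothetical stand-in for an EPR: the enumeration service address plus a resource key. */
final class EndpointRef {
    final String serviceUrl;
    final String resourceKey;

    EndpointRef(String serviceUrl, String resourceKey) {
        this.serviceUrl = Objects.requireNonNull(serviceUrl);
        this.resourceKey = Objects.requireNonNull(resourceKey);
    }
}

/** Hypothetical model of the combined response returned by a "begin enumeration" method. */
final class ResponseContainerSketch {
    final EndpointRef enumerationService; // where Pull/Renew/Status calls should be sent
    final String enumerationContext;      // opaque token identifying the enumeration

    ResponseContainerSketch(EndpointRef enumerationService, String enumerationContext) {
        this.enumerationService = Objects.requireNonNull(enumerationService);
        this.enumerationContext = Objects.requireNonNull(enumerationContext);
    }

    /** Builds a container for a newly created enumeration resource (assumed token format). */
    static ResponseContainerSketch begin(String enumServiceUrl, String resourceKey) {
        EndpointRef epr = new EndpointRef(enumServiceUrl, resourceKey);
        return new ResponseContainerSketch(epr, UUID.randomUUID().toString());
    }
}
```

The point of the sketch is only that a client receives two things at once: the address and key of the resource that holds the data, and the context token to present on later enumeration calls.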

Extension Implementation

caGrid's support for WS-Enumeration is provided by a service extension to the Introduce toolkit. This extension adds the service context for WS-Enumeration operations, copies the relevant WSDL files and schemas, and sets the Globus-provided EnumProvider class as the implementation for the enumeration operations.

Using Enumeration in a Grid Service


To utilize WS-Enumeration in a grid service, two things are required. First, the service must be generated with the caGrid WS-Enumeration extension enabled. Next, one or more operations must be added to the grid service which return an Enumeration Response Container.

Such a "begin enumeration" method is required to begin the enumeration and hand off control of some data resource to the enumeration context and resource. The WS-Enumeration resource requires that an implementation of the org.globus.ws.enumeration.EnumIterator interface be supplied to it at creation time. This interface is the means through which a data resource is exposed to the server side enumeration implementation. To simplify the process of exposing data via this interface, the caGrid enumeration implementation includes several utilities.

Enumeration Implementations

caGrid WS-Enumeration supplies the factory class gov.nih.nci.cagrid.wsenum.utils.EnumIteratorFactory, which creates concrete instances of EnumIterator. The factory method createIterator() takes parameters to determine the type of implementation, the list of data objects to be enumerated (or a java.util.Iterator over the same), the XML QName of those objects, and a WSDD configuration stream to manage the serialization of the data objects. The Java enumeration type gov.nih.nci.cagrid.wsenum.utils.IterImplType defines the five implementations and gives a brief description of each:

  • GLOBUS_SIMPLE
    • The Globus-provided simple enum iterator.
  • GLOBUS_INDEXED_FILE
    • The Globus-provided indexed file enum iterator.
  • CAGRID_SIMPLE
    • A simple iterator which persists objects to disk.
    • This iterator only respects the maxElements iteration constraint.
  • CAGRID_THREADED_COMPLETE
    • This iterator uses threads to respect maxTime constraints as well as respecting maxCharacters. Elements overflowing either of these constraints, however, are lost, and wait states for thread completion are not optimized.
  • CAGRID_CONCURRENT_COMPLETE
    • This iterator uses the Java 5 java.util.concurrent package to fully support the WS-Enumeration specification for an EnumIterator implementation. All iteration constraints are respected, and elements which cause maxCharacters to be exceeded are queued for later retrieval.

For most purposes, the CAGRID_CONCURRENT_COMPLETE implementation should be used, as it provides full support for the server side WS-Enumeration spec. This is also the default implementation selected by the caGrid data services infrastructure. The other implementations are less complete, but may be useful in emulating the behavior of other enumeration-enabled systems.
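
The difference between losing an overflowing element and queuing it for later retrieval can be illustrated with a small, single-threaded model. This is a sketch under stated assumptions, not the caGrid implementation: the class ConstrainedPuller and its methods are hypothetical, elements are plain Strings measured by length, a negative maxCharacters is taken to mean "no limit", and the maxTime constraint is left out entirely.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical model of a pull that honors maxElements and maxCharacters. */
final class ConstrainedPuller {
    private final Iterator<String> source;
    // Holds an element that did not fit into a previous batch.
    private final Deque<String> overflow = new ArrayDeque<>();

    ConstrainedPuller(Iterator<String> source) {
        this.source = source;
    }

    /** Returns at most maxElements items whose combined length is within maxCharacters. */
    List<String> pull(int maxElements, int maxCharacters) {
        List<String> batch = new ArrayList<>();
        int chars = 0;
        while (batch.size() < maxElements) {
            String next;
            if (!overflow.isEmpty()) {
                next = overflow.poll();      // serve queued elements first
            } else if (source.hasNext()) {
                next = source.next();
            } else {
                break;                       // nothing left to return
            }
            if (maxCharacters >= 0 && chars + next.length() > maxCharacters) {
                overflow.addFirst(next);     // queue it instead of dropping it
                break;
            }
            batch.add(next);
            chars += next.length();
        }
        return batch;
    }

    /** True once both the source and the overflow queue are empty. */
    boolean isExhausted() {
        return overflow.isEmpty() && !source.hasNext();
    }
}
```

In this model, an element that is larger than maxCharacters on its own produces an empty batch while isExhausted() remains false; a later pull with a larger limit retrieves it.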

Creating an Enumeration Response

Once an EnumIterator instance has been created, it must be handed to the enumeration resource, and an Enumeration Response Container returned to the user. The caGrid-provided utility class gov.nih.nci.cagrid.wsenum.utils.EnumerateResponseFactory encapsulates this functionality in a simple static method. The method createEnumerationResponse() takes a single EnumIterator parameter and returns an Enumeration Response Container, which can immediately be returned to the client. This method encapsulates locating the EnumResourceHome, creating a new resource, setting its visibility, and deriving an endpoint reference (EPR) from the resulting resource key.

Example

Below is an example code snippet which uses the provided utilities to create an enumeration iterator over a list of Strings and return an appropriate enumeration response container:

// create a list of data values
List<String> values = new ArrayList<String>();
for (int i = 0; i < 10; i++) {
    values.add("String Value " + i);
}

// create an enum iterator
EnumIterator enumIter = EnumIteratorFactory.createIterator(
    IterImplType.CAGRID_CONCURRENT_COMPLETE,
    values,
    new QName("http://www.w3.org/2001/XMLSchema", "string"),
    null);

// formulate the response container and return
EnumerationResponseContainer response =
    EnumerateResponseFactory.createEnumerationResponse(enumIter);
return response;

Client API


The Globus-provided org.globus.ws.enumeration.ClientEnumIterator API provides a java.util.Iterator abstraction for retrieving enumeration data and supports automatic data deserialization. The caGrid WS-Enumeration implementation provides a simplified factory interface to create a new Client Enum Iterator instance from an Enumeration Response Container.

For a more detailed discussion of the WS-Enumeration client tools, please see the developer's wiki regarding the WS Core WS-Enumeration.

Utilities

The caGrid-provided class gov.nih.nci.cagrid.wsenum.utils.EnumerationResponseHelper contains static methods which take an Enumeration Response Container and return a client enum iterator instance. One form of the method createClientIterator takes only the response container, while an overload takes both the container and a java.io.InputStream for the client-config.wsdd file. The information contained in this file is used to deserialize results from the enumeration.

Examples

Basic caGrid WS-Enumeration

Given a service (here an enumeration-enabled Data Service) which supports enumeration, the following pattern may be used by a client to enumerate over data results.

TestEnumerationDataServiceClient client = new TestEnumerationDataServiceClient(args[1]);

EnumerationResponseContainer response = client.enumerationQuery(null);

ClientEnumIterator iter = EnumerationResponseHelper.createClientIterator(
    response, TestEnumerationDataServiceClient.class.getResourceAsStream("client-config.wsdd"));

// optional use of iteration constraints to fetch 10 elements w/o regard
// to number of characters or maximum time to wait
IterationConstraints cons = new IterationConstraints(10, -1, null);
iter.setIterationConstraints(cons);

while (iter.hasNext()) {
    SOAPElement elem = (SOAPElement) iter.next();
    if (elem != null) {
        // handle the data element
    }
    // optionally change constraints to fetch 5 elements, max of
    // 50000 characters, and a max wait time of 1 hr, 30 minutes
    cons = new IterationConstraints(5, 50000,
        new Duration(false, 0, 0, 0, 1, 30, 0));
    iter.setIterationConstraints(cons);
}

The use of IterationConstraints to express how data should be fetched from the service is optional. The default behavior is to retrieve one element at a time, with no regard for max characters or time constraints. Some server side implementations do not respect all iteration constraints. The client may change these constraints at any time during enumeration.

While iterating, it is possible that the call to 'hasNext' may return true, yet the call to 'next' return null, or even throw a NoSuchElementException. This is often the case when results are being dynamically populated on the server side at a rate slower than the client is capable of consuming them. This condition may also arise when the iteration constraints are such that no element could be returned which fulfills the constraints, yet the server is still holding more elements. Changing the constraints (allowing more time, increasing the maximum number of characters) may allow these elements to be retrieved.
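
A client loop that tolerates this situation might look like the following sketch. It is written against plain java.util.Iterator so that it stands alone; the class TolerantDrain, the method drain, and the maxEmptyPulls cutoff are hypothetical names chosen for illustration and are not part of the Globus or caGrid APIs.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical helper showing a defensive iteration pattern. */
final class TolerantDrain {
    /**
     * Collects elements from iter, treating a null result or a
     * NoSuchElementException as an "empty pull" rather than a failure.
     * Stops after maxEmptyPulls consecutive empty pulls.
     */
    static <T> List<T> drain(Iterator<T> iter, int maxEmptyPulls) {
        List<T> out = new ArrayList<>();
        int emptyPulls = 0;
        while (iter.hasNext() && emptyPulls < maxEmptyPulls) {
            T elem;
            try {
                elem = iter.next();
            } catch (NoSuchElementException e) {
                elem = null; // hasNext() was true, but nothing could be returned yet
            }
            if (elem == null) {
                emptyPulls++;
                // A real client might relax its iteration constraints
                // or wait briefly here before trying again.
                continue;
            }
            emptyPulls = 0;
            out.add(elem);
        }
        return out;
    }
}
```

Bounding the number of consecutive empty pulls keeps the loop from spinning indefinitely when the server never produces an element that satisfies the current constraints.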

Using BDT

Using the caGrid Bulk Data Transfer infrastructure slightly complicates the client code for accessing data via enumeration. It adds a level of indirection: the BDT resource is created first, and is then used to create an enumeration resource.

TestBDTDataServiceClient client = new TestBDTDataServiceClient(args[1]);

// query with BDT
BulkDataHandlerReference bdtRef = client.bdtQuery(null);

// use the reference to create a generic BDT client
BulkDataHandlerClient bdtClient = new BulkDataHandlerClient(bdtRef.getEndpointReference());

// start the enumeration on the server
EnumerationResponseContainer response = bdtClient.createEnumeration();

// rest of code the same as basic WS-Enumeration case