
Identifiers


Developers Guide



Identifier Metadata


caGrid provides a framework for globally identifying objects in the grid. The identifier is a globally unique name for the data-object that can be unambiguously used to refer to the data from different application contexts.

Metadata is information that can be attached to the identifier. It is any information that describes the object being identified. Typically, it would also be information that can be used to locate and/or retrieve the target data object.

When a deployment of identifiers is being planned, it must be decided what the metadata will be.

A typical example is the identification of data objects accessible by a caGrid data service. The framework's identifiers-client project has a built-in profile that enables the retrieval of such objects. This profile requires the existence of a CQL query string and an End Point Reference in the identifier metadata.

Metadata is represented in the framework in the form of key/value pairs, in which the key names the piece of relevant metadata and the value is that associated with the metadata key. For example:

Metadata Key    Metadata Value
EPR             <ns1:EndpointReference...>
CQL             <CQLQuery...>
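The key/value structure described above can be sketched with plain Java collections. This is only an illustration: MetadataSketch and its methods are hypothetical names, not framework classes, and the placeholder strings stand in for the serialized EPR and CQL documents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for identifier metadata; not a caGrid class.
class MetadataSketch {
    // Each key names one piece of metadata and maps to one or more values.
    private final Map<String, List<String>> entries = new LinkedHashMap<>();

    // Stores (or overwrites) the values held under a key.
    void put(String key, String... values) {
        entries.put(key, new ArrayList<>(Arrays.asList(values)));
    }

    // Returns the values for a key, or null if the key is absent.
    List<String> get(String key) {
        return entries.get(key);
    }

    boolean hasKey(String key) {
        return entries.containsKey(key);
    }

    public static void main(String[] args) {
        MetadataSketch metadata = new MetadataSketch();
        metadata.put("EPR", "<serialized endpoint reference>"); // placeholder value
        metadata.put("CQL", "<serialized CQL query>");          // placeholder value
        System.out.println(metadata.hasKey("EPR") && metadata.hasKey("CQL"));
    }
}
```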

Naming Authority Grid API


The identifiers framework provides a standard analytical grid service. This API enables the creation and maintenance of identifiers.

Exceptions

The following exceptions can be thrown by one or more methods described in the sections that follow.

  • NamingAuthorityConfigurationFault: The target naming authority is not running correctly. A configuration issue exists.
  • InvalidIdentifierFault: The provided identifier does not exist.
  • NamingAuthoritySecurityFault: The requesting user (grid identity) is not authorized to perform the requested operation.
  • InvalidIdentifierValuesFault: The provided metadata is invalid (e.g., a non-null key array with empty key strings).

Identifier Methods

You can use the following methods within the identifiers framework.

  • createIdentifier
  • resolveIdentifier
  • deleteKeys
  • createKeys
  • replaceKeyValues
  • getKeyNames
  • getKeyData

The following sections describe these methods.

createIdentifier

URI createIdentifier(IdentifierData);

This method is used to create an identifier. Input metadata (IdentifierData) is optional. Metadata can also be added to the identifier later using other available methods. The output is the newly created identifier URI.

Exceptions:

  • NamingAuthorityConfigurationFault
  • InvalidIdentifierFault
  • NamingAuthoritySecurityFault
  • InvalidIdentifierValuesFault

Example:

import namingauthority.IdentifierData;
import namingauthority.KeyData;
import namingauthority.KeyNameData;

IdentifiersNAServiceClient client = new IdentifiersNAServiceClient("http://identifiers-na.nci.nih.gov/wsrf/services/cagrid/IdentifiersNAService");

String[] keys = new String[] {"SYMBOLS", "QUOTES" };
String[][] values = new String[][] {
			{"MSFT", "AAPL"},
			{"http://finance.yahoo.com/q?s=MSFT","http://finance.yahoo.com/q?s=AAPL"}
			};

KeyNameData[] kvs = new KeyNameData[ keys.length ];
for(int i=0; i< keys.length; i++) {
	KeyData kd = new KeyData();
	kd.setValue(values[i]);
	kvs[i] = new KeyNameData(kd, keys[i]);
}

IdentifierData id = new IdentifierData(kvs);
org.apache.axis.types.URI identifier = client.createIdentifier(id);

System.out.println("Identifier: " + identifier.toString());

resolveIdentifier

IdentifierData resolveIdentifier(URI);

This method accepts an identifier and returns the associated metadata.

Exceptions:

  • NamingAuthorityConfigurationFault
  • InvalidIdentifierFault
  • NamingAuthoritySecurityFault

Example:

import namingauthority.IdentifierData;
import namingauthority.KeyData;
import namingauthority.KeyNameData;

URI identifier = new org.apache.axis.types.URI(
		"http://identifiers-pa.nci.nih.gov/production/847a79cf-0ce6-482a-8a5a-ed95ecb33947");

IdentifiersNAServiceClient client = new IdentifiersNAServiceClient(
		"http://identifiers-na.nci.nih.gov/wsrf/services/cagrid/IdentifiersNAService");
		
IdentifierData metadata = client.resolveIdentifier(identifier);

for( KeyNameData kv : metadata.getKeyNameData()) {
	System.out.println("\n**************\nKEY: " + kv.getKeyName());
	KeyData kd = kv.getKeyData();
	if (kd != null && kd.getValue() != null){
		for(String value : kd.getValue()){
			System.out.println("\t\tVALUE: " + value);
		}
	}
}

deleteKeys

void deleteKeys(URI identifier, String[] keyNames);

This method accepts an identifier and a list of metadata key names. It deletes the specified key names from the identifier metadata.

Exceptions:

  • NamingAuthorityConfigurationFault
  • InvalidIdentifierFault
  • NamingAuthoritySecurityFault
  • InvalidIdentifierValuesFault
    • No keys were provided
    • One or more of the specified keys does not exist

Example:

URI identifier = new org.apache.axis.types.URI(
		"http://identifiers-pa.nci.nih.gov/production/847a79cf-0ce6-482a-8a5a-ed95ecb33947");

String[] keyList = new String[] { "SYMBOLS", "QUOTES" };

IdentifiersNAServiceClient client = new IdentifiersNAServiceClient(
		"http://identifiers-na.nci.nih.gov/wsrf/services/cagrid/IdentifiersNAService");

client.deleteKeys(identifier, keyList);

createKeys

void createKeys(URI, IdentifierData);

This method adds new metadata keys (and their associated values) to an existing identifier. It accepts an identifier URI and an IdentifierData structure containing the new keys and data to be added to that identifier.

Exceptions:

  • NamingAuthorityConfigurationFault
  • InvalidIdentifierFault
  • NamingAuthoritySecurityFault
  • InvalidIdentifierValuesFault
    • No keys were provided.
    • A key with the provided name already exists

Example:

import namingauthority.IdentifierData;
import namingauthority.KeyData;
import namingauthority.KeyNameData;

URI identifier = new URI(
		"http://identifiers-pa.osu-citih.org/production/7e82e853-c972-4d63-a891-cbe0260316c2");

String keyName = "UFO_SEARCH_URLS";

String[] keyValues = new String[]{
		"http://www.google.com/search?q=UFO",
		"http://www.bing.com/search?q=UFO"
		};
		
KeyData kd = new KeyData();
kd.setValue(keyValues);
KeyNameData knd = new KeyNameData(kd, keyName);

IdentifiersNAServiceClient client = new IdentifiersNAServiceClient(
		"http://identifiers-na.nci.nih.gov/wsrf/services/cagrid/IdentifiersNAService");

client.createKeys(identifier,new IdentifierData(new KeyNameData[]{ knd }));

replaceKeyValues

void replaceKeyValues(URI, IdentifierValues);

This method replaces the values currently assigned to the specified keys with a new set of values. The previous values are discarded. It accepts the identifier URI and the new data.

Exceptions:

  • NamingAuthorityConfigurationFault
  • InvalidIdentifierFault
  • NamingAuthoritySecurityFault
  • InvalidIdentifierValuesFault
    • No keys were provided.
    • One or more of the specified keys does not exist.

Example:

import namingauthority.IdentifierValues;
import namingauthority.KeyNameValues;
import namingauthority.KeyValues;

URI identifier = new URI(
	"http://identifiers-pa.osu-citih.org/production/7e82e853-c972-4d63-a891-cbe0260316c2");

String keyName = "UFO_SEARCH_URLS";

String[] keyValues = new String[] {
	"http://www.google.com/search?q=UFO",
	"http://search.yahoo.com/search?p=UFO"
	};
	
KeyNameValues[] newKeyValues = new KeyNameValues[1];
newKeyValues[0] = new KeyNameValues();
newKeyValues[0].setKeyName(keyName);
newKeyValues[0].setKeyValues(new KeyValues(keyValues));

IdentifiersNAServiceClient client = new IdentifiersNAServiceClient(
	"http://identifiers-na.nci.nih.gov/wsrf/services/cagrid/IdentifiersNAService");
	
client.replaceKeyValues(identifier, new IdentifierValues(newKeyValues));

getKeyNames

String[] getKeyNames(URI);

This method is used to retrieve the metadata key names associated with the provided input identifier. The values are not returned.

Exceptions:

  • NamingAuthorityConfigurationFault
  • InvalidIdentifierFault
  • NamingAuthoritySecurityFault

Example:

URI identifier = new URI(
	"http://identifiers-pa.osu-citih.org/production/7e82e853-c972-4d63-a891-cbe0260316c2");

IdentifiersNAServiceClient client = new IdentifiersNAServiceClient(
	"http://identifiers-na.nci.nih.gov/wsrf/services/cagrid/IdentifiersNAService");
	
String[] keyNames = client.getKeyNames(identifier);

System.out.println("Identifier " +  identifier.toString()
	+ " has the following metadata keys:");
	
for( String keyName : keyNames ) {
	System.out.println(keyName);
}

getKeyData

KeyNameData getKeyData(URI identifier, String keyName);

This method takes an identifier and a metadata key name and returns the associated metadata value.

Exceptions:

  • NamingAuthorityConfigurationFault
  • InvalidIdentifierFault
  • NamingAuthoritySecurityFault
  • InvalidIdentifierValuesFault
    • The key name specified does not exist.

Example:

import namingauthority.KeyNameData;

URI identifier = new URI(
	"http://identifiers-pa.osu-citih.org/production/7e82e853-c972-4d63-a891-cbe0260316c2");

IdentifiersNAServiceClient client = new IdentifiersNAServiceClient(
	"http://identifiers-na.nci.nih.gov/wsrf/services/cagrid/IdentifiersNAService");

KeyNameData knd = client.getKeyData(identifier, "UFO_SEARCH_URLS");

System.out.println(knd.getKeyName());
for( String value : knd.getKeyData().getValue() ){
	System.out.println(value);
}

Naming Authority Web Interface


The identifiers framework deploys a web application, whose main purpose is to enable resolution of identifiers via HTTP. When an identifier URI is followed (e.g., entered into a web browser), the web application resolves the identifier and returns the corresponding metadata.

Home Page and HTML Responses

When the web application detects a web browser, or any client that includes "text/html" in the ACCEPT HTTP request header, the response is prepared in HTML format.

For example, entering an identifier such as "http://identifiers-pa.osu-citih.org/production/91eb948f-9f16-40e5-8297-3752621d7931" in a web browser produces the following output:

[Figure: Firefox browser window showing a page with a URL to the same identifier noted in the text immediately before the image]

The naming authority also provides a simple home page that can be used to enter a local identifier for resolution. Simply remove the trailing NamingAuthorityService from the naming authority web application end point (e.g., "https://identifiers-na.osu-citih.org/namingauthority"):

[Figure: A caGrid identifiers Naming Authority window showing a local identifier of 91eb948f-9f16-40e5-8297-3752621d7931 and an arrow highlighting the Resolve Identifier button.]

Enter a local identifier and click Resolve Identifier.

XML Responses

When an HTTP client sets the Accept HTTP request header to "application/xml", the response is returned in XML format.
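A client might prepare such a request as sketched below, using only the JDK's HttpURLConnection. The class and method names are hypothetical. The sketch only builds the request; nothing is sent until the caller reads from the connection.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper; not part of the identifiers framework.
class XmlResolveRequestSketch {
    // Prepares (but does not send) a GET request that asks for the XML representation.
    static HttpURLConnection prepare(String identifierUri) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(identifierUri).toURL().openConnection();
            conn.setRequestMethod("GET");
            conn.setRequestProperty("Accept", "application/xml");
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection conn = prepare(
                "http://identifiers-pa.osu-citih.org/production/91eb948f-9f16-40e5-8297-3752621d7931");
        // Reading conn.getInputStream() would perform the actual request.
        System.out.println(conn.getRequestProperty("Accept"));
    }
}
```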

For example, resolving an identifier such as "http://identifiers-pa.osu-citih.org/production/91eb948f-9f16-40e5-8297-3752621d7931" would produce a response like:

<?xml version="1.0" encoding="UTF-8"?>
<na:IdentifierData xmlns:na="http://na.cagrid.org/1.0/NamingAuthority"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://na.cagrid.org/1.0/NamingAuthority
		https://identifiers-na.osu-citih.org/namingauthority/org.cagrid.identifiers.namingauthority.xsd">
	<na:KeyNameData>
		<na:KeyName>URLS</na:KeyName>
		<na:KeyData>
			<na:value>https://www.google.com</na:value>
		</na:KeyData>
	</na:KeyNameData>
</na:IdentifierData>
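As a rough illustration of consuming this response without the identifiers client, the sketch below reads the key names and values with the JDK's DOM parser. It assumes the element names and namespace shown in the sample above. The class name is hypothetical, and the embedded sample omits the XML declaration and schemaLocation for brevity.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; the identifiers client ships its own de-serialization.
class IdentifierDataXmlSketch {
    static final String NS = "http://na.cagrid.org/1.0/NamingAuthority";

    // Maps each KeyName in the response to the list of values under its KeyData.
    static Map<String, List<String>> parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the *NS lookups below
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Map<String, List<String>> result = new LinkedHashMap<>();
            NodeList entries = doc.getElementsByTagNameNS(NS, "KeyNameData");
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                String name = entry.getElementsByTagNameNS(NS, "KeyName")
                        .item(0).getTextContent().trim();
                List<String> values = new ArrayList<>();
                NodeList valueNodes = entry.getElementsByTagNameNS(NS, "value");
                for (int j = 0; j < valueNodes.getLength(); j++) {
                    values.add(valueNodes.item(j).getTextContent().trim());
                }
                result.put(name, values);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse naming authority response", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<na:IdentifierData xmlns:na=\"" + NS + "\">"
                + "<na:KeyNameData><na:KeyName>URLS</na:KeyName>"
                + "<na:KeyData><na:value>https://www.google.com</na:value></na:KeyData>"
                + "</na:KeyNameData></na:IdentifierData>";
        System.out.println(parse(xml)); // prints {URLS=[https://www.google.com]}
    }
}
```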

The identifiers client project, which will be covered later, leverages this functionality and provides utility methods that can be used to resolve identifiers via HTTP and convert the response to java objects.

The XML response conforms to the schema org.cagrid.identifiers.namingauthority.xsd, which is available from the naming authority project, and can also be downloaded from the naming authority web application (e.g., "https://identifiers-na.nci.nih.gov/namingauthority/org.cagrid.identifiers.namingauthority.xsd").

The naming authority web application supports an alternative way for clients to resolve identifiers to XML by adding "?xml" to the identifier URI. This is useful when the ACCEPT HTTP header cannot be set for some reason (e.g., a human using a web browser wishing to see XML, instead of HTML).
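The format selection rules described so far can be summarized in a small sketch. It is not the naming authority's actual code: the class, enum, and method names are hypothetical, and returning UNSPECIFIED when neither rule applies is an assumption, since the behavior for that case is not described here.

```java
import java.util.Locale;

// Hypothetical summary of the documented rules; not the web application's code.
class ResponseFormatSketch {
    enum Format { HTML, XML, UNSPECIFIED }

    // queryString is the part after '?', e.g. "xml"; either argument may be null.
    static Format choose(String acceptHeader, String queryString) {
        // "?xml" lets a browser (which sends text/html) still obtain XML.
        if ("xml".equals(queryString)) {
            return Format.XML;
        }
        if (acceptHeader != null) {
            String accept = acceptHeader.toLowerCase(Locale.ROOT);
            // Any client that includes text/html gets HTML.
            if (accept.contains("text/html")) {
                return Format.HTML;
            }
            if (accept.contains("application/xml")) {
                return Format.XML;
            }
        }
        return Format.UNSPECIFIED; // assumption: not covered by the text
    }

    public static void main(String[] args) {
        System.out.println(choose("text/html,application/xml;q=0.9", null)); // prints HTML
        System.out.println(choose("text/html", "xml"));                      // prints XML
    }
}
```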

For example, entering an identifier such as "http://identifiers-pa.osu-citih.org/production/91eb948f-9f16-40e5-8297-3752621d7931?xml" in a web browser displays the XML response instead of the HTML page.

Retrieving Naming Authority Configuration

The naming authority web application makes available some of its configuration settings via HTTP. Add "?config" to any identifier URI or to the web application endpoint. For example:

  • "http://identifiers-pa.nci.nih.gov/production/7e82e853-c972-4d63-a891-cbe0260316c2?config"
  • "https://identifiers-na.nci.nih.gov/namingauthority/NamingAuthorityService/?config"

Alternatively, the See Naming Authority Configuration button can be used on the naming authority home page.

[Figure: The caGrid Identifiers Naming Authority window. Instructions on page say: "This site hosts a naming authority for identifiers resolution. Please enter a local identifier and click the Resolve Identifier button." The See Naming Authority Configuration button is below the box for entering the local identifier.]

And the response:

[Figure: A Firefox window showing URLs for the following three items: Naming Authority ...]

As before, client programs can request XML by setting the ACCEPT request header to application/xml.

<?xml version="1.0" encoding="UTF-8"?>
<na:Configuration xmlns:na="http://na.cagrid.org/1.0/NamingAuthority"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://na.cagrid.org/1.0/NamingAuthority
		https://identifiers-na.nci.nih.gov/namingauthority/org.cagrid.identifiers.namingauthority.xsd">
	<na:naGridSvcURI>https://identifiers-na.nci.nih.gov:8443/wsrf/services/cagrid/IdentifiersNAService</na:naGridSvcURI>
	<na:naPrefixURI>http://identifiers-pa.nci.nih.gov/production</na:naPrefixURI>
	<na:naBaseURI>https://identifiers-na.nci.nih.gov/namingauthority/NamingAuthorityService/</na:naBaseURI>
</na:Configuration>

The identifiers client project, which will also be covered later, leverages this functionality to discover the grid service endpoint for a given identifier URI.
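The discovery step can be approximated as follows: build the "?config" URL for an identifier, then read the grid service URI from the XML response. This is a sketch with hypothetical class and method names. It assumes the element is named naGridSvcURI in the namespace shown above, and it leaves fetching the XML to the caller.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; the identifiers client performs this discovery itself.
class NamingAuthorityConfigSketch {
    static final String NS = "http://na.cagrid.org/1.0/NamingAuthority";

    // Appends the "?config" suffix to an identifier URI or web application endpoint.
    static String configUrl(String uri) {
        return uri + "?config";
    }

    // Extracts the grid service endpoint from a Configuration XML response.
    static String gridServiceUri(String configXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(configXml)));
            NodeList nodes = doc.getElementsByTagNameNS(NS, "naGridSvcURI");
            if (nodes.getLength() == 0) {
                throw new IllegalArgumentException("No naGridSvcURI element in response");
            }
            return nodes.item(0).getTextContent().trim();
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration response", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<na:Configuration xmlns:na=\"" + NS + "\">"
                + "<na:naGridSvcURI>https://identifiers-na.nci.nih.gov:8443/wsrf/services/cagrid/IdentifiersNAService</na:naGridSvcURI>"
                + "<na:naPrefixURI>http://identifiers-pa.nci.nih.gov/production</na:naPrefixURI>"
                + "</na:Configuration>";
        System.out.println(configUrl("http://identifiers-pa.nci.nih.gov/production/7e82e853-c972-4d63-a891-cbe0260316c2"));
        System.out.println(gridServiceUri(xml));
    }
}
```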

The XML response conforms to the schema org.cagrid.identifiers.namingauthority.xsd, which is available from the naming authority project and can also be downloaded from the naming authority web application (e.g., "https://identifiers-na.nci.nih.gov/namingauthority/org.cagrid.identifiers.namingauthority.xsd").

Client


The identifiers framework includes an identifiers-client project that serves two purposes:

  • Aids users in resolving identifiers using both the HTTP and grid interfaces.
  • Provides an extensible framework to aid in retrieving data objects using the identifier metadata resulting from the resolution process. This is accomplished by plugging in retrieval profiles into the framework.

Resolution

The Resolver class provides utility methods to resolve identifiers, hiding some of the complexity present in the underlying framework. For example, when resolution via HTTP is used, Resolver automatically de-serializes the XML returned by the naming authority into naming authority domain Java objects that can then be used by the API client.

resolveHttp

IdentifierData resolveHttp( URI identifier );

This method resolves the input identifier and returns the corresponding metadata. Since identifiers are live URLs, they are simply followed (HTTP GET) to retrieve their metadata in XML format from the naming authority. The XML is then de-serialized into naming authority domain objects (IdentifierData).

Exceptions:

  • HttpException
    • Unexpected HTTP error was encountered
  • NamingAuthorityConfigurationException
    • Naming authority reports a configuration error.
    • Failed to de-serialize naming authority response.
  • NamingAuthoritySecurityException
    • The identifier can not be resolved by anonymous users.
  • InvalidIdentifierException
    • The identifier does not exist.

Example:

import org.cagrid.identifiers.namingauthority.domain.IdentifierData;
import org.cagrid.identifiers.namingauthority.domain.KeyData;
import org.cagrid.identifiers.resolver.Resolver;

IdentifierData metadata = new Resolver().resolveHttp(identifier);

for(String key : metadata.getKeys()) {
	KeyData data = metadata.getValues(key);
	
	System.out.println(key);
	System.out.println(data.getPolicyIdentifier());
	for(String value : data.getValues()){
		System.out.println(value);
	}
}

Since initializing a Resolver object is expensive, it is recommended to create the object only once when multiple identifiers are going to be resolved.

resolveHttp (GSI)

IdentifierData resolveHttp( URI identifier, GlobusCredential credentials );

This method resolves the input identifier using the provided Globus credentials and returns the corresponding metadata. This is useful when the naming authority is running a secure deployment where identifiers may not be viewed by everyone.

The implementation uses the Globus Grid Security Infrastructure (GSI) API (GSIHttpURLConnection) to target the naming authority directly. The naming authority endpoint is discovered by first retrieving the naming authority configuration using the input identifier URI, as explained in the Retrieving Naming Authority Configuration section. The reason for this is that the GSI API does not support HTTP redirects, which are used by our deployment with a prefix authority (PURL).

As before, the response XML is de-serialized into naming authority domain objects (IdentifierData).

Exceptions:

  • HttpException
    • Unexpected HTTP error was encountered.
  • NamingAuthorityConfigurationException
    • Naming authority reports a configuration error.
    • Failed to de-serialize naming authority response.
  • NamingAuthoritySecurityException
    • The identifier can not be resolved by anonymous users.
  • InvalidIdentifierException
    • The identifier does not exist.

Example:

import org.cagrid.identifiers.namingauthority.domain.IdentifierData;
import org.cagrid.identifiers.namingauthority.domain.KeyData;
import org.cagrid.identifiers.resolver.Resolver;

IdentifierData metadata = new Resolver().resolveHttp(identifier, credential);

for(String key : metadata.getKeys()) {
	KeyData data = metadata.getValues(key);
	
	System.out.println(key);
	System.out.println(data.getPolicyIdentifier());
	for(String value : data.getValues()){
		System.out.println(value);
	}
}

resolveGrid

IdentifierData resolveGrid( URI identifier );

This method resolves the input identifier using the naming authority grid service interface (resolveIdentifier) and returns the corresponding metadata. The grid service endpoint is discovered by first retrieving the naming authority configuration using the input identifier URI, as explained in the "Retrieving Naming Authority Configuration" section.

Exceptions:

  • HttpException
    • Unexpected HTTP error was encountered.
  • NamingAuthorityConfigurationException
    • Naming authority reports a configuration error.
  • NamingAuthoritySecurityException
    • The identifier can not be resolved by anonymous users.
  • InvalidIdentifierException
    • The identifier does not exist.

Example:

import org.cagrid.identifiers.namingauthority.domain.IdentifierData;
import org.cagrid.identifiers.namingauthority.domain.KeyData;
import org.cagrid.identifiers.resolver.Resolver;

IdentifierData metadata = new Resolver().resolveGrid(identifier);

for(String key : metadata.getKeys()){
	KeyData data = metadata.getValues(key);
	
	System.out.println(key);
	System.out.println(data.getPolicyIdentifier());
	for(String value : data.getValues()){
		System.out.println(value);
	}
}

Since initializing a Resolver object is expensive, it is recommended to create the object only once when multiple identifiers are going to be resolved.

Retrieval

The Retrieval process involves retrieving the object from the data owner's space using the identifier metadata obtained from the resolution process.

The framework retrieval process is driven by a retrieval profile, which defines two things:

  1. The metadata data types required to exist in the identifiers table maintained by the naming authority (note: without these, the profile can't be successfully executed)
  2. A formal definition of how to use the metadata to retrieve the data objects
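The first part of a profile amounts to a required-keys check. The sketch below shows one way such a check could look for a profile that needs the EPR and CQL keys. The class and method names are hypothetical, and metadata is modeled as a plain map rather than the framework's own types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical check; the framework's own validation may differ.
class ProfileCheckSketch {
    // Returns the required keys that are absent (or have no values) in the metadata.
    static List<String> missingKeys(List<String> requiredKeys,
                                    Map<String, List<String>> metadata) {
        List<String> missing = new ArrayList<>();
        for (String key : requiredKeys) {
            List<String> values = metadata.get(key);
            if (values == null || values.isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, List<String>> metadata = new HashMap<>();
        metadata.put("EPR", Arrays.asList("<serialized endpoint reference>")); // placeholder
        // The CQL key is absent, so this profile could not be executed.
        System.out.println(missingKeys(Arrays.asList("EPR", "CQL"), metadata)); // prints [CQL]
    }
}
```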

The Retriever interface declares a single operation: retrieve.

Retrieve

Object retrieve (IdentifierValues ivs)

The purpose of this method is to retrieve a data object from the data owner's space. The framework has a built-in retriever (CQLRetriever) that allows a client to query a grid data service and get the CQLQueryResults.

The following example shows the implementation where the metadata (EPR and CQL) is de-serialized and the data service is queried to get the results. The query method at the end takes in a CQL Query, EPR, portName and queries the service running at the specified EPR to get the CQL Query Results.

public Object retrieve(IdentifierValues ivs) throws Exception{
	validateTypes(ivs);
	String[] eprStrs = ivs.getValues("EPR");
	String[] cqlStrs = ivs.getValues("CQL");
	
	//Deserialize EPR
	
	StringBufferInputStream fis = new StringBufferInputStream(eprStrs[0]);
	EndpointReferenceType endpoint = (EndpointReferenceType)ObjectDeserializer.deserialize(new InputSource(fis), EndpointReferenceType.class);
	
	//Deserialize query
	gov.nih.nci.cagrid.cqlquery.CQLQuery query = (gov.nih.nci.cagrid.cqlquery.CQLQuery) gov.nih.nci.cagrid.common.Utils.deserializeObject( new java.io.StringReader(cqlStrs[0]),gov.nih.nci.cagrid.cqlquery.CQLQuery.class);
	
	String endpointUrl = endpoint.getAddress().toString();
	String portName = endpoint.getPortType().getLocalPart();
	return query(query,endpointUrl,portName);
}

The RetrieverFactory Interface

The RetrieverFactory interface declares the getRetriever method, which takes either a name that uniquely identifies the retriever class, or the identifier's metadata. The method signatures are as shown:

Retriever getRetriever(String retrieverName)
Retriever getRetriever(IdentifierValues ivs)

Identifier adopters are free to provide different factory implementations with different retriever selection criteria. All factories must implement the RetrieverFactory interface.

The framework provides one such implementation: DefaultRetrieverFactory. It maintains a map of RetrieverImpl objects keyed by retriever name.

import java.util.Map;
import org.cagrid.identifiers.code.IdentifierValues;
import org.cagrid.identifiers.retriever.Retriever;
import org.cagrid.identifiers.retriever.RetrieverFactory;

public class DefaultRetrieverFactory implements RetrieverFactory{
	
	private Map<String,Retriever> retrievers;
	
	public DefaultRetrieverFactory(Map<String,Retriever> retrievers){ 
		this.retrievers = retrievers;
	}
	
	public Retriever getRetriever(IdentifierValues ivs) throws Exception{
		throw new Exception("Not implemented yet");
	}
	
	public Retriever getRetriever(String name) throws Exception{
		Retriever retriever = retrievers.get(name);
		if(retriever == null) {
			throw new Exception("No retriever defined for ["+name+"]");
		}
		return retriever;
	}
}

The getRetriever(IdentifierValues) variant chooses the retriever instance whose required keys (RetrieverImpl.getRequiredKeys()) all exist in the identifier's metadata (IdentifierValues). If multiple retrievers meet this criterion, the one with the largest number of required keys is chosen. If multiple retrievers have the same number of keys, an exception is thrown.
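The selection rule can be sketched independently of the framework types. The example below is not the DefaultRetrieverFactory code: retrievers are represented only by name and required-key set, all names are hypothetical, and it assumes that a tie matters only among the candidates with the largest key count and that having no matching retriever is also an error.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the selection rule; not framework code.
class RetrieverSelectionSketch {
    // Picks the retriever name whose required keys are all present, preferring more keys.
    static String select(Map<String, Set<String>> requiredKeysByRetriever,
                         Set<String> metadataKeys) {
        String best = null;
        int bestSize = -1;
        boolean tie = false;
        for (Map.Entry<String, Set<String>> entry : requiredKeysByRetriever.entrySet()) {
            Set<String> required = entry.getValue();
            if (!metadataKeys.containsAll(required)) {
                continue; // a required key is missing, so this retriever cannot run
            }
            if (required.size() > bestSize) {
                best = entry.getKey();
                bestSize = required.size();
                tie = false; // a strictly larger candidate clears earlier ties
            } else if (required.size() == bestSize) {
                tie = true;
            }
        }
        if (best == null) {
            throw new IllegalStateException("No retriever matches the identifier metadata");
        }
        if (tie) {
            throw new IllegalStateException("Multiple retrievers match with " + bestSize + " keys");
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> retrievers = new LinkedHashMap<>();
        retrievers.put("EprOnlyRetriever", new HashSet<>(Arrays.asList("EPR")));
        retrievers.put("CQLRetriever", new HashSet<>(Arrays.asList("EPR", "CQL")));
        Set<String> metadataKeys = new HashSet<>(Arrays.asList("EPR", "CQL"));
        System.out.println(select(retrievers, metadataKeys)); // prints CQLRetriever
    }
}
```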

The RetrieverService Class

This class loads a RetrieverFactory from spring framework configuration file(s). The default constructor loads the default retriever factory name and configuration files. The specialized constructor can be used to specify a different factory name and/or configuration files.

Using Identifiers-Client to Resolve and Retrieve a Data Object

//Resolution
IdentifierValues ivs = new Resolver().resolveHttp(identifierStr);

//DataRetrieval
RetrieverFactory factory = new RetrieverService().getFactory();
Retriever retriever = factory.getRetriever("CQLRetriever");
CQLQueryResults results = (CQLQueryResults) retriever.retrieve(ivs);

The first step is to resolve the identifier, that is, to retrieve the identifier values (metadata).

The second step is to obtain a Retriever object from the RetrieverFactory. The RetrieverService class loads a factory using the default Spring configuration file identifiers-client-context.xml. Other Spring files can be used by calling the specialized RetrieverService constructor.

Last edited by
Sarah Honacki (772 days ago) , ...