Distributed Common Query Language
Overview of DCQL
DCQL is an XML-based language used to express federated queries. It is an extension to CQL, the language used to query individual data services.
Both CQL and DCQL describe queries in a declarative, non-procedural way that is based on the data services' UML-based domain model. That is, a query can be seen as a request for instances of a class in a model. The query identifies this class as its target class.
Queries can request all instances of the target class by not placing any restrictions on which instances should be returned. However, it is common for queries to include restrictions on which instances of the target class should be returned. These restrictions can take two forms.
First, the returned instances of the target class can be restricted to those that have an association with an instance of another (or possibly the same) class. The instances of those classes can be further restricted to those that have other associations. There is no limit to how deeply the requirements for associations can be nested.
The other way that a DCQL or CQL query can restrict which instances are returned is by requiring their attributes to have certain values. For example, a query might specify that instances of a target class must have a version attribute that is equal to 3. The query might further specify that the returned instances of the target class must be associated with an instance of a class named Subscription that has an expiration date earlier than January 31, 2011.
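The version and Subscription example above could be written roughly as follows. This is a minimal sketch rather than a query from a real service: the class names (org.example.Product, org.example.Subscription), the attribute name expirationDate, the date format, and the service URL are all assumptions made for illustration. The element and attribute names follow the schema listed at the end of this page.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DCQLQuery xmlns="http://caGrid.caBIG/1.0/gov.nih.nci.cagrid.dcql">
  <!-- Target class: hypothetical org.example.Product -->
  <TargetObject name="org.example.Product">
    <!-- Both restrictions must hold, so they are wrapped in an AND Group -->
    <Group logicRelation="AND">
      <!-- Attribute restriction: version equals 3 (EQUAL_TO is the default predicate) -->
      <Attribute name="version" value="3"/>
      <!-- Association restriction: an associated Subscription expiring before 2011-01-31 -->
      <Association name="org.example.Subscription">
        <!-- The date format is an assumption; the accepted format depends on the data service -->
        <Attribute name="expirationDate" value="2011-01-31" predicate="LESS_THAN"/>
      </Association>
    </Group>
  </TargetObject>
  <!-- Hypothetical data service URL -->
  <targetServiceURL>https://data.example.org/wsrf/services/cagrid/ProductService</targetServiceURL>
</DCQLQuery>
```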
DCQL extends CQL in two ways. First, the service that provides the results for a CQL query is not specified in the query and is assumed to be the service that the query is sent to, whereas a DCQL query includes a list of target services to be queried.
Second, the associations specified in a CQL query are all within a single data service, whereas DCQL allows virtual associations (joins) to be specified between data services.
Services that accept DCQL (such as the FQP service) generally don't expose any local data.
Details
As DCQL is modeled as an extension to CQL, it mirrors the general structure wherein a target object Class is identified, and instances are specified by restrictions over its attributes and associated objects. The specifics of each component of a DCQL query are described below.
DCQLQuery
The root of a DCQL Query is the DCQLQuery object. It contains a collection of data service URLs which identify the services that should be queried for instances of data matching the Objects described by the TargetObject. This mechanism provides the basic support for simple aggregation queries, as each identified service will be queried for the identified data, and the results will be aggregated and returned to the client.
DCQL is a recursively defined language: queries are ultimately descriptions of instance objects, and relationships between associated objects are described using the same mechanism. That is, just as the target object is described by restrictions on its attributes and associations to other objects, the associated objects are defined using the same syntax. These restrictions are defined below by the Object type, as the DCQLQuery's TargetObject is just an instance of this type.
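As a sketch of the aggregation case, the query below asks two services for every instance of one class and places no restrictions on the target. The class name and both service URLs are hypothetical. The element order (TargetObject first, then one or more targetServiceURL elements) follows the schema listed at the end of this page.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DCQLQuery xmlns="http://caGrid.caBIG/1.0/gov.nih.nci.cagrid.dcql">
  <!-- No child restrictions: all instances of the (hypothetical) class are requested -->
  <TargetObject name="org.example.Specimen"/>
  <!-- Each listed service is queried and the results are aggregated -->
  <targetServiceURL>https://site-a.example.org/wsrf/services/cagrid/SpecimenService</targetServiceURL>
  <targetServiceURL>https://site-b.example.org/wsrf/services/cagrid/SpecimenService</targetServiceURL>
</DCQLQuery>
```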
Object
The Object type is the core component of CQL and therefore of DCQL. First, it identifies the Class of the targeted instances. That is, all data instances matching the description will be of the class type identified by the "name" of the Object. When used as the TargetObject of a DCQL query, it describes the return type of the results of the query. In addition to specifying the class of objects, the Object type provides the basic recursive structure of DCQL and CQL queries. That is, an object is described through a restriction over one of its attributes with the Attribute type, an association to another object with the Association type, a relationship to another object on a remote data service with the ForeignAssociation type, or a logical grouping of two or more of those types using the Group type. For example, a Group of several restrictions over attributes and associations can be specified. Each of these types is described below.
Attribute
The Attribute type is the simplest of the restriction types, and the terminator of query recursion (as it allows no children). DCQL simply makes use of the CQL Attribute type; its syntax and semantics are the same. The type allows basic restriction of a single attribute of an object. The restriction is expressed as a name, a value, and a predicate. The name component defines the name of the attribute. The value component defines the expected value of the attribute, with respect to the predicate. The predicate is the operator that should be used to evaluate whether or not a given attribute instance matches the specified value. For example, an Attribute with name="size", value="5", and predicate="LESS_THAN" would restrict the results to data instances that have an attribute called "size" with a value of less than 5. The predicate values are generally self-descriptive: "EQUAL_TO", "NOT_EQUAL_TO", "LIKE", "LESS_THAN", "LESS_THAN_EQUAL_TO", "GREATER_THAN", and "GREATER_THAN_EQUAL_TO". Two additional predicates, "IS_NULL" and "IS_NOT_NULL", check only for the absence or presence, respectively, of an attribute value, and do not restrict the value at all. Therefore, any value attribute will be ignored when using these predicates. "EQUAL_TO" is the default predicate, so an Attribute need not explicitly specify predicate="EQUAL_TO" to define equivalence restrictions.
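The size example above can be sketched as a complete query. The class name and service URL are hypothetical; the Attribute element carries the name, value, and predicate described in the text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DCQLQuery xmlns="http://caGrid.caBIG/1.0/gov.nih.nci.cagrid.dcql">
  <!-- Hypothetical target class -->
  <TargetObject name="org.example.Widget">
    <!-- Keep only instances whose "size" attribute is less than 5 -->
    <Attribute name="size" value="5" predicate="LESS_THAN"/>
  </TargetObject>
  <!-- Hypothetical data service URL -->
  <targetServiceURL>https://data.example.org/wsrf/services/cagrid/WidgetService</targetServiceURL>
</DCQLQuery>
```

Dropping the predicate attribute from this Attribute would turn the restriction into an equality test (size equal to 5), since "EQUAL_TO" is the default.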
Association
The Association type is used to describe a relationship, or association, between the containing object and the object identified by the Association type. The type is an extension of the Object type, in that it too describes an object (the associated object). The Association type is always used in the context of a containing Object type. The containing object is the source of the UML association, and the object described by the Association type is the target. Beyond everything it inherits from the Object type, the Association type introduces an additional attribute called roleName. The roleName attribute is optional, and can be used to name the role the target object plays in the UML association. The roleName can be omitted if the UML information model only describes a single association between the source and target Classes. If more than one association between the two classes is present in the model, then the roleName must be used to disambiguate the relationship. The query is considered invalid if the roleName is omitted and multiple associations between the two classes exist in the model.
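A sketch of the roleName case: suppose a hypothetical model in which Person has two associations to Address, one in the role homeAddress and one in the role workAddress. The roleName picks the intended one. All class, role, and attribute names and the service URL are assumptions for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DCQLQuery xmlns="http://caGrid.caBIG/1.0/gov.nih.nci.cagrid.dcql">
  <!-- Source of the UML association: hypothetical Person class -->
  <TargetObject name="org.example.Person">
    <!-- Target of the association; roleName disambiguates home vs. work address -->
    <Association name="org.example.Address" roleName="homeAddress">
      <!-- Association extends Object, so it can carry its own restriction -->
      <Attribute name="city" value="Columbus"/>
    </Association>
  </TargetObject>
  <!-- Hypothetical data service URL -->
  <targetServiceURL>https://data.example.org/wsrf/services/cagrid/PersonService</targetServiceURL>
</DCQLQuery>
```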
ForeignAssociation
The primary distinction between DCQL and CQL is the addition of the ability to identify data of interest through relationships with data on remote services. The ForeignAssociation type provides this means. Similar to the Association type, this type is only used within the context of a containing Object type, and contains a description of another Object type; in this case the ForeignObject. The type describes a relationship between the containing object and an object on another data service. The objects in the other data service are identified by both the targetServiceURL attribute of the ForeignAssociation, which identifies the remote data service, and the ForeignObject, which is just an Object type that defines the desired instances just as "local" queries do. Conceptually, the ForeignAssociation results in a new CQL query being sent to the data service running at targetServiceURL, with the Target of the query being the ForeignObject. This again shows how DCQL makes use of minor extensions to CQL to express the notions of aggregation and joins. In addition to describing a query to a remote service, the way in which the "results" should relate to the containing object must also be defined. For this purpose, DCQL uses a JoinCondition type, which is described below. In this way, the ForeignAssociation type describes a query to a remote data service, and defines that the containing objects which should be kept are those that meet the requirements of the JoinCondition when compared against the results of the remote query. In terms of database query languages, the ForeignAssociation type is roughly equivalent to an SQL subselect.
JoinCondition
Used only as a part of a ForeignAssociation, the JoinCondition type specifies the desired relationship between the ForeignObject and the containing Object type. It currently supports only single simple attribute comparisons. The type is similar to the Attribute type, in that a predicate is used to identify the relationship between the entities, but differs slightly in that rather than a value being specified as one entity, both entities are attribute names. That is, while an Attribute type compares an attribute of its containing Object to a specified value, the JoinCondition type compares an attribute of the containing Object to an attribute of another Object (that which is identified by the ForeignAssociation's ForeignObject). The syntax of the JoinCondition is such that the containing Object's attribute is named by the required localAttributeName attribute, the ForeignObject's attribute is named by the required foreignAttributeName attribute, and the predicate is named by the optional predicate attribute. Just like the Attribute type, the predicate is optional, and "EQUAL_TO" is the default value when omitted. The set of allowable predicate values is the same as for the Attribute type, except that "LIKE", "IS_NULL", and "IS_NOT_NULL" are not available; the null checks in particular are only applicable against a single attribute.
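A minimal sketch of a ForeignAssociation with its JoinCondition, using the default predicate. The class names, attribute names (donorId, id, consentStatus), and both service URLs are hypothetical. Per the schema at the end of this page, JoinCondition comes before ForeignObject.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DCQLQuery xmlns="http://caGrid.caBIG/1.0/gov.nih.nci.cagrid.dcql">
  <!-- Containing object: hypothetical Specimen class on the local target service -->
  <TargetObject name="org.example.catalog.Specimen">
    <!-- Remote side of the join lives on a second (hypothetical) service -->
    <ForeignAssociation targetServiceURL="https://clinic.example.org/wsrf/services/cagrid/DonorService">
      <!-- predicate omitted, so EQUAL_TO applies: Specimen.donorId equals Donor.id -->
      <JoinCondition localAttributeName="donorId" foreignAttributeName="id"/>
      <!-- Conceptually sent to the remote service as its own CQL query -->
      <ForeignObject name="org.example.clinic.Donor">
        <Attribute name="consentStatus" value="GRANTED"/>
      </ForeignObject>
    </ForeignAssociation>
  </TargetObject>
  <!-- Hypothetical service that holds the Specimen instances -->
  <targetServiceURL>https://catalog.example.org/wsrf/services/cagrid/SpecimenService</targetServiceURL>
</DCQLQuery>
```

Read as a subselect, this keeps the Specimens whose donorId matches the id of some Donor with consentStatus equal to GRANTED.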
Group
The Group type provides the capability to express two or more constraints. Whenever a single constraint (Object, Association, ForeignAssociation, or Attribute) needs to be combined with one or more additional constraints, a Group must be used to express their relationship with each other. This relationship is described by the logicRelation attribute of the Group (this is the spelling used in the schema below). The logicRelation attribute can assume the value of either "OR" or "AND". If the value is "AND", all of the contained constraints must be met for the Group constraint to be met. If the value is "OR", at least one of the contained constraints must be met for the Group constraint to be met. In addition to grouping other constraints, the Group type can also contain nested Groups. This simple construct allows for arbitrarily complex constraints to be modeled.
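The sketch below nests an OR Group inside an AND Group. Note that the schema listed on this page spells the attribute logicRelation, which is what the example uses. The class name, attribute names, and service URL are hypothetical. A value is supplied on the IS_NULL restriction in case the CQL Attribute type requires one; as described above, it is ignored for that predicate.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DCQLQuery xmlns="http://caGrid.caBIG/1.0/gov.nih.nci.cagrid.dcql">
  <!-- Hypothetical target class -->
  <TargetObject name="org.example.Widget">
    <!-- status = ACTIVE AND (size < 5 OR description is null) -->
    <Group logicRelation="AND">
      <Attribute name="status" value="ACTIVE"/>
      <Group logicRelation="OR">
        <Attribute name="size" value="5" predicate="LESS_THAN"/>
        <!-- value is ignored for IS_NULL; included only as a precaution -->
        <Attribute name="description" value="" predicate="IS_NULL"/>
      </Group>
    </Group>
  </TargetObject>
  <!-- Hypothetical data service URL -->
  <targetServiceURL>https://data.example.org/wsrf/services/cagrid/WidgetService</targetServiceURL>
</DCQLQuery>
```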
An Example Query
An example DCQL query, represented in XML, is shown below. In this fictitious example, a PersonRegistry Data Service is joined with a StudyRegistry Data Service. The query specifies that the Persons returned from the PersonRegistry are those whose "ssn" is equal to the "patientSSN" of a Participant, where that Participant has an "age" greater than 18. The specification of the target service can be seen on line 18 in the example (in this case only one service is targeted, though many could have been listed). Additionally, the "join" is specified starting on line 6, wherein the second target service is identified, and the join condition is defined. The join condition creates a link between the containing Object (in this case, Person), and an Object (in this case Participant, as defined on line 10) in the second target service. The condition specifies a predicate to be evaluated against an attribute in each of the two linked Objects (in this case Person.ssn and Participant.patientSSN). It is worth noting that as DCQL is a recursive language, the ForeignObject defined on line 10 could have also specified a join to a third Data Service, or other more complex criteria.
[Figure: DCQL_Example1.png, the example DCQL query (image not available)]
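Since the original figure is not available, the following is a reconstruction of the described query from the prose and the schema, not a copy of the figure. The line numbers cited in the text refer to the original image and will not match this listing. The package names and both service URLs are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DCQLQuery xmlns="http://caGrid.caBIG/1.0/gov.nih.nci.cagrid.dcql">
  <!-- Return type of the query: Person instances from the PersonRegistry -->
  <TargetObject name="org.example.registry.Person">
    <!-- The join: Participant objects live on the StudyRegistry service -->
    <ForeignAssociation targetServiceURL="https://studies.example.org/wsrf/services/cagrid/StudyRegistry">
      <!-- Person.ssn must equal Participant.patientSSN -->
      <JoinCondition localAttributeName="ssn" foreignAttributeName="patientSSN" predicate="EQUAL_TO"/>
      <!-- Only Participants older than 18 take part in the join -->
      <ForeignObject name="org.example.study.Participant">
        <Attribute name="age" value="18" predicate="GREATER_THAN"/>
      </ForeignObject>
    </ForeignAssociation>
  </TargetObject>
  <!-- The single target service; more targetServiceURL elements could be listed -->
  <targetServiceURL>https://people.example.org/wsrf/services/cagrid/PersonRegistry</targetServiceURL>
</DCQLQuery>
```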
Schema
*http://caGrid.caBIG/1.0/gov.nih.nci.cagrid.dcql*
NOTE: The schema for CQL (http://CQL.caBIG/1/gov.nih.nci.cagrid.CQLQuery) is imported by the DCQL schema below.
<xsd:schema targetNamespace="http://caGrid.caBIG/1.0/gov.nih.nci.cagrid.dcql" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:dcql="http://caGrid.caBIG/1.0/gov.nih.nci.cagrid.dcql" xmlns:cql="http://CQL.caBIG/1/gov.nih.nci.cagrid.CQLQuery" elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xsd:import namespace="http://CQL.caBIG/1/gov.nih.nci.cagrid.CQLQuery" schemaLocation="./xsd/Data/1_gov.nih.nci.cagrid.CQLQuery.xsd"/>
  <xsd:element name="DCQLQuery">
    <xsd:annotation>
      <xsd:documentation>caGrid Distributed CQL Query, the desired result objects are described by the TargetObject, and the CQL query resulting from the processing is sent to each data service identified by the targetServiceURLs</xsd:documentation>
    </xsd:annotation>
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="TargetObject" type="dcql:Object"/>
        <xsd:element name="targetServiceURL" type="xsd:string" maxOccurs="unbounded">
          <xsd:annotation>
            <xsd:documentation>The URL of a data service which should be sent the resulting CQL query.</xsd:documentation>
          </xsd:annotation>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:complexType name="Object">
    <xsd:annotation>
      <xsd:documentation>Description of an Object instance</xsd:documentation>
    </xsd:annotation>
    <xsd:choice>
      <xsd:element name="Attribute" type="cql:Attribute" minOccurs="0">
        <xsd:annotation>
          <xsd:documentation>The description of the object being targeted by the query; the return type.</xsd:documentation>
        </xsd:annotation>
      </xsd:element>
      <xsd:element name="Association" type="dcql:Association" minOccurs="0"/>
      <xsd:element name="ForeignAssociation" type="dcql:ForeignAssociation" minOccurs="0"/>
      <xsd:element name="Group" type="dcql:Group" minOccurs="0"/>
    </xsd:choice>
    <xsd:attribute name="name" type="xsd:string" use="required"/>
  </xsd:complexType>
  <xsd:complexType name="Association">
    <xsd:annotation>
      <xsd:documentation>Describes an optionally (role-)named relationship from this Object to another.</xsd:documentation>
    </xsd:annotation>
    <xsd:complexContent>
      <xsd:extension base="dcql:Object">
        <xsd:attribute name="roleName" type="xsd:string" use="optional"/>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
  <xsd:complexType name="Group">
    <xsd:annotation>
      <xsd:documentation>A collection of two or more sub-constraints, grouped together by the logicalRelation.</xsd:documentation>
    </xsd:annotation>
    <xsd:choice minOccurs="2" maxOccurs="unbounded">
      <xsd:element name="Association" type="dcql:Association" maxOccurs="unbounded"/>
      <xsd:element name="Attribute" type="cql:Attribute" maxOccurs="unbounded"/>
      <xsd:element name="ForeignAssociation" type="dcql:ForeignAssociation" maxOccurs="unbounded"/>
      <xsd:element name="Group" type="dcql:Group" maxOccurs="unbounded"/>
    </xsd:choice>
    <xsd:attribute name="logicRelation" type="cql:LogicalOperator" use="required"/>
  </xsd:complexType>
  <xsd:complexType name="ForeignAssociation">
    <xsd:annotation>
      <xsd:documentation>An association or relationship defined between this Object and an Object defined by the ForeignObject, located on the data service located at the targetServiceURL.</xsd:documentation>
    </xsd:annotation>
    <xsd:sequence>
      <xsd:element name="JoinCondition" type="dcql:JoinCondition"/>
      <xsd:element name="ForeignObject" type="dcql:Object"/>
    </xsd:sequence>
    <xsd:attribute name="targetServiceURL" type="xsd:string" use="required"/>
  </xsd:complexType>
  <xsd:complexType name="JoinCondition">
    <xsd:annotation>
      <xsd:documentation>Specifies a relationship, defined by the predicate, between a local attribute and a remote attribute.</xsd:documentation>
    </xsd:annotation>
    <xsd:attribute name="predicate" type="dcql:ForeignPredicate" use="optional" default="EQUAL_TO"/>
    <xsd:attribute name="localAttributeName" type="xsd:string" use="required"/>
    <xsd:attribute name="foreignAttributeName" type="xsd:string" use="required"/>
  </xsd:complexType>
  <xsd:simpleType name="ForeignPredicate">
    <xsd:annotation>
      <xsd:documentation>Predicate types used for attribute comparisons</xsd:documentation>
    </xsd:annotation>
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="EQUAL_TO" id="equal_to">
        <xsd:annotation>
          <xsd:documentation>Two values are equivalent.</xsd:documentation>
        </xsd:annotation>
      </xsd:enumeration>
      <xsd:enumeration value="NOT_EQUAL_TO" id="not_equal_to">
        <xsd:annotation>
          <xsd:documentation>Two values are not equivalent.</xsd:documentation>
        </xsd:annotation>
      </xsd:enumeration>
      <xsd:enumeration value="LESS_THAN" id="less_than">
        <xsd:annotation>
          <xsd:documentation>The first value is less than the second.</xsd:documentation>
        </xsd:annotation>
      </xsd:enumeration>
      <xsd:enumeration value="LESS_THAN_EQUAL_TO" id="less_than_equal_to">
        <xsd:annotation>
          <xsd:documentation>The first value is less than, or equivalent to, the second.</xsd:documentation>
        </xsd:annotation>
      </xsd:enumeration>
      <xsd:enumeration value="GREATER_THAN" id="greater_than">
        <xsd:annotation>
          <xsd:documentation>The first value is greater than the second.</xsd:documentation>
        </xsd:annotation>
      </xsd:enumeration>
      <xsd:enumeration value="GREATER_THAN_EQUAL_TO" id="greater_than_equal_to">
        <xsd:annotation>
          <xsd:documentation>The first value is greater than, or equivalent to, the second.</xsd:documentation>
        </xsd:annotation>
      </xsd:enumeration>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>





