XML Data Services Overview
The caGrid XML Data Service Framework (xService) is an extension of caGrid for querying and retrieving XML documents managed in XML databases. xService provides an extension to Introduce for flexible and rapid creation of caGrid Data Services from existing XML schemas (XSD) or UML models (XMI).
Background
To support an XML-based data model, query language, and storage, we developed the caGrid XML Data Service Framework (xService), a generalized framework for extending caGrid with XML support. xService provides automated data model mapping from XML Schema to the caGrid Domain Model, query mapping from CQL to an XML query language (XPath), and XML database interfaces for uploading, updating, querying, and retrieving XML data. An extension to the Introduce Toolkit also lets users flexibly and rapidly create their own caGrid data services based on a predefined XML Schema.
- XPath is a language for addressing parts of an XML document, defined as a standard by the W3C. See http://www.w3.org/TR/xpath

- An XML database is a data persistence software system that allows data to be stored and queried in XML form. There are two major classes of XML database: XML-enabled databases, which extend traditional database systems, and native XML databases, which are built on an XML-based storage model.
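
To make the idea of mapping a query onto XPath more concrete, the following is a minimal sketch, not xService's actual mapping. It assumes a heavily simplified, CQL-like query represented as a Python dictionary (a target element plus an optional equality predicate on a child element), and it evaluates the resulting XPath with Python's standard xml.etree.ElementTree module, which supports only a subset of XPath. The sample document, the element names, and the helper functions to_xpath and run_query are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are invented
# for illustration and do not come from any real caGrid schema.
DOC = """
<annotations>
  <annotation id="a1">
    <name>Lesion</name>
    <modality>CT</modality>
  </annotation>
  <annotation id="a2">
    <name>Nodule</name>
    <modality>MR</modality>
  </annotation>
</annotations>
"""


def to_xpath(query):
    """Translate a simplified, CQL-like query (a dict) into an XPath string.

    Assumption: the query names a target element and, optionally, one
    equality predicate on a child element. Real CQL is richer than this.
    """
    path = ".//" + query["target"]
    attr = query.get("attribute")
    if attr is not None:
        # Values containing a single quote are not escaped in this sketch.
        path += "[{}='{}']".format(attr["name"], attr["value"])
    return path


def run_query(xml_text, query):
    """Parse the XML text and return the elements matching the query."""
    root = ET.fromstring(xml_text)
    return root.findall(to_xpath(query))


query = {
    "target": "annotation",
    "attribute": {"name": "modality", "value": "CT"},
}
matches = run_query(DOC, query)
ids = [m.get("id") for m in matches]
print(to_xpath(query))
print(ids)
```

In an XML database, the generated XPath expression would be sent to the database's own query interface rather than evaluated in memory; the in-memory evaluation here only keeps the sketch self-contained.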
Rationale
Because of its simple syntax and self-describing semantic structure, XML is fast becoming the standard information exchange language for web-based applications, and it is widely used for data sharing and semantic interoperability in healthcare, the life sciences, and many other domains. As a result, commercial database vendors and research institutions are developing persistence for XML documents by building XML databases, either by extending traditional databases or by developing new native ones. XML databases offer significant advantages because they support standard data definition languages based on XML standards, such as XML Schema, and standard XML query languages, such as XPath and XQuery. Furthermore, the XML-in, XML-out approach greatly simplifies the translation of data models and query languages. XML database technology is maturing, and XML database products are proliferating; examples include Oracle Berkeley DB XML, Oracle XMLDB, IBM DB2 pureXML, eXist, and Tamino.
In the traditional caGrid Data Service implementation, the data model is mapped from the UML-based caGrid Domain Model to a relational data model, and the query language is mapped from CQL to SQL, both through automated Hibernate-based mapping. New XML-based data standards, such as Annotation and Image Markup Language and HL7, demand an XML-based approach to caGrid data sharing.






