This guide will walk you through the steps to create an information model using model creation and semantic annotation tools. The development of a caGrid data service that enables access to a database managed by a relational database system can be accomplished in four main steps:
- Creation of an information model. The information model consists of two components, an object model and a data model. The object model presents an object-oriented view of the backend database and is the Grid-level representation for querying and retrieving data through the data service. The data model represents the relational schema of the database. The object model is mapped to the data model so that object-oriented queries can be translated into relational database queries.
- Semantic annotation of the object model. This step is needed to facilitate semantic interoperability. The object model allows for syntactic interoperability and programmatic access to the data resource (the database) via a common, well-defined, published representation layer. The semantic annotation of the object model enables a third-party consumer of the data source to understand the meanings of the data elements in the model and correctly consume them. For interoperability, re-use data elements that are already registered in NCI's caDSR and in the openMDR metadata registry. The information model is annotated with the appropriate Common Data Elements (CDE's) from NCI's caDSR and/or a local openMDR instance.
- Creation of a domain model file using the Domain Model Generator. This step resolves the Common Data Elements (CDE's) from the various sources (caDSR or openMDR) and creates a semantically annotated domain model file represented with standard concepts.
- Creation of a caGrid data service using the Introduce toolkit. This step creates a data service interface, with its underlying runtime environment and client APIs, using the data-oriented system generated by the caCORE SDK.
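To make the object-to-relational mapping from the first step concrete, here is a minimal Python sketch of how an object-level query might be translated into a relational query through a mapping table. The class name `Participant`, the table `PARTICIPANT`, and the column names are hypothetical; in practice the system generated by the caCORE SDK performs this translation, so the sketch only illustrates the idea.

```python
# Hypothetical object-relational mapping: object model names -> data model names.
MAPPING = {
    "Participant": {
        "table": "PARTICIPANT",
        "attributes": {"id": "PARTICIPANT_ID", "birthDate": "BIRTH_DATE"},
    }
}


def to_sql(class_name, attribute, value):
    """Translate an equality query on an object attribute into parameterized SQL."""
    entry = MAPPING[class_name]
    column = entry["attributes"][attribute]
    columns = ", ".join(entry["attributes"].values())
    sql = f"SELECT {columns} FROM {entry['table']} WHERE {column} = ?"
    return sql, (value,)


sql, params = to_sql("Participant", "birthDate", "1970-01-01")
print(sql)
print(params)
```

The caller only ever names classes and attributes from the object model; table and column names stay hidden behind the mapping.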
The following tutorial is designed to lead the user through the steps involved in creating an annotated information model that can be ingested by the Domain Model Generator and Introduce to create semantically annotated grid services.
A sample file is included with the openMDR 1.1 distribution at $OPENMDR_HOME/projects/mdrDomainModelGenerator/Test.EAP. This .EAP file can be opened in the Enterprise Architect application. It was created by following the steps described below: Creating an Information Model (UML project, data model, object model, object-relational mapping) and Semantically Annotate Model (semantic annotation of class attributes in the object model with Common Data Elements (CDE's) from caDSR).
This guide is designed for users who would like to understand the use of the various tools employed for developing annotated information models that can be consumed by the Domain Model Generator and Introduce to create semantically annotated grid services. The guide assumes that you are familiar with the caBIG caCORE modeling process. More information about this process can be found in the following documents: caCORE Overview, caBIG Model Creation Guide, and caCORE SDK Programmer's Guide.
Instructions are available in the caBIG Model Creation Guide for the following steps:
There is no need to follow the instructions in "4.5 Semantic Annotation of Model" from the caBIG Model Creation Guide. Instead, follow the openMDR workflow as explained below.
The Semantic Annotation of Model step is described in this guide in order to create grid services using openMDR.
Several tools are used to carry out the steps described in this section. These tools are:
- Enterprise Architect version 7.5
- Enterprise Architect is a UML modeling software system available from Sparx Systems.
Enterprise Architect is the recommended tool.
- openMDR EA Plugins
- openMDR EA Plugins is a tool that runs within EA and facilitates semantic annotation of your logical model.
- mdrQuery Service
- mdrQuery Service provides an API and Grid Service for querying across many disparate semantic metadata repositories and using the results for data model annotation.
- 1. View the Logical Model: Double-click on "Logical Model" within diagrams to view the graphical diagram.
- 2. View the Data Model: Double-click on "Data Model" within diagrams to view the graphical diagram.
- 3. Click on a class attribute (birth date in the example below) and right click to find "Add-In" for MDR Query Service Panel. Click to begin annotating!
- 4. MDR Query Service Panel shows various options:
- Service URL at the bottom of the panel - enter the mdrQuery Service URL and click connect. A history will be maintained for future use.
- Resource selection: Once the plugin has connected to the service URL, it is going to display a list of resources in the drop-down at the top of the panel. caDSR and openMDR are the default resources available to query.
- Search by - term, exact term or id from the drop down and enter the search value in the text box.
- Context - This is optional. If you wish to query within a context, you can get a list for the respective resource selected above. This will run the query only within the contents of a specific context. If you wish to run a broad query across context, please do not select any specific value in the drop-down.
- Results panel - The results panel displays a list of all the CDE's that match the query term/id. A user can click on different CDE's and view a more detailed description of each CDE in the bottom tabs - Definition, Props/Values, Object Class/Props/Other.
- Annotate with CDE - Once an appropriate CDE is selected, click on the "Annotate with CDE" button to annotate the selected attribute with the CDE reference and preferred name. The CERef and preferred name tags are added automatically. Repeat this process for all attributes.
- 5. Clicking on the "Annotate with CDE" button brings up a success pop-up message.
- 6. Change the resource to now query the local content (CDE's) created in your local openMDR metadata registry. Please follow the same steps as mentioned above for caDSR. See some screenshots below.
- 7. Export .xmi file - After the entire information model is annotated with CDE's from caDSR and openMDR, the next step is to take an export of the .xmi file, so that it can be consumed by i) Domain Model Generator to create the domain model file, and ii) Introduce to generate the schema file and grid service.
The Domain Model Generator application cannot consume models in the .EAP file format that Enterprise Architect uses, so the models must be exported in .xmi format. Right-click on the "Logical View" entry in the Project Browser pane of the EA window, and select "Export Model to xmi".
Note: It is important to export from the "Logical View" node rather than from the top-level package; otherwise, incompatibilities will cause errors when creating the grid service.
EA will pop up a dialog box to configure the export. The options can be left as the defaults.
Simply choose the location of the exported xmi file. See the following screen shots.
- Click on Logical View
- Right Click to export to .xmi file
- Provide a directory and file name where you wish to store the .xmi file and Click on "Export".
- Export complete - displayed in the picture below.
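Before handing the exported file to the Domain Model Generator, it can help to check that no attribute was left without a CDE reference. The following Python sketch illustrates the idea on a simplified, hypothetical XML snippet. A real Enterprise Architect export is namespaced and structured differently, so the element names (`Class`, `Attribute`, `TaggedValue`) and the sample values are assumptions that would need adapting; only the tag names CERef and preferred name come from the annotation step above.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for an exported model. A real EA export
# is namespaced and far larger, so the element names here are assumptions.
SAMPLE_XMI = """
<Model>
  <Class name="Participant">
    <Attribute name="birthDate">
      <TaggedValue tag="CERef" value="example-cde-id"/>
      <TaggedValue tag="preferred name" value="Example Birth Date"/>
    </Attribute>
    <Attribute name="nickname"/>
  </Class>
</Model>
"""


def unannotated_attributes(xmi_text, ref_tag="CERef"):
    """Return 'Class.attribute' names that carry no CDE reference tag."""
    root = ET.fromstring(xmi_text)
    missing = []
    for cls in root.iter("Class"):
        for attr in cls.iter("Attribute"):
            tags = {tv.get("tag") for tv in attr.iter("TaggedValue")}
            if ref_tag not in tags:
                missing.append(f"{cls.get('name')}.{attr.get('name')}")
    return missing


print(unannotated_attributes(SAMPLE_XMI))
```

In this sample only `Participant.nickname` lacks a CERef tag, so it is the one attribute reported.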
You can use the model file in .xmi format, exported as described above, to generate the domain model file (XML format). Using a command line interface, go to the mdrDomainModelGenerator project in the root directory of your openMDR checkout (referred to from now on as OPENMDR_HOME).
If you wish to configure mdrDomainModelGenerator to run against your local mdrQueryService installation, run the following task. You will need to provide the query service URL and port as inputs.
To generate the domain model file (xml format), you need to run the default task "ant run". The system will prompt you for the following:
1. Information Model (.xmi) file to be Parsed (File Name with location)
2. Domain model File Name to be generated (File Name(.XML) with location)
3. Project Name
4. Project Version
The system will then resolve the CDE's using mdrQuery Service and generate the domain model file.
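The four prompts above are easy to get wrong when typed by hand, so it may be convenient to validate the answers first. The Python sketch below is not part of the Domain Model Generator; it checks the file extensions named in the prompts and returns the answers in the order they are requested. The paths, project name, and version shown are hypothetical.

```python
from pathlib import Path


def generator_answers(xmi_file, domain_model_file, project_name, project_version):
    """Validate the four inputs and return them in the order they are prompted for."""
    if Path(xmi_file).suffix.lower() != ".xmi":
        raise ValueError("information model must be an .xmi file")
    if Path(domain_model_file).suffix.lower() != ".xml":
        raise ValueError("domain model file must be an .xml file")
    if not project_name or not project_version:
        raise ValueError("project name and version are required")
    return [str(xmi_file), str(domain_model_file), project_name, project_version]


# Hypothetical values for illustration only.
answers = generator_answers("models/Test.xmi", "out/TestDomainModel.xml", "Test", "1.0")
print("\n".join(answers))
```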
See example below:
Please follow the caCORE SDK 4.1.1 Developer's Guide to create a system using the SDK, supplying your UML model as input.
- You will need to put your UML Model xmi file in the models directory within caCORE SDK root directory.
- You will need to edit the deploy.properties in the "conf" directory as per your environment. You need to change the values for PROJECT_NAME, MODEL_FILE, INCLUDE_PACKAGE, SERVER_URL.
- You will have to run the task "ant build-system". This will create all the artifacts in the project name directory within the "output" directory.
The caCORE SDK directory path above will be required by Introduce, as specified below in "Step 5 Selecting the caCORE SDK Directory".
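The property edits listed above can also be scripted. The following is a minimal Python sketch, not part of the caCORE SDK: it rewrites the four keys in a properties file and leaves every other line untouched. The values are hypothetical placeholders, and the sketch operates on a temporary sample file rather than a real conf/deploy.properties.

```python
import tempfile
from pathlib import Path


def update_properties(path, updates):
    """Rewrite key=value lines for the given keys, leaving other lines untouched."""
    remaining = dict(updates)
    lines = []
    for line in Path(path).read_text().splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and not line.lstrip().startswith("#") and key in remaining:
            lines.append(f"{key}={remaining.pop(key)}")
        else:
            lines.append(line)
    # Append any keys that were not already present in the file.
    lines.extend(f"{key}={value}" for key, value in remaining.items())
    Path(path).write_text("\n".join(lines) + "\n")


# Hypothetical values; substitute the ones for your own environment.
settings = {
    "PROJECT_NAME": "mdrtest",
    "MODEL_FILE": "Test.xmi",
    "INCLUDE_PACKAGE": "org.cagrid.openmdr",
    "SERVER_URL": "http://localhost:8080/mdrtest",
}

with tempfile.TemporaryDirectory() as tmp:
    props = Path(tmp) / "deploy.properties"
    props.write_text("# sample file\nPROJECT_NAME=example\nMODEL_FILE=sdk.xmi\n")
    update_properties(props, settings)
    print(props.read_text())
```

Keeping comments and unrelated keys intact matters here, because the rest of the file carries settings this tutorial does not change.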
- Select Create caGrid Service Skeleton from the toolbar at the top of the Introduce portal, or select Tools -> Create caBIG Service from the menu. The Create caBIG Grid Service screen will appear.
- Choose a directory in which to place the generated service. This tutorial uses "/Users/dhav01/work/data/services/MDRDataSvc".
- Type in the Service Name. This tutorial will use "MDRTestDataSvc".
- Choose a Package Name. This tutorial will use "org.cagrid.openmdr".
- Choose a Namespace. This tutorial will use "http://openmdr.cagrid.org/MDRTest".
NOTE: The service creation dialog will attempt to guess this for you from your package and service names.
- Select the Data Service radio button from the Customize Service section.
- Click the Create button located at the bottom of the creation screen.
The caGrid data services extension provides a pluggable framework, known as data service styles, for creating highly customized data services. Styles may be provided by a third party or installed with caGrid. Styles are provided to create data services backed by various versions of the caCORE SDK.
The caCORE SDK 4.1 style provides a wizard-like interface with multiple steps which prompts the service developer for information required to create and configure the data service to use a caCORE SDK system.
- When the Data Service Configuration window appears, select "caCORE SDK v 4.1" from the drop down menu.
- Leave the check boxes for WS-Enumeration and Bulk Data Transport unchecked.
These options enable specialized results retrieval interfaces for the data service.
- Click the OK button to continue with the selected service style.
- A progress bar will be displayed while the service skeleton is being created.
The first screen in the wizard will appear after clicking "OK" on the style selection dialog. This screen simply informs the service developer of what they will be doing with the wizard and what sort of information will be needed.
The next step prompts the service developer to select the caCORE SDK directory which contains the pre-built application which they would like to expose to the grid.
Simply use the "Browse" button to select the base directory of your local caCORE SDK installation, and the wizard will locate the build artifacts it requires. The wizard loads the conf/deploy.properties file into the table on this page, and values found in it are used to locate the rest of the artifacts on which this style depends.
Click Next: API Type to continue.
This step provides a choice of using the local or remote API to communicate with the caCORE SDK application service.
Use of the local API might improve the performance of queries which retrieve large amounts of data, with the caveat that the grid data service must be deployed to the same machine as the caCORE application. In the case of the remote API, the developer must specify the hostname of the machine on which the caCORE application is running, and the port on which to contact it. For example, if the service has been installed on the machine with the hostname "localhost" and listens for connections on port 9090, the developer would specify "localhost" as the host name and "9090" as the port.
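Before choosing the remote API, it may be worth confirming that the caCORE application is reachable at the host and port you intend to enter. A minimal Python sketch, using only the standard library and the example values from the paragraph above:

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Host and port from the example above; adjust for your deployment.
print(is_reachable("localhost", 9090))
```

This only confirms that something is listening on the port; it does not verify that the listener is the caCORE application service.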
Click Next: Login to move forward.
This step allows the service developer to supply log in information to the caCORE SDK Application Service in the form of a username and password combination. If these values are supplied, the caCORE SDK Application Service API will be initialized with them when the grid data service starts up.
Leave the Use Login check box unchecked and all other fields on this page blank for this tutorial.
Click Next: Domain Model to advance.
This step of the wizard requires the service developer to supply the domain model which the new data service will expose to the grid. Domain models define the classes, their attributes, and their relationships, so that data services may be discovered by the types they expose, and CQL queries can be formulated without a priori knowledge of an arbitrary data service's contents.
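As an illustration of how a client could formulate a CQL query from the classes and attributes a domain model exposes, the Python sketch below builds a simple attribute query as XML. The class name `org.cagrid.openmdr.Participant` and the attribute `birthDate` are hypothetical, and the namespace and element layout assume the caGrid CQL 1 schema; check them against the schema shipped with your caGrid version.

```python
import xml.etree.ElementTree as ET

# Assumed CQL 1 namespace; verify against your caGrid installation.
CQL_NS = "http://CQL.caBIG/1/gov.nih.nci.cagrid.CQLQuery"


def build_cql(target_class, attribute, value, predicate="EQUAL_TO"):
    """Build a CQL query that selects target_class objects by one attribute."""
    ET.register_namespace("", CQL_NS)
    query = ET.Element(f"{{{CQL_NS}}}CQLQuery")
    target = ET.SubElement(query, f"{{{CQL_NS}}}Target", {"name": target_class})
    ET.SubElement(
        target,
        f"{{{CQL_NS}}}Attribute",
        {"name": attribute, "value": value, "predicate": predicate},
    )
    return ET.tostring(query, encoding="unicode")


# Hypothetical class and attribute taken from an imagined domain model.
print(build_cql("org.cagrid.openmdr.Participant", "birthDate", "1970-01-01"))
```

The query names only a class and an attribute from the domain model, which is why a client needs no knowledge of the backend database.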
The domain model panel contains a drop-down menu at the top which specifies the Domain Model Source. This wizard provides three potential sources for the domain model:
This setting directs the wizard to use, as the domain model, the .xmi model that it detects in the caCORE SDK system specified in step 2. The .xmi will be converted to a domain model XML document.
This option allows the service developer to specify a pre-generated domain model XML document from the local file system. This might prove useful for developers who have previously generated a data service and wish to re-use the model, or have created a custom model.
This option lets the service developer select a project and packages from the cancer Data Standards Repository (caDSR) and generate a domain model extract from it for use in their data service.
For this tutorial, select Default .xmi as the Domain Model Source.
The wizard will attempt to populate as many of the fields on this page as possible, and the ones it cannot will be highlighted in red. Typically, this will only be the Project Version field. Only the Project Name and Project Version fields are editable on this page.
For this tutorial, enter 1.0 in the Project Version field. Leave the other fields as they are.
Click Next: Schemas to convert the .xmi file to a domain model and continue to the last step of the wizard.
Every class exposed by the domain model must be supplied with a corresponding XML schema representation so it may be utilized in the grid. The mapping panel of this wizard streamlines this process by simultaneously generating the mapping from model to schema and configuring serialization of the XML data types to correspond to the domain model's Java beans.
The panel lists each package from the domain model, the current mapping status of the package, and a button to manually resolve the mapping for that package. The Automatically Map From SDK Generated Schemas button makes use of the XML schemas provided by the caCORE SDK's output to map to each package of the domain model.
1. Click the Automatically Map From SDK Generated Schemas button to perform the mapping from domain model packages to XML schema.
2. At this point, the data service wizard has gathered all the information required to create a fully functional caGrid data service backed by the caCORE SDK system selected in step 2, and the Next: button will change to say Done.
3. Click Done to complete the wizard. You will see a progress bar indicating the post-creation processes and the loading of the service description.
4. The default window displayed would be as below...
Click on "Data Service" tab.
1. The default path for annotated domain model file is displayed. Browse to select the Domain Model File generated from Domain Model Generator.
3. Click on "Done" to save. This will prompt you to save. Click on "Save". This is going to synchronize and rebuild the service skeleton.
4. Once the above operation ends, you will see populated values in UML Class Selection
- Click on "Services" tab to see a listing of your newly created Data Service. In the panel, click on Service Metadata under Resource Properties. Click on Edit Resource Property.
- Fill in the details on "Resource Property Editor" under "Hosting Center" - Points of Contact and Address information, and click "Done"
- Fill in the details on "Resource Property Editor" under Service Information - Points of Contact, and click "Done"
- Click on "Save" on the Services tab
- Go to the root of the service directory where you created the data service and deploy it to a Tomcat container. You may create a new Tomcat container, just like the one created for the mdrQuery Service, in order to deploy your newly created data service. Please note that you should follow the instructions on the caGrid wiki to configure a secure or unsecure container, according to your requirements. The environment variable $CATALINA_HOME is read to determine where the service is deployed.
- Restart your tomcat server on which you deployed the newly created data service as above and point your browser to the following URL: http://<serverHost>:<serverPort>/wsrf/services/cagrid/MDRTest
- You can also view the list of services deployed on your tomcat at the following URL: http://<serverHost>:<serverPort>/wsrf/servlet/AxisServlet
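The two URL patterns above can be assembled and probed from a script. The Python sketch below builds both URLs and includes a helper that returns the HTTP status code, or None if the server cannot be reached. The host `localhost` and port `8080` are hypothetical; the paths are the ones given in the list above.

```python
import urllib.error
import urllib.request


def deployment_urls(host, port, service_path="cagrid/MDRTest"):
    """Build the service URL and the service listing URL for a container."""
    base = f"http://{host}:{port}/wsrf"
    return {
        "service": f"{base}/services/{service_path}",
        "service_list": f"{base}/servlet/AxisServlet",
    }


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except OSError:
        return None


# Hypothetical host and port; replace with your own server's values.
for name, url in deployment_urls("localhost", 8080).items():
    print(f"{name}: {url}")
```

Calling `http_status` on each URL after restarting the server gives a quick indication of whether the container is answering requests.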