i2b2 Integration Design Guide
Project Overview
The i2b2 Integration project is designed in a modular way to allow for thorough testing and ease of integration. It uses a simple off-line build that leverages existing tools and available caGrid artifacts wherever possible.
Project Layout
The i2b2 Integration project contains several important directories, which are outlined here:
- ext: Generated by Ivy during the build process. Contains external artifacts (jars, schemas, resources) on which the i2b2 Integration project depends.
- lib: Libraries the project depends on that are not provided by the Ivy artifact resolution process.
- resources: Assorted resources which are useful for testing the project and getting started using the i2b2 / ontomapper style.
  - jdbc: Contains JDBC drivers for use with the i2b2 query processor.
    - sybase: Contains the Sybase/JTDS driver jar file and a text document with connection information used to access the Sybase i2b2 installation at UCSF.
  - models: Contains XMI and caGrid Domain Models for the preliminary i2b2 testing projects (miniOCRe).
  - queries: Known-working CQL test queries against the models in the resources directory.
- schema: Schemas which define data types used by this project and / or i2b2 go in here.
  - sql: SQL schemas are placed in this directory.
- src: Contains the source code of the project.
  - java: Contains all Java source files.
    - processor: Contains the implementation of the CQL Query Processor and related functionality.
    - style: Contains the data service style implementation, including the wizard and code generation helper tooling.
    - utils: Contains utilities and tools used by the style and / or processor.
  - style: Contains the data service style definition document (style.xml).
- test: Contains testing for the project, including the Ant test build script (test.xml).
  - src: Contains source code used for testing the project.
    - java: Contains the Java source code used for testing the project, typically JUnit tests and supporting tooling.
  - resources: Contains any resources used for testing. Examples include "gold" XML documents for comparison to test results, packaged services for deployment, etc.
Additionally, the root directory of the project contains the primary Ant build script (build.xml), the Ivy configuration files, build configuration files (*.properties), and the Eclipse .project and .classpath documents. The .classpath document uses library variables to point to external dependencies (Globus) rather than a system-specific path.
Project build process
The project can be built using Apache Ant from the command line:

%> ant all
The first time the project is built, Ivy will retrieve the required build artifacts from the remote caGrid 1.3 repository. Subsequent builds will only retrieve artifacts if they are new, or have been removed from the local ext directory.
The build process creates the build directory, which contains the resulting Java class files and Jar libraries.
Test process
The project can be tested locally using Apache Ant from the command line. This command assumes the project has already been built, and will fail otherwise:

%> ant test
The JUnit tests are executed and the results are both printed to the console and written to documents in the test/logs/junit directory. Execution and reporting of this command could be automated with a continuous integration system like Hudson or CruiseControl.
Implementation Details
Concept Code Mapper
The Concept Code Mapper is an interface which provides a mechanism for retrieving the concept codes associated with a given class or attribute of a class.
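To make the contract concrete, here is a minimal sketch of what such an interface could look like, together with an in-memory implementation for illustration. The interface name, method names, and signatures are assumptions, not the project's actual API, and the class is a hypothetical stand-in rather than the Domain Model implementation described next.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical shape of the mapper interface; names and signatures are assumptions.
interface ConceptCodeMapper {
    // Concept codes associated with a class, or an empty list if none are known.
    List<String> getConceptCodesForClass(String className);

    // Concept codes associated with one attribute of a class, or an empty list.
    List<String> getConceptCodesForAttribute(String className, String attributeName);
}

// Illustrative in-memory implementation backed by a map (not part of the project).
class InMemoryConceptCodeMapper implements ConceptCodeMapper {
    private final Map<String, List<String>> codes = new HashMap<String, List<String>>();

    void addClassCode(String className, String code) {
        add(className, code);
    }

    void addAttributeCode(String className, String attributeName, String code) {
        add(className + "#" + attributeName, code);
    }

    private void add(String key, String code) {
        List<String> list = codes.get(key);
        if (list == null) {
            list = new ArrayList<String>();
            codes.put(key, list);
        }
        list.add(code);
    }

    public List<String> getConceptCodesForClass(String className) {
        return lookup(className);
    }

    public List<String> getConceptCodesForAttribute(String className, String attributeName) {
        return lookup(className + "#" + attributeName);
    }

    // Returns a read-only view so callers cannot alter the stored mapping.
    private List<String> lookup(String key) {
        List<String> list = codes.get(key);
        return list == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(list);
    }
}
```

Returning an empty list rather than null for unmapped classes or attributes is a design choice of this sketch; it spares callers a null check when building queries.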
Domain Model Concept Code Mapper
The Domain Model Concept Code Mapper is a concrete implementation of the Concept Code Mapper interface which derives the information it provides from a caGrid Domain Model. This reference implementation is validated by the project's testing.
I2B2 Query Factory
The i2b2 Query Factory class generates SQL statements for the various queries needed to execute CQL against an i2b2 database. It optionally takes a table name prefix which is applied to each query. This allows multiple i2b2 installations to share the same database without conflicting with each other's tables.
Note: This is apparently how some people (DBAs who don't know better?) like to install multiple instances, rather than creating a new DB schema with its own users. It seems likely to lead to problems with password sharing, and it opens the potential for rogue DROP statements to cause damage.
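The table name prefix described above can be sketched as follows. The class name, method names, and the single example query are hypothetical and are not taken from the project's source. The table and column names (observation_fact, concept_cd) are used only for illustration.

```java
// Hypothetical sketch of a query factory that applies an optional table prefix.
class PrefixedQueryFactory {
    private final String tablePrefix;

    // A null prefix is treated as "no prefix".
    PrefixedQueryFactory(String tablePrefix) {
        String prefix = tablePrefix == null ? "" : tablePrefix;
        // The prefix is concatenated into SQL text, so restrict it to identifier characters.
        if (!prefix.matches("[A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Invalid table prefix: " + prefix);
        }
        this.tablePrefix = prefix;
    }

    // Resolves a base table name to the prefixed name used by this installation.
    String table(String baseName) {
        return tablePrefix + baseName;
    }

    // Example query: the concept code value is left as a JDBC parameter placeholder.
    String selectObservationsByConcept() {
        return "SELECT * FROM " + table("observation_fact") + " WHERE concept_cd = ?";
    }
}
```

With a prefix of "site1_", the example query reads from site1_observation_fact. Validating the prefix is an addition of this sketch: table names cannot be bound as JDBC parameters, so the prefix has to be checked before it is concatenated into the statement.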
I2B2 Data Access Manager
The i2b2 Data Access Manager class handles all actual database access operations for the query processor.
Database Connection Source
This is an interface which the query processor uses to obtain JDBC connections. It provides only two methods: getConnection(), which returns a JDBC connection, and shutdown(), which is a callback from the query processor to the connection source indicating that any resources it has allocated should be released.
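A minimal sketch of this two-method contract is shown below. The method names getConnection() and shutdown() come from the description above. The interface name, the exception declaration, and the DriverManager-based class are assumptions made for illustration, and the class is not the pooled implementation the project provides.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Assumed shape of the interface; only the two method names come from the text above.
interface DatabaseConnectionSource {
    // Returns a JDBC connection for the query processor to use.
    Connection getConnection() throws SQLException;

    // Called by the query processor when allocated resources should be released.
    void shutdown();
}

// Hypothetical non-pooled implementation that opens a new connection per request.
class SimpleConnectionSource implements DatabaseConnectionSource {
    private final String jdbcUrl;
    private final String user;
    private final String password;
    private boolean shutDown = false;

    SimpleConnectionSource(String jdbcUrl, String user, String password) {
        this.jdbcUrl = jdbcUrl;
        this.user = user;
        this.password = password;
    }

    public synchronized Connection getConnection() throws SQLException {
        if (shutDown) {
            throw new SQLException("Connection source has been shut down");
        }
        return DriverManager.getConnection(jdbcUrl, user, password);
    }

    // Nothing is pooled here, so shutdown only blocks further connection requests.
    public synchronized void shutdown() {
        shutDown = true;
    }
}
```

Keeping the interface this small means the query processor does not need to know whether connections are pooled, created on demand, or supplied by a container.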
Pooled Database Connection Source
The Pooled Database Connection Source is a concrete implementation of the database connection source interface which uses Apache's Database Connection Pooling (DBCP) API to provide a managed pool of JDBC connections.
Querying the i2b2 database
Schema Details
i2b2 MySQL database schemas
The i2b2 database schemas from the i2b2 distribution are written for Oracle and MSSQL. The schemas included in this project have been ported to MySQL 5 and edited to be compatible with the contents of UCSF's i2b2 installation on a Sybase IQ database.
Two .sql scripts are included in the project to reproduce the database schema:
- i2b2-mysql-structure.sql
- This script creates the tables and relevant primary keys used by the database.
- i2b2-mysql-foreign_keys.sql
- This script connects the tables created by the structure schema together using foreign key constraints. It is meant to be executed after importing data into the tables.





