As Semantic Web technologies mature, there is a growing need for RDF applications to access the content of huge, live, non-RDF legacy databases without having to replicate the whole database into RDF. This document describes the D2RQ mapping language for treating non-RDF relational databases as virtual RDF graphs, and the D2RQ Platform, which enables applications to access these graphs through the Jena and Sesame APIs, over the Web via the SPARQL Protocol, and as Linked Data.
This document describes the D2RQ Platform for accessing non-RDF relational databases as virtual, read-only RDF graphs. D2RQ offers a variety of RDF-based access mechanisms to the content of huge, non-RDF databases without having to replicate the database into RDF.
Using D2RQ you can:
The D2RQ Platform consists of:
The figure below depicts the architecture of the D2RQ Platform:
The D2RQ Engine is implemented as a Jena graph, the basic information representation object within the Jena framework. A D2RQ graph wraps a local relational database into a virtual, read-only RDF graph. It rewrites Jena and Sesame API calls, find() calls and SPARQL queries into SQL queries that are specific to the application data model. The result sets of these SQL queries are transformed into RDF triples or SPARQL result sets that are passed up to the higher layers of the framework. The D2RQ Sesame interface wraps the D2RQ Jena graph implementation behind a Sesame RDF source interface. It provides a read-only Sesame repository interface for querying and reasoning with RDF and RDF Schema.
D2R Server is a tool for publishing relational databases on the Semantic Web. It enables RDF and HTML browsers to navigate the content of the database, and allows applications to query the database using the SPARQL query language. D2R Server builds on the D2RQ Engine. For detailed information on how to set up D2R Server please refer to the separate D2R Server website.
Example
Throughout this manual, we use an example database that stores information about conferences, papers, authors and topics. The database is mapped to the International Semantic Web Community (ISWC) Ontology.
The D2RQ Platform has been tested with these database engines:
The D2RQ Platform comes with two command line tools: a mapping generator that creates a default mapping file by analyzing the schema of an existing database, and a dump script that writes the complete contents of a database into a single RDF file. The scripts work on Windows and Unix systems.
The generate-mapping script creates a default mapping file by analyzing the schema of an existing database. This mapping file can be used as-is or can be customized.
generate-mapping [-u username] [-p password] [-d driverclass] [-o outfile.n3] [-b base uri] jdbcURL
JDBC connection URL for the database. Refer to your JDBC driver documentation for the format for your database engine. Examples:
MySQL: jdbc:mysql://servername/databasename
PostgreSQL: jdbc:postgresql://servername/databasename
Oracle: jdbc:oracle:thin:@servername:1521:databasename
Microsoft SQL Server: jdbc:sqlserver://servername;databaseName=databasename (due to the semicolon, the URL must be put in quotes when passed as a command-line argument in Linux/Unix shells)
The fully qualified Java class name of the database driver. The jar file containing the JDBC driver has to be in D2RQ's /lib/db-drivers/ directory. Drivers for MySQL and PostgreSQL are provided with the download; for other databases, a driver has to be downloaded from the vendor or a third party. To find the driver class name, consult the driver documentation. Examples:
MySQL: com.mysql.jdbc.Driver
PostgreSQL: org.postgresql.Driver
Oracle: oracle.jdbc.OracleDriver
Microsoft SQL Server: com.microsoft.sqlserver.jdbc.SQLServerDriver
Example invocation for a local MySQL database:
generate-mapping -d com.mysql.jdbc.Driver -u root jdbc:mysql://127.0.0.1/iswc
The dump-rdf script provides a way of dumping the contents of the whole database into a single RDF file. This can be done with or without a mapping file. If a mapping file is specified, then the script will use it to translate the database contents to RDF. If no mapping file is specified, then the script will invoke generate-mapping and use its default mapping for the translation.
dump-rdf -m mapping.n3 [output parameters]
If no mapping file is provided, then the database connection must be specified on the command line. With the exception of fetchSize, the meaning of all parameters is the same as for the generate-mapping script.
dump-rdf -u username [-p password] -d driverclass -j jdbcURL [-f fetchSize] [output parameters]
Several optional parameters control the RDF output:
Example invocation using a mapping file:
dump-rdf -m mapping-iswc.n3 -f N-TRIPLE -b http://localhost:2020/ > iswc.nt
This section describes how the D2RQ Engine is used within the Jena 2 Semantic Web framework.
Download
D2RQ can be downloaded from http://sourceforge.net/projects/d2rq-map/
Jena Versions
At the time of writing, the latest Jena release is version 2.4. D2RQ requires a more recent custom-built version of Jena, the version that ships with ARQ 1.4. All required jar files are included in the D2RQ distribution. (Jena 2.4 and 2.3 may work to some extent.)
Installation
Debugging
D2RQ uses the Apache Commons Logging API for logging. To enable D2RQ debug messages, set the log level for the logger de.fuberlin.wiwiss.d2rq to ALL. An easy way to do this is:
org.apache.log4j.Logger.getLogger( "de.fuberlin.wiwiss.d2rq").setLevel( org.apache.log4j.Level.ALL);
The ModelD2RQ class provides a Jena Model view on the data in a D2RQ-mapped database. The example shows how a ModelD2RQ is set up using a mapping file, and how Jena API calls are used to extract information about papers and their authors from the model.
The ISWC and FOAF classes have been created with Jena's schemagen tool. The DC and RDF classes are part of Jena.
// Set up the ModelD2RQ using a mapping file
Model m = new ModelD2RQ("file:doc/example/mapping-iswc.n3");

// Find anything with an rdf:type of iswc:InProceedings
StmtIterator paperIt = m.listStatements(null, RDF.type, ISWC.InProceedings);

// List found papers and print their titles
while (paperIt.hasNext()) {
    Resource paper = paperIt.nextStatement().getSubject();
    System.out.println("Paper: " + paper.getProperty(DC.title).getString());

    // List authors of the paper and print their names
    StmtIterator authorIt = paper.listProperties(DC.creator);
    while (authorIt.hasNext()) {
        Resource author = authorIt.nextStatement().getResource();
        System.out.println("Author: " + author.getProperty(FOAF.name).getString());
    }
    System.out.println();
}
In some situations, it is better to use Jena's low-level Graph API instead of the Model API. D2RQ provides an implementation of the Graph interface, the GraphD2RQ.
The following example shows how the Graph API is used to find all papers that have been published in 2003.
// Load mapping file
Model mapping = FileManager.get().loadModel("doc/example/mapping-iswc.n3");

// Set up the GraphD2RQ
GraphD2RQ g = new GraphD2RQ(mapping, "http://localhost:2020/");

// Create a find(spo) pattern
Node subject = Node.ANY;
Node predicate = DC.date.asNode();
Node object = Node.createLiteral("2003", null, XSDDatatype.XSDgYear);
Triple pattern = new Triple(subject, predicate, object);

// Query the graph
Iterator it = g.find(pattern);

// Output query results
while (it.hasNext()) {
    Triple t = (Triple) it.next();
    System.out.println("Published in 2003: " + t.getSubject());
}
In addition to the GraphD2RQ, there is a CachingGraphD2RQ which supports the same API and uses an LRU cache to remember a number of recent query results. This improves performance for repeated queries, but it will report stale, inconsistent results if the database is updated during the lifetime of the CachingGraphD2RQ.
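The trade-off described above can be illustrated with a minimal LRU cache sketch. This is not the CachingGraphD2RQ source; the class name, the string keys and the capacity of 2 are hypothetical, and the sketch only shows how least-recently-used eviction behaves and why a cached answer cannot reflect later database updates.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of an LRU result cache; not D2RQ code.
class LruCacheSketch {

    // Builds a map that evicts its least recently used entry beyond 'capacity'.
    static <K, V> Map<K, V> lruCache(final int capacity) {
        // accessOrder = true: iteration order follows recency of access
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, List<String>> cache = lruCache(2);
        cache.put("(ANY dc:date 2003)", List.of("paper1", "paper2"));
        cache.put("(ANY dc:date 2004)", List.of("paper3"));
        cache.get("(ANY dc:date 2003)");                   // 2003 becomes most recent
        cache.put("(ANY dc:date 2005)", List.of("paper4")); // evicts the 2004 entry
        System.out.println(cache.keySet());
    }
}
```

A cached entry is returned as stored, so a row inserted into the database after the first query would not show up until the entry is evicted.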
D2RQ can answer SPARQL queries against a D2RQ model. The SPARQL queries are processed by Jena's ARQ query engine. The example shows how a D2RQ model is set up, how a SPARQL query is executed, and how the results are written to the console.
ModelD2RQ m = new ModelD2RQ("file:doc/example/mapping-iswc.n3");
String sparql =
    "PREFIX dc: <http://purl.org/dc/elements/1.1/>" +
    "PREFIX foaf: <http://xmlns.com/foaf/0.1/>" +
    "SELECT ?paperTitle ?authorName WHERE {" +
    " ?paper dc:title ?paperTitle . " +
    " ?paper dc:creator ?author ." +
    " ?author foaf:name ?authorName ." +
    "}";
Query q = QueryFactory.create(sparql);
ResultSet rs = QueryExecutionFactory.create(q, m).execSelect();
while (rs.hasNext()) {
    QuerySolution row = rs.nextSolution();
    System.out.println("Title: " + row.getLiteral("paperTitle").getString());
    System.out.println("Author: " + row.getLiteral("authorName").getString());
}
D2RQ comes with a Jena assembler. Jena assembler specifications are RDF configuration files that describe how to construct a Jena model. For more information on Jena assemblers, see the Jena Assembler quickstart page.
The following example shows an assembler specification for a D2RQ model:
@prefix : <#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .

<> ja:imports d2rq: .

:myModel a d2rq:D2RQModel;
    d2rq:mappingFile <mapping-iswc.n3>;
    d2rq:resourceBaseURI <http://localhost:2020/>;
    .
D2RQ model specifications support these two properties:
This usage example will create a D2RQ model from a model specification, and write it to the console:
// Load assembler specification from file
Model assemblerSpec = FileManager.get().loadModel("doc/example/assembler.n3");

// Get the model resource
Resource modelSpec = assemblerSpec.createResource(assemblerSpec.expandPrefix(":myModel"));

// Assemble a model
Model m = Assembler.general.openModel(modelSpec);

// Write it to System.out
m.write(System.out);
Some of the D2RQ unit tests use the ISWC example database from the /doc/example directory of the D2RQ distribution. To run the tests:
This section describes how the D2RQ Engine is used within the Sesame 1.2 RDF API.
Download
You have to download the following packages:
Installation
You have to add the "d2rq.jar" and "d2rq-to-sesame.jar" files from the "bin" directory of the D2RQ distribution together with the Jena 2 and Sesame 1.2 jar files to your classpath. To run D2RQ only the jar files
The following example shows how RDQL is used to get all information about the paper with the URI "http://www.conference.org/conf02004/paper#Paper1" out of a D2RQRepository.
import de.fuberlin.wiwiss.d2rq.sesame.D2RQRepository;
import de.fuberlin.wiwiss.d2rq.sesame.D2RQSource;
import org.openrdf.model.Value;
import org.openrdf.sesame.Sesame;
import org.openrdf.sesame.constants.QueryLanguage;
import org.openrdf.sesame.query.QueryResultsTable;
import org.openrdf.sesame.repository.SesameRepository;

...

try {
    // Initialize repository
    D2RQSource source = new D2RQSource("file:///where/you/stored/the/d2rq-mapping.n3", "N3");
    SesameRepository repos = new D2RQRepository("urn:youRepository", source, Sesame.getService());

    // Query the repository
    String query = "SELECT ?x, ?y WHERE (<http://www.conference.org/conf02004/paper#Paper1>, ?x, ?y)";
    QueryResultsTable result = repos.performTableQuery(QueryLanguage.RDQL, query);

    // Print the result
    int rows = result.getRowCount();
    int cols = result.getColumnCount();
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            Value v = result.getValue(i, j);
            System.out.print(v.toString() + " ");
        }
        System.out.println();
    }
} catch (Exception e) {
    // Catches D2RQException from the D2RQSource constructor.
    // Catches java.io.IOException,
    // org.openrdf.sesame.query.MalformedQueryException,
    // org.openrdf.sesame.query.QueryEvaluationException and
    // org.openrdf.sesame.config.AccessDeniedException
    // from performTableQuery.
    e.printStackTrace();
}
The D2RQ mapping language is a declarative language for describing the relation between a relational database schema and RDFS vocabularies or OWL ontologies. A D2RQ map is an RDF document.
The language is formally defined by the D2RQ RDFS Schema.
The D2RQ namespace is http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#
An ontology is mapped to a database schema using d2rq:ClassMaps and d2rq:PropertyBridges. The central object within D2RQ and also the object to start with when writing a new D2RQ map is the ClassMap. A ClassMap represents a class or a group of similar classes of the ontology. A ClassMap specifies how instances of the class are identified. It has a set of PropertyBridges, which specify how the properties of an instance are created.
The figure below shows the structure of an example D2RQ map:
The following example D2RQ map relates the table conferences in a database to the class conference in an ontology. You can use the map as a template for writing your own maps.
# D2RQ Namespace
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .

# Namespace of the ontology
@prefix : <http://annotation.semanticweb.org/iswc/iswc.daml#> .

# Namespace of the mapping file; does not appear in mapped data
@prefix map: <file:///Users/d2r/example.n3#> .

# Other namespaces
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

map:Database1 a d2rq:Database;
    d2rq:jdbcDSN "jdbc:mysql://localhost/iswc";
    d2rq:jdbcDriver "com.mysql.jdbc.Driver";
    d2rq:username "user";
    d2rq:password "password";
    .

# -----------------------------------------------
# CREATE TABLE Conferences (ConfID int, Name text, Location text);

map:Conference a d2rq:ClassMap;
    d2rq:dataStorage map:Database1;
    d2rq:class :Conference;
    d2rq:uriPattern "http://conferences.org/comp/confno@@Conferences.ConfID@@";
    .

map:eventTitle a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Conference;
    d2rq:property :eventTitle;
    d2rq:column "Conferences.Name";
    d2rq:datatype xsd:string;
    .

map:location a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Conference;
    d2rq:property :location;
    d2rq:column "Conferences.Location";
    d2rq:datatype xsd:string;
    .
The constructs of the D2RQ mapping language are described in detail below.
A d2rq:Database defines a JDBC or ODBC connection to a local relational database and specifies the type of the database columns used by D2RQ. A D2RQ map can contain several d2rq:Databases for accessing different local databases.
Properties
d2rq:jdbcDSN | The JDBC database URL. This is a string of the form jdbc:subprotocol:subname. For a MySQL database, this is something like jdbc:mysql://hostname:port/dbname. Examples for other databases are listed in the description of the generate-mapping script. |
d2rq:jdbcDriver | The JDBC driver class name for the database. Used together with d2rq:jdbcDSN. Example: com.mysql.jdbc.Driver for MySQL. |
d2rq:odbcDSN | The ODBC data source name of the database. |
d2rq:username | A username if required by the database. |
d2rq:password | A password if required by the database. |
d2rq:resultSizeLimit | An integer value that will be added as a LIMIT clause to all generated SQL queries. This sets an upper bound for the number of results returned from large databases. Note that this effectively “cripples” the server and can cause unpredictable results. Also see d2rq:limit and d2rq:limitInverse, which may be used to impose result limits on individual property bridges. |
d2rq:fetchSize | An integer value that specifies the number of rows to retrieve with every database request. This value is particularly important for controlling the memory use of both D2RQ and the database server when performing dumps. dump-rdf sets this value to 500 by default, or to Integer.MIN_VALUE for MySQL in order to enable streaming mode. |
d2rq:allowDistinct | Specifies the database's ability to handle DISTINCT correctly. Value: "true" or "false". For example, MS Access cuts fields longer than 256 characters. |
d2rq:textColumn, d2rq:numericColumn, d2rq:dateColumn, d2rq:timestampColumn | These properties are used to declare the column type of database columns. Values are column names in Table_name.column_name notation. These properties do not need to be specified unless the engine is for some reason unable to determine the correct column type by itself. The d2rq:timestampColumn is for column types that combine a date and a time. |
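The fetch size defaults mentioned for d2rq:fetchSize can be sketched as a small helper. The class and method names are hypothetical, and recognizing MySQL by its JDBC URL prefix is an assumption made for this illustration; only the two values (500 and Integer.MIN_VALUE) come from the description above. The chosen value would then be handed to the standard JDBC call java.sql.Statement.setFetchSize(int).

```java
// Hypothetical helper illustrating the fetch size defaults described for dump-rdf.
class FetchSizeSketch {

    // Assumption: MySQL is detected from the JDBC URL prefix.
    static int defaultFetchSize(String jdbcUrl) {
        // Integer.MIN_VALUE asks the MySQL driver to stream rows one by one
        return jdbcUrl.startsWith("jdbc:mysql:") ? Integer.MIN_VALUE : 500;
    }

    public static void main(String[] args) {
        System.out.println(defaultFetchSize("jdbc:postgresql://servername/databasename"));
        System.out.println(defaultFetchSize("jdbc:mysql://127.0.0.1/iswc") == Integer.MIN_VALUE);
    }
}
```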
Example
map:Database1 a d2rq:Database;
    d2rq:jdbcDSN "jdbc:mysql://localhost/iswc";
    d2rq:jdbcDriver "com.mysql.jdbc.Driver";
    d2rq:username "user";
    d2rq:password "password";
    d2rq:numericColumn "Conferences.ConfID";
    d2rq:textColumn "Conferences.URI";
    d2rq:textColumn "Conferences.Name";
    d2rq:textColumn "Conferences.Location";
    d2rq:dateColumn "Conferences.Date".
Specifying JDBC connection properties
Most JDBC drivers offer a range of JDBC connection properties, which specify advanced configuration options for the JDBC database connection. A D2RQ mapping file can be made to use arbitrary connection properties when setting up the JDBC connection. This is done through the jdbc: namespace (namespace URI: http://d2rq.org/terms/jdbc/). RDF properties in that namespace will be passed as connection properties. Consult your JDBC driver's documentation for a list of available properties.
@prefix jdbc: <http://d2rq.org/terms/jdbc/> .

map:database a d2rq:Database;
    # ... other database configuration ...
    jdbc:autoReconnect "true";
    jdbc:zeroDateTimeBehavior "convertToNull";
    .
The example uses two connection properties which are understood by the MySQL JDBC driver: autoReconnect=true and zeroDateTimeBehavior=convertToNull.
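How properties in the jdbc: namespace could end up as JDBC connection properties can be sketched as follows. This is an illustration under assumptions, not D2RQ's implementation: the class and method names are hypothetical, and the mapping's properties are represented as a plain map from property URI to value. The resulting java.util.Properties object is the kind of argument accepted by the standard DriverManager.getConnection(String, Properties) call.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch: turn jdbc:-namespace RDF properties into connection properties.
class JdbcPropertiesSketch {
    static final String JDBC_NS = "http://d2rq.org/terms/jdbc/";

    static Properties toConnectionProperties(Map<String, String> rdfProperties) {
        Properties props = new Properties();
        for (Map.Entry<String, String> e : rdfProperties.entrySet()) {
            // Only properties in the jdbc: namespace are passed on;
            // the local name becomes the connection property name.
            if (e.getKey().startsWith(JDBC_NS)) {
                props.setProperty(e.getKey().substring(JDBC_NS.length()), e.getValue());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> rdf = new LinkedHashMap<>();
        rdf.put(JDBC_NS + "autoReconnect", "true");
        rdf.put(JDBC_NS + "zeroDateTimeBehavior", "convertToNull");
        rdf.put("http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#username", "user");
        Properties props = toConnectionProperties(rdf);
        System.out.println(props.getProperty("autoReconnect"));
        System.out.println(props.getProperty("zeroDateTimeBehavior"));
    }
}
```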
Keep-alive long-term connections
Some database servers, such as MySQL, may terminate open client connections after some interval (the MySQL default is 8 hours). To keep long-term connections alive, D2RQ can be configured to periodically run "noop" queries. This feature is enabled with the special property jdbc:keepAlive. An example is given below:
@prefix jdbc: <http://d2rq.org/terms/jdbc/> .

map:database a d2rq:Database;
    # ... other database configuration ...
    jdbc:keepAlive "3600"; # value in seconds
    jdbc:keepAliveQuery "SELECT 1"; # (optional, overrides the default noop query)
    .
By default the noop query is "SELECT 1", which may not work with some DBMS. In that case, the default query can be overridden with a custom noop query using jdbc:keepAliveQuery.
A d2rq:ClassMap represents a class or a group of similar classes of an OWL ontology or RDFS schema. A class map defines how instances of the class are identified. It is connected to a d2rq:Database and has a set of d2rq:PropertyBridges which attach properties to the instances.
D2RQ provides four different mechanisms of assigning identifiers to the instances in the database:
A URI pattern is instantiated by inserting values of certain database columns into a pattern. Examples:
http://example.org/persons/@@Persons.ID@@
http://example.org/lineItems/item@@Orders.orderID@@-@@LineItems.itemID@@
urn:isbn:@@Books.isbn@@
mailto:@@Persons.email@@
The parts between @@'s mark database columns in Table.Column notation. URI patterns are used with the d2rq:uriPattern property.
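The instantiation of a URI pattern can be sketched as a simple substitution over the @@Table.Column@@ placeholders. This is an illustration only, not the D2RQ implementation: the class and method names are hypothetical, a database row is represented as a map from column name to value, and returning null when a column is missing is an assumption of the sketch. No encoding of special characters is performed.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of URI pattern instantiation; not D2RQ code.
class UriPatternSketch {
    // Matches a placeholder such as @@Persons.ID@@ and captures the column name
    private static final Pattern COLUMN = Pattern.compile("@@([^@]+)@@");

    static String expand(String pattern, Map<String, String> row) {
        Matcher m = COLUMN.matcher(pattern);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            String value = row.get(m.group(1));
            if (value == null) {
                return null; // assumption: no URI for rows with a missing value
            }
            // Copy the literal text before the placeholder, then the column value
            out.append(pattern, last, m.start()).append(value);
            last = m.end();
        }
        out.append(pattern.substring(last));
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = Map.of("Orders.orderID", "7", "LineItems.itemID", "3");
        // prints http://example.org/lineItems/item7-3
        System.out.println(expand(
            "http://example.org/lineItems/item@@Orders.orderID@@-@@LineItems.itemID@@", row));
    }
}
```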
Certain characters, like spaces or the hash sign, are not allowed in URIs or have special meaning. Columns that contain such characters need to be encoded before their values can be inserted into a URI pattern:
A relative URI pattern is a URI pattern that generates relative URIs:
persons/@@Persons.ID@@
They will be combined with a base URI provided by the processing environment to form full URIs. Relative URI patterns allow the creation of portable mappings that can be used for multiple instances of the same database schema. Relative URI patterns are generated with the d2rq:uriPattern property.
In some cases, the database may already contain URIs that can be used as resource identifiers, such as web page and document URLs. URIs are generated from columns with the d2rq:uriColumn property.
RDF also has the concept of blank nodes, existential quantifiers that denote some resource that exists and has certain properties, but is not named. In D2RQ, blank nodes can be generated from one or more columns. A distinct blank node will be generated for each distinct set of values of these columns. The columns are specified using the d2rq:bNodeIdColumns property.
A d2rq:ClassMap usually produces many resources. Sometimes it is desirable to have a class map that only produces a single resource with fixed, static identity. In that case, one can use the d2rq:constantValue property to name the single instance.
d2rq:dataStorage | Reference to a d2rq:Database where the instance data is stored. |
d2rq:class | An RDF-S or OWL class. All resources generated by this ClassMap are instances of this class. |
d2rq:uriPattern | Specifies a URI pattern that will be used to identify instances of this class map. |
d2rq:uriColumn | A database column containing URIrefs for identifying instances of this class map. The column name has to be in the form "TableName.ColumnName". |
d2rq:bNodeIdColumns | A comma-separated list of column names in "TableName.ColumnName" notation. The instances of this class map will be blank nodes, one distinct blank node per distinct tuple of these columns. |
d2rq:constantValue | This class map will only have a single instance, which is named by the value of this property. This can be a blank node or a URI. |
d2rq:translateWith | Assigns a d2rq:TranslationTable to the class map. Values from the d2rq:uriColumn or d2rq:uriPattern will be translated by the table before a resource is generated. See below for details. |
d2rq:containsDuplicates | Must be specified if a class map uses information from tables that are not fully normalized. If the d2rq:containsDuplicates property value is set to "true", then D2RQ adds a DISTINCT clause to all queries using this classMap. "False" is the default value, which doesn't have to be explicitly declared. Adding this property to class maps based on normalized database tables degrades query performance, but doesn't affect query results. |
d2rq:additionalProperty | Adds an AdditionalProperty to all instances of this class. This might be useful for adding rdfs:seeAlso properties or other fixed statements to all instances of the class. |
d2rq:condition | Specifies an SQL WHERE condition. An instance of this class will only be generated for database rows that satisfy the condition. Conditions can be used to hide parts of the database from D2RQ, e.g. deny access to data which is older or newer than a certain date. See section Conditional Mappings for details. |
d2rq:classDefinitionLabel | Specifies a label that will be served as rdfs:label for all associated class definitions. Multiple labels, e.g. in several languages, are supported. |
d2rq:classDefinitionComment | Specifies a comment that will be served as rdfs:comment for all associated class definitions. Multiple comments are supported. |
d2rq:additionalClassDefinitionProperty | Adds an AdditionalProperty to all associated class definitions. |
ClassMap property:
d2rq:classMap | Inverse of d2rq:class and unnecessary if d2rq:class is used. Specifies that a d2rq:ClassMap is used to create instances of an OWL or RDF-S class. |
Example: ClassMap where instances are identified using a URI pattern
map:PaperClassMap a d2rq:ClassMap;
    d2rq:uriPattern "http://www.conference.org/conf02004/paper#Paper@@Papers.PaperID@@";
    d2rq:class :Paper;
    d2rq:classDefinitionLabel "paper"@en;
    d2rq:classDefinitionComment "A conference paper."@en;
    d2rq:dataStorage map:Database1.
The d2rq:class property is used to state that all resources generated by the d2rq:ClassMap are instances of an RDFS or OWL class. D2RQ automatically creates the necessary rdf:type triples.
Example: ClassMap where instances are identified using blank nodes
map:Topic a d2rq:ClassMap ;
    d2rq:bNodeIdColumns "Topics.TopicID" ;
    d2rq:class :Topic ;
    d2rq:classDefinitionLabel "topic"@en;
    d2rq:classDefinitionComment "A topic."@en;
    d2rq:dataStorage map:Database1 .
In order to recognize bNodes across several find() calls and to be able to map bNodes to instance data in the database, D2RQ encodes the classMap name together with the primary key values in the bNode label. The map above could produce the bNode label "http://www.example.org/dbserver01/db01#Topic@@6", where the number "6" is a primary key value and "http://www.example.org/dbserver01/db01#Topic" is the ClassMap name.
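The label scheme described above can be sketched as an encode and decode pair. The class and method names are hypothetical, and using the same "@@" separator between several key values is an assumption of this sketch; the text above only shows a label with a single key value.

```java
// Hypothetical sketch of blank node labels that carry the ClassMap name and key values.
class BNodeLabelSketch {
    static final String SEPARATOR = "@@";

    // Joins the ClassMap name and the key values into one label
    static String encode(String classMapName, String... keyValues) {
        return classMapName + SEPARATOR + String.join(SEPARATOR, keyValues);
    }

    // Splits a label again: element 0 is the ClassMap name, the rest are key values.
    // Assumption: neither the name nor the values contain the separator.
    static String[] decode(String label) {
        return label.split(SEPARATOR);
    }

    public static void main(String[] args) {
        String label = encode("http://www.example.org/dbserver01/db01#Topic", "6");
        // prints http://www.example.org/dbserver01/db01#Topic@@6
        System.out.println(label);
        String[] parts = decode(label);
        System.out.println(parts[0] + " with key " + parts[1]);
    }
}
```

Because the label carries everything needed to locate the row, a blank node returned by one find() call can be matched against the database again in a later call.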
Example: ClassMap for a group of classes with the same properties
If you want to use one ClassMap for a group of classes with the same properties (like Person, Professor, Researcher, Student) that all come from the same table, you must create the rdf:type statements with an object property bridge instead of using d2rq:class.
map:PersonsClassMap a d2rq:ClassMap ;
    d2rq:uriColumn "Persons.URI" ;
    d2rq:dataStorage map:Database1 .

map:PersonsType a d2rq:PropertyBridge ;
    d2rq:property rdf:type ;
    d2rq:pattern "http://annotation.semanticweb.org/iswc/iswc.daml#@@Persons.Type@@" ;
    d2rq:belongsToClassMap map:PersonsClassMap .
Here, the class of each person is obtained by prefixing the values of the Persons.Type column with an ontology namespace. If the class names within the ontology can't be constructed directly from values of the Persons.Type column, then a TranslationTable could be used for aligning class names and database values.
Property Bridges relate database table columns to RDF properties. They are used to attach properties to the RDF resources created by a class map. The values of these properties are often literals, but can also be URIs or blank nodes that relate the resource to other resources, e.g. the value of a paper's :author property could be a URI representing a person.
If one of the columns used in a property bridge is NULL for some database rows, then no property is created for the resources corresponding to these rows.
Properties
d2rq:belongsToClassMap | Specifies that the property bridge belongs to a d2rq:ClassMap. Must be specified for every property bridge. |
d2rq:property | The RDF property that connects the ClassMap with the object or literal created by the bridge. Must be specified for every property bridge. If multiple d2rq:properties are specified, then one triple with each property is generated per resource. |
d2rq:dynamicProperty | A URI pattern that is used to generate the property URI at runtime. If multiple d2rq:dynamicProperty are specified, then one triple with each property is generated per resource. |
d2rq:column | For properties with literal values. The database column that contains the literal values. Column names have to be given in the form "TableName.ColumnName". |
d2rq:pattern | For properties with literal values. Can be used to extend and combine column values before they are used as a literal property value. If a pattern contains more than one column, then a separating string, which cannot occur in the column values, has to be used between the column names, so that D2RQ can reverse given literals into column values. |
d2rq:sqlExpression | For properties with literal values. Generates literal values by evaluating a SQL expression. Note that querying for such a computed value might put a heavy load on the database. See example below. |
d2rq:datatype | For properties with literal values. Specifies the RDF datatype of the literals. |
d2rq:lang | For properties with literal values. Specifies the language tag of the literals. |
d2rq:uriColumn | For properties with URI values. Database column that contains URIs. Column names have to be given in the form "TableName.ColumnName". |
d2rq:uriPattern | For properties with URI values. Can be used to extend and combine column values before they are used as URI property values. If a pattern contains more than one column, then a separating string, which cannot occur in the column values, has to be used between the column names, so that D2RQ can reverse given URIs into column values. See example below. |
d2rq:uriSqlExpression | For properties with URI values and similar to d2rq:sqlExpression. Generates URIs by evaluating an SQL expression (the output must be a valid URI). Note that querying for such a computed value might put a heavy load on the database. See example below. |
d2rq:refersToClassMap | For properties that correspond to a foreign key. References another d2rq:ClassMap that creates the instances which are used as the values of this bridge. One or more d2rq:join properties must be specified to select the correct instances. See example below. |
d2rq:constantValue | For properties that have the same constant value on all instances of the class map. The value can be a literal, blank node, or URI. See example below. |
d2rq:join | If the columns used to create the literal value or object are not from the database table(s) that contains the ClassMap's columns, then the tables have to be joined together using one or more d2rq:join properties. See example below. |
d2rq:alias | Aliases take the form "Table AS Alias" and are used when a table needs to be joined to itself. The table can be referred to using the alias within the property bridge. See example below. |
d2rq:condition | Specifies an SQL WHERE condition. The bridge will only generate a statement if the condition holds. A common usage is to suppress triples with empty literal values: d2rq:condition "Table.Column <> ''". See section Conditional Mappings for details. |
d2rq:translateWith | Assigns a d2rq:TranslationTable to the property bridge. Values from the d2rq:column or d2rq:pattern will be translated by the table. See section TranslationTables for details. |
d2rq:valueMaxLength | Asserts that all values of this bridge are not longer than a number of characters. This allows D2RQ to speed up queries. See section Performance Optimization for details. |
d2rq:valueContains | Asserts that all values of this bridge always contain a given string. This allows D2RQ to speed up queries. Most useful in conjunction with d2rq:column. See section Performance Optimization for details. |
d2rq:valueRegex | Asserts that all values of this bridge match a given regular expression. This allows D2RQ to speed up queries. Most useful in conjunction with d2rq:column on columns whose values are very different from other columns in the database. See section Performance Optimization for details. |
d2rq:propertyDefinitionLabel | Specifies a label that will be served as rdfs:label for all associated property definitions. Multiple labels, e.g. in several languages, are supported. |
d2rq:propertyDefinitionComment | Specifies a comment that will be served as rdfs:comment for all associated property definitions. Multiple comments are supported. |
d2rq:additionalPropertyDefinitionProperty | Adds an AdditionalProperty to all associated property definitions. |
d2rq:limit | The maximum number of results to retrieve from the database for this PropertyBridge. Also see d2rq:resultSizeLimit. |
d2rq:limitInverse | The maximum number of results to retrieve from the database for the inverse statements for this PropertyBridge. |
d2rq:orderAsc | The column by which to sort results in ascending order for this PropertyBridge. |
d2rq:orderDesc | The column by which to sort results in descending order for this PropertyBridge. |
PropertyBridge property:
d2rq:propertyBridge | Inverse of d2rq:property and not needed if d2rq:property is used. The d2rq:propertyBridge property specifies which property bridge is used for an RDF property. If the same RDF property is used by several RDF classes, then several property bridges are used to relate the RDF property to the different class maps. |
Example: A simple property bridge
map:PaperTitle a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Paper;
    d2rq:property :title;
    d2rq:column "Papers.Title";
    d2rq:lang "en";
    d2rq:propertyDefinitionLabel "title"@en;
    d2rq:propertyDefinitionComment "A paper's title."@en;
    .
This attaches a :title property to all resources generated by the map:Paper class map. The property values are taken from the Papers.Title column. The generated literals will have a language tag of "en".
Example: Property bridge using information from another database table
map:authorName a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Papers;
    d2rq:property :authorName;
    d2rq:column "Persons.Name";
    d2rq:join "Papers.PaperID <= Rel_Person_Paper.PaperID";
    d2rq:join "Rel_Person_Paper.PersonID => Persons.PerID";
    d2rq:datatype xsd:string;
    d2rq:propertyDefinitionLabel "name"@en;
    d2rq:propertyDefinitionComment "Name of an author."@en;
    .
This property bridge adds the names of authors to papers. If a paper has several authors, then several :authorName properties are added. The tables Papers and Persons are in an n:m relation, so two d2rq:join properties are used to join them via the Rel_Person_Paper table. The join conditions contain directed arrows that indicate the foreign key relationship and are used as an optimization hint. In the example above, the arrow directions indicate that all possible values of Rel_Person_Paper.PaperID and Rel_Person_Paper.PersonID are present in Papers.PaperID and Persons.PerID, respectively. Where this is unclear, a simple equals sign (=) may be used.
Example: A property bridge with mailto: URIs
map:PersonsClassEmail a d2rq:PropertyBridge; d2rq:belongsToClassMap map:PersonsClassMap; d2rq:property :email; d2rq:uriPattern "mailto:@@Persons.Email@@"; .
The pattern mailto:@@Persons.Email@@ is used to attach a mailto: prefix to the values of the "Persons.Email" column. The example uses d2rq:uriPattern instead of d2rq:pattern because the bridge should produce URIs, not literals.
Example: A property bridge that computes mailbox hashes
The popular FOAF vocabulary has a property foaf:mbox_sha1sum for publishing email addresses in an encoded form. This prevents spammers from harvesting the address, while still letting us recognize if the same email address is used in two different places.
map:UserEmailSHA1 a d2rq:PropertyBridge; d2rq:belongsToClassMap map:User; d2rq:property foaf:mbox_sha1sum; d2rq:sqlExpression "SHA1(CONCAT('mailto:', user.email))"; .
The values of the foaf:mbox_sha1sum property are computed by evaluating the d2rq:sqlExpression. We first create a mailto: URI from the email address, as required by FOAF. Then we apply the SHA1 hash function, again as required by FOAF. The result will be a literal value.
Note that querying for a specific foaf:mbox_sha1sum value will put a heavy load on the database because the hash has to be computed for every user in the database. For a large database, it would be better to store the encoded values in a column in the database.
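The alternative mentioned above, storing the encoded values in the database, could look roughly like the following sketch. It assumes a hypothetical column user.email_sha1 that already holds the SHA1 hash of the mailto: URI; that column and the example.org mapping namespace are not part of the original example.

```turtle
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix map:  <http://example.org/mapping#> .   # hypothetical mapping namespace

# Reads the precomputed hash from a column instead of computing it per query.
map:UserEmailSHA1 a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:User;
    d2rq:property foaf:mbox_sha1sum;
    d2rq:column "user.email_sha1";   # hypothetical column with the stored hash
    .
```

Because the bridge is based on a plain d2rq:column, a lookup for a specific hash value can be answered by a simple comparison on that column, which the database can index.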
Example: A property bridge with URIs generated by an SQL expression
map:HomepageURL a d2rq:PropertyBridge; d2rq:belongsToClassMap map:PersonsClassMap; d2rq:property foaf:homepage; d2rq:uriSqlExpression "CONCAT('http://www.company.com/homepages/', user.username)"; .
The values of the foaf:homepage property are computed by evaluating the d2rq:uriSqlExpression, which concatenates a fixed URL prefix with the value of the user.username column. The example uses d2rq:uriSqlExpression instead of d2rq:sqlExpression because the bridge should produce URIs, not literals.
Example: Linking instances from two database tables
map:PaperConference a d2rq:PropertyBridge; d2rq:belongsToClassMap map:Paper; d2rq:property :conference; d2rq:refersToClassMap map:Conference; d2rq:join "Papers.Conference => Conferences.ConfID"; .
The example attaches a :conference property to papers. The values of the property are generated by the map:Conference class map, not shown here. It may use a d2rq:uriPattern, d2rq:uriColumn or blank nodes to identify the conferences. The appropriate instance is found using the d2rq:join on the foreign key Papers.Conference.
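For orientation, the map:Conference class map that the bridge refers to might look like the sketch below. It is only an assumption about the omitted declaration: the URI pattern, the example.org namespaces and the reuse of map:Database1 as the data storage are illustrative.

```turtle
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix map:  <http://example.org/mapping#> .   # hypothetical mapping namespace
@prefix :     <http://example.org/vocab#> .     # hypothetical vocabulary namespace

# Hypothetical target class map: one resource per row of the Conferences table.
map:Conference a d2rq:ClassMap;
    d2rq:dataStorage map:Database1;   # assumed database declaration
    d2rq:class :Conference;
    d2rq:uriPattern "http://example.org/conferences/@@Conferences.ConfID@@";
    .
```

With such a class map, the object of each :conference statement would be the URI built from the Conferences.ConfID value that the join selects.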
Example: Joining a table to itself using d2rq:alias
map:ParentTopic a d2rq:PropertyBridge; d2rq:belongsToClassMap map:Topic; d2rq:property :parentTopic; d2rq:refersToClassMap map:Topic; d2rq:join "Topics.ParentID => ParentTopics.ID"; d2rq:alias "Topics AS ParentTopics"; .
Here, a topic may have a parent topic whose ID is found in the Topics.ParentID column. This foreign key refers back to the Topics.ID column. The table has to be joined to itself. A d2rq:alias is declared, and the join is established between the original table and the aliased table. This pattern is typical for hierarchical or graph-style relationships.
Example: Adding a constant property-value pair to each instance of a class map
Sometimes it is desirable to add a property with a constant value to every resource that is created by a class map. To achieve this, use a d2rq:PropertyBridge that uses d2rq:constantValue:
map:PersonsClassMap a d2rq:ClassMap; d2rq:class :Person; . map:seeAlsoBridge a d2rq:PropertyBridge; d2rq:belongsToClassMap map:PersonsClassMap; d2rq:property rdfs:seeAlso; d2rq:constantValue <http://annotation.semanticweb.org/iswc2003/>; .
This adds an rdfs:seeAlso statement with a fixed URL object to every instance of the map:PersonsClassMap class map.
A d2rq:TranslationTable is an additional layer between the database and the RDF world. It translates back and forth between values taken from the database and RDF URIs or literals. A translation table can be attached to a class map or a property bridge using the d2rq:translateWith property. TranslationTables can be used only for mappings that are unique in both directions (1:1).
Properties
d2rq:translation | Adds a d2rq:Translation to the table. |
d2rq:href | Links to a CSV file containing translations. Each line of the file is a translation and contains two strings separated by a comma. The first one is the DB value, the second the RDF value. |
d2rq:javaClass | The qualified name of a Java class that performs the mapping. The class must implement the Translator interface. Custom Translators might be useful for encoding and decoding values, but are limited to 1:1 translations. Further details can be found in the D2RQ javadocs. |
A d2rq:Translation is a single entry in a d2rq:TranslationTable.
Properties
d2rq:databaseValue | A value that might appear in a database column or might be generated by a d2rq:pattern. |
d2rq:rdfValue | A translation of that value to be used in RDF constructs. |
Example: Translating color codes
A typical application is a database column containing type codes or similar enumerated values. A translation table can be used to turn them into RDF resources. In this example, the column ShinyObject.Color contains a color code: "R" for red, "G" for green etc. These codes must be translated into RDF resources: :red, :green etc.
:red a :Color.
:green a :Color.
# ... more colors omitted ...
:blue a :Color.

map:ColorBridge a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:ShinyObjectMap;
    d2rq:property :color;
    d2rq:uriColumn "ShinyObject.Color";
    d2rq:translateWith map:ColorTable.

map:ColorTable a d2rq:TranslationTable;
    d2rq:translation [ d2rq:databaseValue "R"; d2rq:rdfValue :red; ];
    d2rq:translation [ d2rq:databaseValue "G"; d2rq:rdfValue :green; ];
    # ... more translations omitted ...
    d2rq:translation [ d2rq:databaseValue "B"; d2rq:rdfValue :blue; ].
The d2rq:translateWith statement tells D2RQ to look up database values in the map:ColorTable. There, a translation must be given for each possible value. If the database contains values which are not in the translation table, D2RQ will not generate a :color statement for that :ShinyObject instance.
Note that the type of the resulting RDF node is determined by the bridge and not by the node type of the rdfValues. map:ColorBridge uses d2rq:uriColumn. Thus, the translation will create URI nodes. If it used d2rq:column, then literals would be created.
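Since a translation table can also be loaded from a CSV file through d2rq:href, the color table could be kept outside the mapping file. The sketch below is an assumption-laden variant of the example: the file location is hypothetical, the value of d2rq:href is assumed to be written as a URI reference, and the RDF values in the file are assumed to be full URIs because the bridge uses d2rq:uriColumn.

```turtle
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix map:  <http://example.org/mapping#> .   # hypothetical mapping namespace

# Assumed contents of colors.csv, one translation per line (DB value, RDF value):
#   R,http://example.org/vocab#red
#   G,http://example.org/vocab#green
#   B,http://example.org/vocab#blue
map:ColorTable a d2rq:TranslationTable;
    d2rq:href <file:///path/to/colors.csv>;   # hypothetical file location
    .
```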
A d2rq:Configuration controls global behaviour of D2RQ. It is generally not required if the defaults are satisfactory.
Properties
d2rq:serveVocabulary | Whether to serve inferred and user-supplied vocabulary data (boolean; true by default). This option is automatically set when using D2R Server's --fast command-line option. |
d2rq:useAllOptimizations | Whether to use bleeding edge optimizations (boolean; false by default). |
Example: Activating optimizations
In order to activate bleeding edge optimizations, a d2rq:Configuration block with the property d2rq:useAllOptimizations set to true is created:
map:Configuration a d2rq:Configuration; d2rq:useAllOptimizations true.
Sometimes only certain information from a database should be accessible, because parts of the database might be confidential or outdated. Using d2rq:condition you can specify conditions that must be satisfied by all accessible data.
You can use d2rq:condition at the class map and property bridge level. The d2rq:condition value is added as an additional SQL WHERE clause to all queries generated using the class map or property bridge. If the condition evaluates to FALSE for a SQL result set row, then no triples will be generated from that row.
Example: Using d2rq:condition on a d2rq:ClassMap
map:Paper a d2rq:ClassMap; d2rq:class :Paper; d2rq:uriPattern "http://www.conference.org/conf02004/paper#Paper@@Papers.PaperID@@"; d2rq:condition "Papers.Publish = 1"; d2rq:dataStorage map:Database1.
Only those papers with a value of 1 in the Papers.Publish column will be accessible. All other papers are ignored.
Example: Filtering zero-length strings
Usually, the special value NULL is used in a database to indicate that some field has no value, or that the value is unknown. Some databases, however, use a zero-length string ("") instead. D2RQ doesn't generate RDF statements from NULL values, but it doesn't recognize zero-length strings and will generate statements like :Person123 :firstName "". if the person's first name is a zero-length string. In order to suppress these statements, a d2rq:condition can be added to the property bridge:
map:PersonsClassFirstName a d2rq:PropertyBridge; d2rq:property :firstName; d2rq:column "Persons.FirstName"; d2rq:belongsToClassMap map:PersonsClassMap; d2rq:condition "Persons.FirstName <> ''".
Example: Relationship type codes
Imagine a table Rel_Paper_Topic that associates rows from a Papers table with rows from a Topics table. The Rel_Paper_Topic table contains a PaperID column to reference the papers, a TopicID to reference the topics, and a RelationshipType column which contains 1 if the topic is a primary topic of the paper, and 2 if it is a secondary topic.
For primary topic relationships, the :primaryTopic property shall be used, and for others the :secondaryTopic property.
We can build a map for this scenario by creating two property bridges, one for :primaryTopic and one for :secondaryTopic. We add a d2rq:condition to both bridges to suppress those statements where the RelationshipType column doesn't have the correct value.
map:primaryTopic a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Paper;
    d2rq:property :primaryTopic;
    d2rq:refersToClassMap map:Topic;
    d2rq:join "Papers.PaperID <= Rel_Paper_Topic.PaperID";
    d2rq:join "Rel_Paper_Topic.TopicID => Topics.TopicID";
    d2rq:condition "Rel_Paper_Topic.RelationshipType = 1".

map:secondaryTopic a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Paper;
    d2rq:property :secondaryTopic;
    d2rq:refersToClassMap map:Topic;
    d2rq:join "Papers.PaperID <= Rel_Paper_Topic.PaperID";
    d2rq:join "Rel_Paper_Topic.TopicID => Topics.TopicID";
    d2rq:condition "Rel_Paper_Topic.RelationshipType = 2".
In the spirit of Linked Data, vocabulary data should be dereferenceable by clients. D2RQ infers the types of class and property resources as rdfs:Class and rdf:Property. It also allows the user to provide labels using the d2rq:classDefinitionLabel and d2rq:propertyDefinitionLabel constructs, comments using the d2rq:classDefinitionComment and d2rq:propertyDefinitionComment constructs, and additional properties using the d2rq:additionalClassDefinitionProperty and d2rq:additionalPropertyDefinitionProperty constructs.
This feature is meant to enable Linked Data interfaces by providing additional query results for simple (URI, ANY, ANY) or (ANY, ANY, URI) find patterns that touch on vocabulary resources. It currently does not work within SPARQL queries beyond DESCRIBE.
A d2rq:AdditionalProperty construct can be used to add a fixed statement to all class definitions of a class map, or to all property definitions of a property bridge. The statement is added to the result sets if patterns like (ANY, ANY, ANY), (URI, ANY, ANY) or (URI, additionalPropertyName, ANY) are used. The usage of d2rq:AdditionalProperty to add instance data is now deprecated (see the list of deprecated language constructs). The d2rq:additionalClassDefinitionProperty and d2rq:additionalPropertyDefinitionProperty properties are used to link from the class map or property bridge to the d2rq:AdditionalProperty definition.
Properties
d2rq:propertyName | The RDF property to be used as the predicate of all fixed statements. |
d2rq:propertyValue | The value to be used as the object of all fixed statements. |
Example: Providing an additional property for a class definition
map:PersonsClassMap a d2rq:ClassMap; d2rq:class :Person; d2rq:additionalClassDefinitionProperty map:PersonEquivalence. map:PersonEquivalence a d2rq:AdditionalProperty; d2rq:propertyName owl:equivalentClass; d2rq:propertyValue foaf:Person.
This adds an owl:equivalentClass statement with the fixed object foaf:Person to every related class definition.
Example: Providing an additional property for a property definition
map:PaperTitle a d2rq:PropertyBridge; d2rq:belongsToClassMap map:Paper; d2rq:property :title; d2rq:column "Papers.Title"; d2rq:additionalPropertyDefinitionProperty map:PaperTitleEquivalence. map:PaperTitleEquivalence a d2rq:AdditionalProperty; d2rq:propertyName owl:equivalentProperty; d2rq:propertyValue dc:title.
This adds an owl:equivalentProperty statement with the fixed object dc:title to every related property definition.
D2R Server automatically serves data for vocabularies placed under http://baseURI/vocab/resource/. The mapping generator automatically creates a compatible namespace for this purpose. For further details, please refer to the D2R documentation.
Vocabulary serving is enabled by default. In order to deactivate it, a d2rq:Configuration block with the property d2rq:serveVocabulary set to false must be created:
Example: Deactivating vocabulary serving
map:Configuration a d2rq:Configuration; d2rq:serveVocabulary false.
This section covers hint properties that can be added to property bridges in order to speed up queries: d2rq:valueMaxLength, d2rq:valueRegex and d2rq:valueContains.
Example: Providing a maximum length
map:PersonsClassFirstName a d2rq:PropertyBridge; d2rq:property :firstName; d2rq:column "Persons.FirstName"; d2rq:belongsToClassMap map:PersonsClassMap; d2rq:valueMaxLength "15".
The d2rq:valueMaxLength property tells D2RQ that Persons.FirstName values are at most 15 characters long. With this information, D2RQ can determine without querying the database that a given first name longer than 15 characters cannot match.
Example: Providing a regular expression
map:PaperYear a d2rq:PropertyBridge; d2rq:property :year; d2rq:column "Papers.Year"; d2rq:belongsToClassMap map:Paper; d2rq:datatype xsd:gYear; d2rq:valueRegex "^[0-9]{4}$".
Here, the d2rq:valueRegex property is used to provide a regular expression for the Papers.Year column. The statement asserts that all values match the regular expression (or are NULL). The expression ^[0-9]{4}$ matches every four-digit number. If you don't want to use the full regular expression machinery, you can use d2rq:valueContains to assert that all values generated by the property bridge contain a certain phrase.
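A d2rq:valueContains hint could be declared as in the following sketch, which asserts that every value of a literal email column contains an "@" character. The bridge name map:PersonsEmailLiteral, the property :emailText and the example.org namespaces are hypothetical and were chosen only to keep the sketch separate from the mailto: example earlier in this document.

```turtle
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix map:  <http://example.org/mapping#> .   # hypothetical mapping namespace
@prefix :     <http://example.org/vocab#> .     # hypothetical vocabulary namespace

# Hint only: values without "@" can be ruled out before the database is queried.
map:PersonsEmailLiteral a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:PersonsClassMap;
    d2rq:property :emailText;
    d2rq:column "Persons.Email";
    d2rq:valueContains "@";
    .
```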
You get the largest performance gain by providing hints for property bridges that use d2rq:column. Define hints on columns of large tables and on columns that are not indexed by the database. In these cases a well-placed optimization hint can result in an order-of-magnitude improvement for some queries. Don't bother to provide hints for property bridges based on d2rq:pattern, because these can be optimized very well without hints. If your database has a few very large tables with non-indexed columns, that's where you should focus your efforts.
Please keep in mind that hint properties are not intended for filtering of unwanted database values. They are only performance hints. Values that do not fulfill the criteria will still appear in query results if find patterns like (URI, ANY, ANY) are used. In order to filter values, use d2rq:condition or a translation table with a custom Java class that returns null for unwanted database values.
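To make the difference between a hint and a filter concrete, the sketch below pairs the d2rq:valueMaxLength hint with a d2rq:condition that actually excludes longer values. It assumes that the database supports the CHAR_LENGTH function, which is not the case for every SQL dialect; the example.org namespaces are likewise illustrative.

```turtle
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix map:  <http://example.org/mapping#> .   # hypothetical mapping namespace
@prefix :     <http://example.org/vocab#> .     # hypothetical vocabulary namespace

map:PersonsClassFirstName a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:PersonsClassMap;
    d2rq:property :firstName;
    d2rq:column "Persons.FirstName";
    # Performance hint: promises that no value is longer than 15 characters.
    d2rq:valueMaxLength "15";
    # Filter: added to the SQL WHERE clause, so longer values produce no triples.
    # CHAR_LENGTH is an assumption about the SQL dialect of the database.
    d2rq:condition "CHAR_LENGTH(Persons.FirstName) <= 15";
    .
```

The condition makes the promise given by the hint true for all rows that reach the RDF view, so the two declarations do not contradict each other.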
This section lists several language constructs from older versions of the D2RQ mapping language that have been replaced by better alternatives and should no longer be used.
Older versions of the language used two different classes to distinguish between property bridges that produce literals, and bridges that produce resources.
In the current version, both cases are handled by the d2rq:PropertyBridge class. The distinction is made by using an appropriate property on the bridge declaration: d2rq:column and d2rq:pattern for literals, d2rq:uriColumn, d2rq:uriPattern and d2rq:bNodeIdColumns for resources.
Up until D2RQ 0.5.1, the d2rq:AdditionalProperty construct could be used to add constant property-value pairs to all instances of a class map. An example is shown below:
map:PersonsClassMap a d2rq:ClassMap; d2rq:class :Person; d2rq:additionalProperty map:SeeAlsoStatement. map:SeeAlsoStatement a d2rq:AdditionalProperty; d2rq:propertyName rdfs:seeAlso; d2rq:propertyValue <http://annotation.semanticweb.org/iswc2003/>.
This adds an rdfs:seeAlso statement with a fixed URL object to every instance of the persons class map. In recent versions of the mapping language, the same is achieved by adding a property bridge to the class map and giving it a d2rq:constantValue property with the fixed URL as the object, as shown in the example "Adding a constant property-value pair to each instance of a class map".
d2rq:AdditionalProperty constructs are still used with d2rq:additionalClassDefinitionProperty and d2rq:additionalPropertyDefinitionProperty (see section 7.4.1).
$Id: index.htm,v 1.55 2009/10/02 10:51:58 fatorange Exp $