Very fast (and small) guide to using OWL ART API

Introduction

The OWL ART API provides a middle layer for RDF management, offering an abstraction over different triple store technologies. There are two main interaction layers which can be accessed through the library API:

  1. Base RDF Triple layer
  2. Specific Vocabulary Management

The base triple layer provides basic manipulation and querying of RDF triples, SPARQL access and RDF Nodes management. This is the layer which needs to be implemented when developing an OWL ART wrapper for a specific triple store technology.

The Vocabulary Management layer provides a higher-level API with convenience methods specifically built for the various vocabularies of the RDF family (RDF, RDFS, OWL, SKOS, SKOS-XL).

In this page we provide basic information for setting up a project which uses the OWLART libraries, for loading an RDF model, and for accessing/manipulating RDF through the OWLART API.

We have provided some other pages with information about specific aspects of OWLART.

  1. utilities: here we describe a few utilities embedded in the OWLART library, which can be run from the command line

Setup

The OWL ART site provides a downloads page for downloading all the required libraries. Normally, the OWL ART core library and an additional implementation library (such as the implementation for Sesame2) are needed to interact with RDF. The dependencies of each library can be checked in its related Project Object Model (POM) file (e.g. this is a link to the pom.xml file of the core OWL ART library).

A much better alternative is to use Maven. If your project is managed by Maven, then you just need to import one implementation library, and Maven automatically resolves all the dependencies.

The OWL ART project is mirrored on Maven Central, so there is no need to import any library or explicitly link any Maven repository. This is an example of a declaration (to be put in the dependencies section of the pom.xml file of your project) for importing the Sesame2 implementation of OWL ART:

<dependency> 
    <groupId>it.uniroma2.art.owlart</groupId> 
    <artifactId>owlart-sesame2impl</artifactId> 
    <version>1.1</version> 
    <scope>compile</scope> 
</dependency> 

The above code imports version 1.1 (check the current version, as this example could be obsolete by the time you read it) of the Sesame2 implementation of the OWL ART API. Maven will automatically download the library from the central repository and resolve all needed dependencies.

Loading a Model

OWL ART has the concept of an RDF Model: a queryable and manageable container for RDF triples.

There are two main procedures for creating a model:

  1. completely by API
  2. by means of configuration files, using the OWLARTModelLoader facility class (available from OWLART 2.1)

Loading a Model through API

To create an RDF model through the API two steps are required:

  1. creating a model factory for a specific triple store
  2. creating the model

The OWL ART API provides a general ModelFactory called OWLArtModelFactory, which manages all the required initialization steps. It can be created by passing to its constructor a ModelFactory for an available triple store implementation.

This is the code to create it:

// creates the specific ModelFactory for Sesame2 implementation
ARTModelFactorySesame2Impl factImpl = new ARTModelFactorySesame2Impl();
// creates the OWL ART Model Factory by wrapping the Sesame2 factory
OWLArtModelFactory<? extends ModelConfiguration> fact = OWLArtModelFactory.createModelFactory(factImpl);

The code for creating an RDF model is pretty easy, just a single line:

// creates a model with a given baseuri, backed by a directory in the local filesystem
model = fact.loadRDFModel("http://art.uniroma2.it/ontologies/friends", BaseRDFModelTest.testRepoFolder);

The method for creating a model needs two arguments: the baseuri of the repository being created and a directory in the local filesystem. The role of the second argument is deliberately left open, depending on the specific implementation being used and on the kind of model which is being loaded. The directory provides cache/storage room for managing the RDF repository: in some cases it may contain several kinds of information, in others it may even be unused.

Obviously, different models can be created, and different technologies may allow configuring them in different ways; we talk about this in the next section.

Configurations

Every ModelFactory implementation may provide different kinds of backing configurations for running an RDF Triple Model. For instance, different configurations may allow for the creation of persistent vs non-persistent models, the creation of a local RDF repository or access to a remote one, and so on. Each of these configurations may be further customized through specific initialization parameters.

Every ModelFactory implementation has however one default ModelConfiguration which is being adopted when creating a Model, and each ModelConfiguration has a standard assignment for its parameters. This allows for short and easy code for standard configurations, while leaving room for strong customization possibilities.

In the following example, a ModelConfiguration is created and then two of its parameters are set:


// usual creation of a Sesame2 model factory  
ARTModelFactorySesame2Impl factImpl = new ARTModelFactorySesame2Impl(); 
 
// a model configuration is created for the Sesame2 implementation. Note that there are two constraints here: 
// first, it has been created from a Sesame2 model factory; 
// second, the configuration is a "parameters bundle" especially suited for non-persistent in-memory Sesame repositories. 
Sesame2ModelConfiguration modelConf = 
    factImpl.createModelConfigurationObject(Sesame2NonPersistentInMemoryModelConfiguration.class); 
 
// the specific class selected for the configuration determines much of the nature of the models which will be 
// created with that configuration; however, the configuration may be further customized by setting the following parameters 
modelConf.directTypeInference = false; 
modelConf.rdfsInference = false; 
 
// a factory is created in the usual way. Note that it is possible to constrain the factory to only 
// accept configurations of a given type. This is useful in static code to lessen the chances of 
// configuration misuse. The Java generics constraint is however erased at runtime 
OWLArtModelFactory<Sesame2ModelConfiguration> fact = OWLArtModelFactory.createModelFactory(factImpl); 
 
// the third argument here passed to the loadXXXModel method specifies that the configuration created above will 
// be used to determine the nature of the model being created 
model = fact.loadRDFSModel("http://art.uniroma2.it/ontologies/friends", BaseRDFModelTest.testRepoFolder, modelConf);

Different Models for Different RDF Vocabularies

As explained in the introduction, the basic interaction with an RDF graph is made by accessing and manipulating its triples. Basic functionalities for doing this are provided by the BaseRDFTripleModel interface. For each of the main RDF vocabularies from the W3C, there exists a corresponding model interface which extends the BaseRDFTripleModel interface by providing convenience methods for interacting with that specific vocabulary.

Each ModelFactory provides by contract methods for initializing RDF models of RDF, RDFS, OWL, SKOS or SKOS-XL nature.

In the following example, an OWL Model is being created:


// the null argument on the folder can be used only if the specific model configuration does not require
// any information to be stored in mass-memory
model = fact.loadOWLModel("http://art.uniroma2.it/ontologies/friends", null);

Including W3C RDF Modeling Vocabularies

The modeling vocabularies of the RDF family (RDF, RDFS, OWL, OWL2, SKOS, SKOS-XL) may or may not be included in the model triples (depending on the specific technology) once a given model has been initialized. In some cases (mostly because of reasoning), it may be useful to load these vocabularies explicitly and assign them to a given graph. This option is active by default, and can be queried, deactivated or reactivated through dedicated methods:


	/**
	 * instructs the factory to create graphs for the proper W3C vocabularies in the models which it creates,
	 * if they are not already available. An example of "proper" vocabularies is the list of {RDF, RDFS, OWL}
	 * if an {@link OWLModel} is being created. The graphs are also populated with triples from their 
	 * respective vocabulary
	 * 
	 * @param pref
	 */
	public void setPopulatingW3CVocabularies(boolean pref);

	/**
	 * tells if the factory has to create graphs for the proper W3C vocabularies in the models which it creates,
	 * if they are not already available. An example of "proper" vocabularies is the list of {RDF, RDFS, OWL}
	 * if an {@link OWLModel} is being created
	 */
	public boolean isPopulatingW3CVocabularies();
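
As a sketch of how these two methods might be used (assuming a factory fact created as in the earlier examples; whether vocabulary population is desirable depends on your triple store and reasoning setup):

```java
// sketch: query and disable automatic population of W3C vocabulary graphs before creating a model.
// "fact" is an OWLArtModelFactory created as shown in the previous sections; the two method
// names come from the signatures above.
if (fact.isPopulatingW3CVocabularies()) {
    // e.g. skip vocabulary loading when the underlying store already ships the RDFS/OWL axioms
    fact.setPopulatingW3CVocabularies(false);
}
// models created from now on will not get the {RDF, RDFS, OWL} graphs pre-populated
model = fact.loadOWLModel("http://art.uniroma2.it/ontologies/friends", null);
```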

Loading a Model through Configuration Files

The class it.uniroma2.art.owlart.models.OWLARTModelLoader allows a model to be created by defining its configuration in dedicated configuration files.

There are two configuration files which must be provided:

  1. the general configuration, which provides information about the OWLART implementation which is being used to create the model (obviously, the mentioned implementation library must be in the classpath)
  2. the specific implementation configuration file, containing parameters which depend on the technology adopted (e.g. Sesame2) and on the specific triple store/modality selected for that technology in the first file

Here follows an example of the general configuration file:

#ModelFactory Implementation: required
modelFactoryImplClassName=it.uniroma2.art.owlart.sesame2impl.factory.ARTModelFactorySesame2Impl

#modelConfigClass Implementation: not required
modelConfigClass=it.uniroma2.art.owlart.sesame2impl.models.conf.Sesame2NonPersistentInMemoryModelConfiguration

#modelConfigFile : not required in Sesame2, though we have to disable inference, which is set to true by default
modelConfigFile=workbench/config/model_sesame-noinference.config

modelDataDir=workbench/ModelDataSesame2

baseuri=http://aims.fao.org/aos/agrovoc/
namespace=http://aims.fao.org/aos/agrovoc/
defaultScheme=http://aims.fao.org/aos/agrovoc

From the file, it is clear that the implementation of OWLART based on the Sesame2 middleware is being used, and that a non-persistent in-memory model is being loaded, by specifying the appropriate model configuration class. The file then provides the data directory for the loaded model and basic information such as the baseuri, the namespace and (in case the model is a SKOS model) the defaultScheme being considered.

The property modelConfigFile points to a second file, with the specific parameters for the chosen model configuration. Here is an example:

#inference settings are true by default; better set them to false for conversion
directTypeInference=false
rdfsInference=false
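
As a usage sketch: the exact methods exposed by OWLARTModelLoader are not shown in this page, so the loadModel name and its parameter below are hypothetical placeholders; check the class Javadoc for the real signatures.

```java
// HYPOTHETICAL sketch: the method name "loadModel" and its signature are placeholders,
// not the documented API of OWLARTModelLoader. Conceptually, the loader reads the general
// configuration file, instantiates the declared ModelFactory implementation, applies the
// specific configuration file referenced by modelConfigFile, and returns the initialized model.
java.io.File generalConfig = new java.io.File("workbench/config/model.config");
model = OWLARTModelLoader.loadModel(generalConfig);
```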

SPARQL over HTTP

An OWL ART Model also exposes interfaces for being accessed via SPARQL. It is also possible to use the OWL ART API to connect to remote SPARQL endpoints via HTTP. The ModelFactory interface provides a method for this: loadTripleQueryHTTPConnection.


String dbpediaEndpointURL = "http://dbpedia.org/sparql";
TripleQueryModelHTTPConnection conn = modFact.loadTripleQueryHTTPConnection(dbpediaEndpointURL);

In the following example, the remote SPARQL endpoint of DBpedia is accessed to retrieve data from it. A tuple query object collects bindings for the variables expressed in the SELECT part of the query:


String dbpediaEndpointURL = "http://dbpedia.org/sparql"; 
TripleQueryModelHTTPConnection conn = modFact.loadTripleQueryHTTPConnection(dbpediaEndpointURL); 
TupleQuery query = conn.createTupleQuery(
        QueryLanguage.SPARQL,
        "select distinct ?site where {?site a <http://dbpedia.org/class/yago/AncientGreekSitesInSpain>}",
        "http://dbpedia.org/sparql/");
TupleBindingsIterator it = query.evaluate(false);
while (it.streamOpen()) {
    System.out.println(it.getNext());
}

TripleModel over SPARQL (over HTTP)

In this further example, DBpedia is accessed via SPARQL in the usual way, but a further interface is built over the SPARQL HTTP connection: a SPARQLBasedRDFTripleModel.

TripleQueryModelHTTPConnection conn = modFact.loadTripleQueryHTTPConnection(dbpediaEndpointURL); 
SPARQLBasedRDFTripleModelImpl model = new SPARQLBasedRDFTripleModelImpl(conn); 
 
ARTResourceIterator it = model.listSubjectsOfPredObjPair(RDF.Res.TYPE, model.createURIResource("http://dbpedia.org/class/yago/AncientGreekSitesInSpain"), false);             
 
while (it.streamOpen()) { 
    System.out.println(it.getNext()); 
} 
 
// QUERY AUTOMATICALLY COMPOSED FROM THE METHOD INVOCATION: 
// query: SELECT ?s ?p ?o {?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/class/yago/AncientGreekSitesInSpain> } 
 
// RESULT SHOWN ON CONSOLE: 
// http://dbpedia.org/resource/Cadiz 
// http://dbpedia.org/resource/Emp%C3%BAries 
// http://dbpedia.org/resource/Alicante 
// http://dbpedia.org/resource/Lucentum 

The code above allows for easy retrieval of RDF resources by using the higher-level methods for the RDF, RDFS, OWL and SKOS(-XL) vocabularies.
Thus, instead of retrieving tuple bindings from SPARQL, the classic Model methods can directly retrieve statements, or even URIs, blank nodes and literals, via SPARQL-over-HTTP.

Easy creation of a Lightweight Model

By use of the method createLightweightRDFModel() it is possible to create a lightweight model, which has the characteristics of being: in-memory, not persisted to any storage form, not including any inference support, and offering only the basic RDF manipulation API. So, assuming you have a factory fact, creating such a model is extremely easy:

BaseRDFTripleModel model = fact.createLightweightRDFModel();
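
Such a model is handy for scratch manipulation of triples. The sketch below assumes triple-addition and statement-listing methods along the lines of addTriple/listStatements on BaseRDFTripleModel, with NodeFilters.ANY as a wildcard; check the interface Javadoc for the exact signatures.

```java
// sketch, assuming addTriple/listStatements methods and a NodeFilters.ANY wildcard
// (names and signatures should be checked against the actual BaseRDFTripleModel Javadoc)
BaseRDFTripleModel model = fact.createLightweightRDFModel();

ARTURIResource mickey = model.createURIResource("http://art.uniroma2.it/ontologies/friends#Mickey");
ARTURIResource knows  = model.createURIResource("http://xmlns.com/foaf/0.1/knows");
ARTURIResource donald = model.createURIResource("http://art.uniroma2.it/ontologies/friends#Donald");

// add a triple, then iterate over all statements in the model
model.addTriple(mickey, knows, donald);
ARTStatementIterator it = model.listStatements(NodeFilters.ANY, NodeFilters.ANY, NodeFilters.ANY, false);
while (it.streamOpen()) {
    System.out.println(it.getNext());
}
it.close();
```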

Querying a Model

Every Model can be queried mainly through two interfaces:

  1. through the Model API: a wide list of methods is directly available from the Model interface (or any of its subinterfaces, with specific methods for the implemented RDF vocabulary).
  2. through SPARQL: the Query interface (and its subinterfaces) allows SPARQL queries to be created (other query languages may also be supported, like SeRQL in the case of the Sesame2 implementation) and executed, in order to collect results and bind their elements to OWLART RDF resources

Using the Model API

Iteration over RDF resources

Most of the query methods available from the interfaces of the models package return iterators over RDF related objects (statements, blank nodes, literals, URIs). For each of these types there is a specific iterator, all of them implementing the generic RDFIterator<T> interface with T replaced by their type. The RDFIterator<T> interface provides its own methods for rolling up the iterator, such as streamOpen(), to check if there are still results to iterate, getNext(), to get the next result, and close(), to release resources associated to the iteration over the accessed data. We have defined our own methods instead of those from the standard java.util.Iterator<E> interface because they need to throw exceptions associated to the impossibility of accessing or modifying the underlying RDF Model. Here follows an example of the use of RDFIterators:



try {
    ARTURIResourceIterator it = ontModel.listNamedClasses(true);

    while (it.hasNext()) {
        ARTURIResource cls = it.next().asURIResource();
        System.out.println("named class: " + cls);
    }
    it.close();
} catch (ModelAccessException e) {
    ...
}

However, RDFIterator<T> is also compatible with java.util.Iterator<E>, so that all available libraries (such as google-collections or commons-collections) with utilities for standard iterators can also be used with them. The only drawback of using the standard iterator interface is that all access/update exceptions associated to the communication with the RDF repository will be thrown as runtime exceptions. For this reason, we provide some useful utilities over the RDFIterator<T> interface (and thus all of its subinterfaces and implementing classes) in the following utility class:

RDF Iterators Utilities

The RDFIterators class contains a lot of useful utilities for managing iterators over RDF nodes.

Filter Package

The filter package offers several predicates (compatible with Google Collections' Predicate interface) for creating filtered iterators over RDFIterators. The RDFIterator interface, though offering dedicated methods throwing RDF access exceptions, is also compatible with the standard java.util.Iterator interface and can thus be reused with all libraries supporting it. Here follows an example of the usage of methods of the Iterators class of the Google Collections library, for creating filters over RDF iterators by using predicates provided by the filter package of the OWL ART library:


Predicate<ARTResource> exclusionPredicate;

exclusionPredicate = NoLanguageResourcePredicate.nlrPredicate;

Predicate<ARTResource> rootUserClsPred = Predicates.and(new RootClassesResourcePredicate(ontModel),
        exclusionPredicate);
try {
    ARTURIResourceIterator namedClassesIt = ontModel.listNamedClasses(true);
    Iterator<ARTURIResource> filtIt;
    filtIt = Iterators.filter(namedClassesIt, rootUserClsPred);
    while (filtIt.hasNext()) {
        ARTURIResource cls = filtIt.next().asURIResource();
        System.out.println("named class: " + cls);
    }
    namedClassesIt.close();
} catch (ModelAccessException e) {
    ...
}

Using SPARQL

doc in progress