The Sheet2RDF VocBench tool

Sheet2RDF is also available as embedded VocBench 3 tool.

Installation

On the VocBench site, we provided detailed instructions for the installation of VocBench.

Sheet2RDF on VocBench

Once the installation is complete, you can start playing with VocBench. The user interface of Sheet2RDF consists of a tab, which can be opened by clicking on Sheetrdf under Tools navigation bar menu item.

Open

The UI is composed of three panels:

User Interface

Now, let's see how Sheet2RDF works.

The spreadsheet management

The input file (the spreadsheet to be processed) can be chosen and loaded through the Browse button. Once the input file is loaded, a preview of the datasheet is shown into the left panel. In the screenshot below, it is shown how is adopted a color code for the column headers status.

Datasheet preview

The spreadsheet columns need to be configured in order to guide the triplification process. This configuration will guide the automatic generation of the PEARL code which can be triggered by clicking the "play" button. Note that beside this button, two further buttons are available. These allows to export the current configuration so that it could be imported, edited and reused at a later time.

In order to edit the header configuration, a dedicated dialog can be opened by clicking on the small icon near the column name.

The header editor

In order to better understand the usage of the header editor, it's important to clarify that the PEARL code (which is automatically produced according the configuration of the headers) is the real "brain" of the triplification process. The header editor is just a tool that simplifies the writing of the PEARL code, or even completely relieves the user from the task. In order to completely understand the PEARL language it is recommended to read its documentation, but for the use made of it in Sheet2RDF it is sufficient to know that the PEARL that it is going to be generated is composed essentially by two blocks: nodes and graph. Summarizing briefely:

So, practically, the header editor allows to define a set of nodes and their application in the graph section (named graph applications).
It is possible to completely exclude an header from the process of generation of the PEARL, simply by checking the ignore header checkbox.

As can be seen in the figure above, the header editor has to immutable fields:

Despite the PEARL feature identifiers cannot be edited, it is possible to customize the naming strategy. The settings dialog (accessed by clicking the button with the cog icon, next to the spreadsheet file picker) allows to choose among three choices:

Let's go back to the description of the header editor. Besides the two fields just described, there are two panels: one for the nodes and one for the graph section.
The nodes panel allows only to delete the nodes or to change minor details (like the language of literal converters, or the configuration of uri converters), it is not possible to create nodes directly from here, so let's focus to the Graph applications panel.

There are two kind of graph application:

Simple graph application

In most cases, the triplification of the content of a column doesn't require to define a complex graph pattern, the desired output could stick to the simple subject-predicate-object pattern. In this cases the creation of a simple graph application is enough.

The above form guides the user in the creation of a simple graph application. The primary information is the property to be bound to the column, which have to be looked up on the underlying ontology. Depending on the chosen property, the range field is automatically filled (this is read only, it is just useful to inform the user) as well as the range type combobox which the available value are resource, plain literal and typed literal. The selected value here affects the nature of a further field, indeed in case of resource, a type field optionally allows to assert a type for the column content (it will determine an additional triple like cell_content rdf:type chosen_type), in case of plain literal, a lang field allows to specify the language and finally in case of typed literal a datatype field allows to specify the datatype of the content.
The last field to fill is node. From the dropdown combobox it can be selected one of the nodes already defined in the current header, or click on the + button for creating a new one. Note that in the example shown in the figure, there is already a node defined. This is defined automatically by the system since, according the heuristics, the @en suffix in the preferred label@en header implicity associates a language tagged value (a literal node) to the comlun content.

If the user chooses to create a new node, a form like the following is prompted.

Here the user needs to provide:

The simple graph application determines, in the PEARL nodes section, the serialization of triples like the following in the nodes and graph sections:

nodes = { ... nodeId node_conversion UIMA_feature/value ... } graph = { ... $subject chosen_predicate $nodeId ... }

What if the user wants to define a more complex pattern in the graph section? For example, what if from the content under the preferred label@en column it is wanted to generate a reified label (like in SKOSXL)? The simple graph application doesn't support this intent, so it is necessary to create an advanced graph application.

Advanced graph application

The dialog for creating an advanced graph application consists in two panels: Nodes and Graph. The first basically allows to manage the nodes that are used in the graph pattern. The Graph panel provides a textare for writing the graph pattern. This pattern will be written as it is in the resultant PEARL code.

The following figure shows an advanced graph application that fits the configuration of an header for a SKOSXL lexicalization, indeed it defines two nodes, one for the skosxl:Label IRI and one for the literal form, and then refers to them in the graph pattern.

In the graph panel there is also a sub-panel that lists the prefix-namespace mappings defined in the VB project and eventually allows to define further mappings useful in the graph pattern.

The creation of this kind of graph applications could be quite complex, so Sheet2RDF provides the possibility to store and load advanced graph applications in an internal storage in order to reuse them. By default, the storage contains already a factory-provided configuration for the mapping of skosxl:Labels. This storage can be accessed (for loading or storing) by clicking on the "floppy-disk" buttons at the top-right corner of the dialog.

Mapping the subject

Beside the configuration of the columns, in the process of PEARL generation, Sheet2RDF needs to know how to mapping the subject. Clicking on the Subject mapping button, the following dialog pops up.

These are the fields foreseen in the form:

At the bottom of this form there is a panel for the configuration of the CODA converter that will generate the subject node. Note that in this case the only available converters are those capable to generate an IRI value.

The PEARL editor

Sheet2RDF triplifies the data by following the transformations indicated in the PEARL document. A basic PEARL document can be generated automatically by the system through the generate pearl button ('play' icon in the Spreadsheet preview panel). In most of the cases, if all the headers have been configured, it should not be necessary to edit the code, otherwise it may be necessary to eventually replace placeholders (%pls_provide_...%, that is, part of the PEARL which necessarily need to be filled by the user) with a ground value, or just for customizing the transformation differently from the default choices suggested by the system.
A PEARL document can be loaded from the filesystem or exported in order to reuse it later (2).

A side note: in the graph section, every patterns, whether deriving from simple or advanced graph application, excluding those reguarding the subject, are written in an OPTIONAL block. This is necessary in order to avoid the failure of the entire graph projection in case a feature is empty (a feature contains the value of the cell under the related column, so a feature is empty if the referenced cell is empty)

When the PEARL code is valid and completed, it can be executed simply by clicking on the "play" button (1). In case of errors in the code, a warning icon (with details shown in a tooltip) appears to the left of this button.

Triples generation

The bottom panel shows the RDF triples that the system generates. In this panel, the triples will be shown with an alternation of background color. The triples with the same color are generated from the same row of the datasheet. Clicking on a specific row of the triples panel, the source datasheet row will be highlighted as in the figure below.
Finally, after clicking on the add button (1), all the triples are added to the project.
The generated triples can be also exported in different RDF formats (2).

Triples preview