DictInterface Installation Instructions
For DictInterface 2.0 (Linguistic Watermark 2.0)
Installation:
- Download the LinguisticWatermark package and the Dict Interface.
- Copy the entire content of the Dict Interface zip file into your working directory:
  - DictInterface.jar
  - gnu-regexp-1.0.8.jar
  - javadict-1.4.jar
  - the Dict folder
- In the DictInterface folder, you will find:
  - dictionaries.dtd
  - strategies.dtd
  - databases.xml
  - strategies.xml
  - Dict.properties
  - some example dictionaries
- To get everything working, you have to:
  - configure the Linguistic Watermark configuration file, indicating where the databases.xml file is located on your hard disk and specifying the set of available resources
  - specify the dictionaries you plan to use in the databases.xml file (just add a row for every dictionary; formatting instructions are in the file itself)
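Before editing the configuration files, it can help to confirm that the copied files are where they are expected. The following is a minimal sketch and not part of the DictInterface distribution: the class InstallCheck and its method missingFiles are hypothetical names, and the sketch assumes that the configuration files sit inside the Dict folder of the working directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the DictInterface distribution.
final class InstallCheck {

    // Returns the given names (resolved against dir) that do not exist on disk.
    static List<String> missingFiles(Path dir, String... names) {
        List<String> missing = new ArrayList<>();
        for (String name : names) {
            if (!Files.exists(dir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: the working directory is the first argument (default: current directory).
        Path workDir = Paths.get(args.length > 0 ? args[0] : ".");
        // Assumption: the configuration files live in the "Dict" folder.
        List<String> missing = missingFiles(workDir,
                "DictInterface.jar", "gnu-regexp-1.0.8.jar", "javadict-1.4.jar",
                "Dict/databases.xml", "Dict/strategies.xml", "Dict/Dict.properties");
        System.out.println(missing.isEmpty()
                ? "All expected files found."
                : "Missing: " + missing);
    }
}
```

Run it with the working directory as the only argument; adjust the list of names if your distribution lays the files out differently.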
The Linguistic Watermark graphical configuration utility can be used to set up
the Dict Interface according to the dictionaries available on your computer.
The instructions below, however, refer to the XML configuration file.
The interface and instance sections of the Linguistic Watermark configuration
file are shown below (see the Linguistic Watermark User Manual for details):
<interface id="DICT" path="plugins/it.uniroma2.info.ai-nlp.ontoling/DictInterface.jar">
  <property description="path where is databases.xml" id="dictionary_path" value="plugins/it.uniroma2.info.ai-nlp.ontoling/Dict" />
</interface>
The dictionary_path interface property tells the Dict Interface where to look
for the databases.xml file.
<instance id="dan-eng" interface="DICT">
  <property description="source language" id="source_language" value="dk" />
  <property description="destination language" id="destination_language" value="en" />
</instance>
<instance id="deu_eng" interface="DICT">
  <property description="source language" id="source_language" value="de" />
  <property description="destination language" id="destination_language" value="en" />
</instance>
The source_language and destination_language instance properties tell the
Dict Interface which specific dictionary to load.
The Dict Interface is based on JavaDICT, a Java library for accessing DICT
dictionaries written by Luis Parravicini (see the end of this page). The
databases.xml file is configured according to the instructions given on the
JavaDICT site; the following example may help as a quick reference.
<dictionaries base="plugins\it.uniroma2.info.ai-nlp.ontoling\Dict\">
  <dictionary name="dk-en" db="dan-eng.dict.dz" index="dan-eng.index" relative="true" type="compressed" />
  <dictionary name="de-en" db="deu-eng.dict.dz" index="deu-eng.index" relative="true" type="compressed" />
  <dictionary name="de-it" db="deu-ita.dict.dz" index="deu-ita.index" relative="true" type="compressed" />
  <dictionary name="en-de" db="eng-deu.dict.dz" index="eng-deu.index" relative="true" type="compressed" />
  <dictionary name="en-fr" db="eng-fra.dict.dz" index="eng-fra.index" relative="true" type="compressed" />
</dictionaries>
The only requirement is that each identifier (the value of the name attribute)
be composed of the two languages (source and destination) set for the
corresponding instance in the Linguistic Watermark configuration file, for
example dk-en for source language dk and destination language en.
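The naming rule can be checked mechanically. The sketch below composes an identifier from the two language codes and looks it up among the name attributes of a databases.xml document, using only the XML parser bundled with the JDK. It is an illustration, not the way the Dict Interface itself resolves dictionaries: the class DictionaryNameCheck and its methods are hypothetical, the "-" separator is inferred from the example entries, and the sample XML string carries no DOCTYPE declaration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of DictInterface or JavaDICT.
final class DictionaryNameCheck {

    // Assumption: identifiers have the form "<source>-<destination>",
    // as in the example databases.xml entries (dk-en, de-en, ...).
    static String composeName(String sourceLanguage, String destinationLanguage) {
        return sourceLanguage + "-" + destinationLanguage;
    }

    // Collects the name attribute of every <dictionary> element.
    static List<String> dictionaryNames(String databasesXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            databasesXml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("dictionary");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(((Element) nodes.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse databases.xml content", e);
        }
    }

    // True if the language pair of an instance has a matching dictionary entry.
    static boolean isDeclared(String databasesXml, String source, String destination) {
        return dictionaryNames(databasesXml).contains(composeName(source, destination));
    }

    public static void main(String[] args) {
        String xml = "<dictionaries base=\"Dict/\">"
                + "<dictionary name=\"dk-en\" db=\"dan-eng.dict.dz\""
                + " index=\"dan-eng.index\" relative=\"true\" type=\"compressed\"/>"
                + "</dictionaries>";
        System.out.println(isDeclared(xml, "dk", "en")); // prints true
        System.out.println(isDeclared(xml, "en", "fr")); // prints false
    }
}
```

A false result for an instance means that its source_language and destination_language values have no matching row in databases.xml.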
Depending on the distribution of the Dict Interface you downloaded, some
dictionaries may already be included. More dictionaries can be obtained by following the links at:
http://www.dict.org/links.html
THIRD PARTY LIBRARIES INCLUDED IN THE DICT-INTERFACE RELEASE:
The DictInterface release includes the following third-party libraries:
- gnu-regexp-1.0.8.jar
- javadict-1.4.jar
JavaDICT is a server
for the DICT protocol written in Java. It's the only (so far) DICT server that
allows you to access dictionaries stored in a database (currently only
PostgreSQL is supported). JavaDICT is released under the terms of the GNU
General Public License (GPL). The author of JavaDICT is Luis Parravicini.
SHORT NOTES ON DICT DICTIONARIES:
The Dictionary Server Protocol (DICT) is a TCP transaction based query/response
protocol that allows a client to access dictionary definitions from a set of natural
language dictionary databases.
For more info, visit: www.dict.org
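The query/response exchange can be sketched as follows. The example sends a single DEFINE command and collects the definition text from the reply, relying on the status codes defined by the DICT protocol (150 and 151 announce definitions, a line holding only "." ends a definition text). It is a minimal sketch, not the code used by JavaDICT or the Dict Interface: the class DictExchangeSketch is hypothetical, the server reply is a made-up sample, and using the databases.xml identifier de-en as the DICT database name is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a DICT exchange; not taken from JavaDICT.
final class DictExchangeSketch {

    // Builds the DEFINE command; DICT commands end with CRLF.
    static String defineCommand(String database, String word) {
        return "DEFINE " + database + " \"" + word + "\"\r\n";
    }

    // Every status line starts with a three-digit code.
    static int statusCode(String responseLine) {
        return Integer.parseInt(responseLine.substring(0, 3));
    }

    // Sends one DEFINE and returns the lines of all definition texts.
    // Assumes the server's connection banner has already been read.
    static List<String> define(BufferedReader in, Writer out, String database, String word) {
        try {
            out.write(defineCommand(database, word));
            out.flush();
            List<String> text = new ArrayList<>();
            String line;
            while ((line = in.readLine()) != null) {
                int code = statusCode(line);
                if (code == 151) {
                    // Definition text follows, terminated by a line holding only ".".
                    while ((line = in.readLine()) != null && !line.equals(".")) {
                        // A leading period in the text is sent doubled; undo that.
                        text.add(line.startsWith("..") ? line.substring(1) : line);
                    }
                } else if (code != 150) {
                    break; // e.g. 250 (ok) or 552 (no match)
                }
            }
            return text;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up server reply, used only to exercise the parsing logic offline.
        String canned = "150 1 definitions retrieved\r\n"
                + "151 \"Hund\" de-en \"German-English\"\r\n"
                + "Hund\r\n"
                + "   dog\r\n"
                + ".\r\n"
                + "250 ok\r\n";
        StringWriter sent = new StringWriter();
        List<String> text = define(new BufferedReader(new StringReader(canned)),
                sent, "de-en", "Hund");
        System.out.print("sent: " + sent);
        System.out.println("received: " + text);
    }
}
```

Against a live server, the reader and writer would wrap the streams of a java.net.Socket connected to the server's DICT port (2628 by default), and the banner line would be read before calling define.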