What?

CHAOS is a robust syntactic parser for Italian and English. The system implements a modular and lexicalised approach to syntactic parsing. It is based on the notion of eXtended Dependency Graph (XDG), which has been regarded as a useful representation mechanism for shallow parsing. The system offers a collection of modules that can be combined to design parsing architectures. The pool of modules consists of:

  • a tokenizer, matching words from character streams
  • a yellow page look-up module that matches named entities existing in catalogues
  • a morphological analyser that attaches (possibly ambiguous) syntactic categories and morphological interpretations to each word
  • a named entities matcher that recognizes complex named entities according to special-purpose grammars
  • a rule-based part-of-speech (POS) tagger
  • a POS disambiguation module that resolves potential conflicts between the results of the POS tagger and the morphological analyser
  • a chunker
  • a verb argument detector
  • a shallow syntactic analyser

The overall system is conceived as a Java library that offers a standard representation for modules implemented in different programming languages.
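
As a rough illustration of how a pool of modules sharing one representation can be assembled into a parsing architecture, here is a minimal Java sketch. All names in it (Analysis, ParsingModule, WhitespaceTokenizer, ParsingPipeline) are hypothetical and chosen for this example only; they are not the CHAOS API, and the real shared representation is the XDG rather than the simple token list used here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; these are NOT the CHAOS API.

/** A minimal stand-in for the shared representation passed between modules. */
class Analysis {
    final String text;
    final List<String> tokens = new ArrayList<String>();
    final List<String> trace = new ArrayList<String>();

    Analysis(String text) {
        this.text = text;
    }
}

/** One processing step that enriches the shared representation in place. */
interface ParsingModule {
    String name();

    void process(Analysis analysis);
}

/** Splits the input on whitespace; a placeholder for a real tokenizer. */
class WhitespaceTokenizer implements ParsingModule {
    public String name() {
        return "tokenizer";
    }

    public void process(Analysis analysis) {
        for (String t : analysis.text.trim().split("\\s+")) {
            if (t.length() > 0) {
                analysis.tokens.add(t);
            }
        }
    }
}

/** Runs the configured modules in the order they were added. */
class ParsingPipeline {
    private final List<ParsingModule> modules = new ArrayList<ParsingModule>();

    ParsingPipeline add(ParsingModule module) {
        modules.add(module);
        return this;
    }

    Analysis run(String text) {
        Analysis analysis = new Analysis(text);
        for (ParsingModule m : modules) {
            m.process(analysis);
            analysis.trace.add(m.name()); // record which modules ran
        }
        return analysis;
    }
}
```

In this sketch, a different architecture is obtained simply by adding other ParsingModule implementations (for example a chunker) to the pipeline in a different order; modules written in other languages would need a wrapper implementing the same interface.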

Download?

The download is protected. To obtain an account for the protected area, please send an e-mail to chaos@info.uniroma2.it indicating:

  • Your name
  • Your affiliation
  • The purpose of the research

(download here)

Documentation?

In the package you will find:

Some information about the method can be found in "Publications"

The User's Manual is in preparation.

Software requirements?

  • Windows 32 / Linux (gcc 3.x)
  • Java 1.5.*