OBi-WAN: Ontology Based WrApper iNduction
OBI-WAN is an information extraction component that combines Wrapper Induction techniques with Ontological Knowledge. To improve the quality of the extracted information, OBI-WAN exploits data from Knowledge Bases, at both the Conceptual and the Instance level, to support the production and maintenance of extraction patterns.
In ever-changing environments like the World Wide Web, this hybrid approach lets the system adapt to newly emerging concepts and gives it a certain degree of independence from the specific websites used to train it.
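As a rough illustration of the hybrid idea described above, the following Python sketch shows how instance-level data from a knowledge base could fill the slots of an extraction pattern. It is not OBI-WAN's implementation and it is not Soderland's WHISK algorithm: the toy `KNOWLEDGE_BASE`, the template syntax (`{Concept}` for a slot, `*` for skipped text), and the functions `concept_regex`, `compile_pattern` and `extract` are all hypothetical names introduced only for this example.

```python
import re

# Hypothetical toy knowledge base: concept name -> known instances.
KNOWLEDGE_BASE = {
    "City": ["Rome", "Milan", "New York"],
    "Currency": ["EUR", "USD"],
}

def concept_regex(concept, kb):
    """Build an alternation that matches any known instance of a concept."""
    # Longest instances first, so a longer name wins over a shorter prefix.
    instances = sorted(kb[concept], key=len, reverse=True)
    return "(?:" + "|".join(re.escape(i) for i in instances) + ")"

def compile_pattern(template, kb):
    """Turn a template into a regex.

    "{Concept}" becomes a named group restricted to the concept's instances,
    "*" becomes a lazy wildcard, and everything else is matched literally.
    """
    parts = []
    for token in re.split(r"(\{\w+\}|\*)", template):
        if token == "*":
            parts.append(".*?")
        elif token.startswith("{") and token.endswith("}"):
            name = token[1:-1]
            parts.append(f"(?P<{name}>{concept_regex(name, kb)})")
        else:
            parts.append(re.escape(token))
    return re.compile("".join(parts), re.DOTALL)

def extract(template, text, kb=KNOWLEDGE_BASE):
    """Return one dict of slot fillers per match of the template in text."""
    pattern = compile_pattern(template, kb)
    return [m.groupdict() for m in pattern.finditer(text)]

TEMPLATE = "Hotel in {City}*{Currency}"

# The same template works on two different page layouts.
print(extract(TEMPLATE, "Hotel in Rome from 80 EUR"))
print(extract(TEMPLATE, "<b>Hotel in New York</b> - 120 USD"))

# A city missing from the knowledge base is not extracted ...
print(extract(TEMPLATE, "Hotel in Turin from 70 EUR"))

# ... until the knowledge base is extended; the template itself is unchanged.
extended = dict(KNOWLEDGE_BASE, City=KNOWLEDGE_BASE["City"] + ["Turin"])
print(extract(TEMPLATE, "Hotel in Turin from 70 EUR", kb=extended))
```

In this sketch the pattern refers to concepts rather than to literal strings or page markup, so adding an instance to the knowledge base widens what the pattern extracts without retraining or editing it.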
OBI-WAN's core module is based on Soderland's WHISK algorithm for wrapper induction [Soderland 99], .... [...TO COMPLETE, WORK IN PROGRESS...]