
OPIS: A method for identifying and searching object pages supported by relevance feedback and web page classification


Thesis and Dissertation Workshop (WTDBD)

Topics of interest:

Object pages, Object search, Web page classification


This paper proposes OPIS (Object Page Identifying and Searching), a new method for identifying and searching object pages. An object page is a web page that represents exactly one real-world object. OPIS targets searches for such pages, which General Search Engines (GSEs) currently do not answer satisfactorily. The core of the method is the use of relevance feedback and machine learning techniques for content-based page classification. When integrated into a GSE, OPIS filters the results so that a keyword query retrieves only pages classified as object pages, rather than every page containing the query terms. Preliminary experiments show that OPIS improved the precision at 20 (P@20) of the retrieved results by 37% on average compared with a GSE.
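As an illustration of the filtering step and of the P@20 metric mentioned above, the following is a minimal sketch and not the authors' implementation. The names `filter_object_pages`, `precision_at_k`, and the `is_object_page` predicate are hypothetical; the predicate stands in for whatever trained classifier is used, and P@k is computed with the common convention of dividing by k even when fewer than k results are returned.

```python
from typing import Callable, Iterable, List, Set


def filter_object_pages(ranked_urls: Iterable[str],
                        is_object_page: Callable[[str], bool]) -> List[str]:
    """Keep only results labeled as object pages, preserving rank order.

    `is_object_page` is a placeholder for a trained page classifier
    (an assumption for this sketch, not part of the original abstract).
    """
    return [url for url in ranked_urls if is_object_page(url)]


def precision_at_k(ranked_urls: List[str], relevant: Set[str], k: int = 20) -> float:
    """Fraction of the top-k results that are relevant (P@k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_urls[:k]
    hits = sum(1 for url in top_k if url in relevant)
    # Divide by k (not by len(top_k)), so short result lists are penalized.
    return hits / k


# Toy data, invented for illustration only.
ranked = ["a", "b", "c", "d"]
object_pages = {"a", "c"}   # pages the stand-in classifier accepts
relevant = {"a", "c"}       # pages judged relevant to the query

baseline_p2 = precision_at_k(ranked, relevant, k=2)
filtered = filter_object_pages(ranked, lambda url: url in object_pages)
filtered_p2 = precision_at_k(filtered, relevant, k=2)
```

In this toy case, filtering removes the non-object pages "b" and "d" from the ranking, so the relevant page "c" moves into the top 2 and P@2 rises from 0.5 to 1.0.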


Miriam Pizzatto Colpo, Edimar Manica, Renata Galante
