Paper: Web Corpus Mining By Instance Of Wikipedia

ACL ID W06-1710
Title Web Corpus Mining By Instance Of Wikipedia
Venue Workshop On Web As Corpus
Session  
Year 2006
Authors

In this paper we present an approach to structure learning in the area of web documents. This is done in order to approach the goal of webgenre tagging in the area of web corpus linguistics. A central outcome of the paper is that purely structure oriented approaches to web document classification provide an information gain which may be utilized in combined approaches of web content and structure analysis.
