Paper: Building Webcorpora of Academic Prose with BootCaT

ACL ID W10-1504
Title Building Webcorpora of Academic Prose with BootCaT
Venue Proceedings of the NAACL HLT 2010 Sixth Web as Corpus Workshop
Year 2010
Authors George Dillon

A procedure is described to gather corpora of academic writing from the web using BootCaT. The procedure uses terms distinctive of different registers and disciplines in COCA to locate and gather web pages containing them.

@InProceedings{dillon:2010:WAC6,
  author    = {Dillon, George},
  title     = {Building Webcorpora of Academic Prose with BootCaT},
  booktitle = {Proceedings of the NAACL HLT 2010 Sixth Web as Corpus Workshop},
  month     = {June},
  year      = {2010},
  address   = {NAACL-HLT, Los Angeles},
  publisher = {Association for Computational Linguistics},
  pages     = {26--31},
  url       = {http://www.aclweb.org/anthology/W10-1504}
}