Paper: Towards Scalable Speech Act Recognition in Twitter: Tackling Insufficient Training Data

ACL ID W12-0603
Title Towards Scalable Speech Act Recognition in Twitter: Tackling Insufficient Training Data
Venue Workshop on Semantic Analysis in Social Media
Year 2012
Authors Renxian Zhang, Dehong Gao, Wenjie Li

Recognizing speech act types in Twitter is of much theoretical interest and practical use. Our previous research did not adequately address the deficiency of training data for this multi-class learning task. In this work, we start from only a small seed training set and experiment with two semi-supervised learning schemes, transductive SVM and graph-based label propagation, which can leverage knowledge about unlabeled data. Our extensive experiments establish the efficacy of semi-supervised learning and also show that transductive SVM is more suitable than graph-based label propagation for our task. The empirical findings and detailed evidence can contribute to scalable speech act recognition in Twitter.
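
To make the second scheme named in the abstract concrete, the following is a minimal sketch of generic graph-based label propagation, in which a few seed labels spread to unlabeled items over a similarity graph. It is not the authors' implementation: the function name `label_propagation`, the Gaussian edge weights, the `sigma` and `n_iter` parameters, and the toy 2-D points standing in for tweet feature vectors are all assumptions made for illustration.

```python
import math


def label_propagation(points, labels, n_classes, sigma=1.0, n_iter=100):
    """Generic label propagation on a fully connected graph (illustrative only).

    points: list of feature vectors (lists of floats)
    labels: class index for each seed item, -1 for unlabeled items
    Returns one predicted class index per point.
    """
    n = len(points)

    def weight(a, b):
        # Gaussian (RBF) similarity; an assumed choice of edge weight.
        d2 = sum((x - y) ** 2 for x, y in zip(a, b))
        return math.exp(-d2 / (2 * sigma ** 2))

    # Symmetric weight matrix with no self-loops.
    W = [[weight(points[i], points[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    # Row-normalise the weights into transition probabilities.
    T = [[w / sum(row) for w in row] for row in W]

    def clamp(F):
        # Reset seed items to their known labels.
        for i, y in enumerate(labels):
            if y >= 0:
                F[i] = [1.0 if c == y else 0.0 for c in range(n_classes)]
        return F

    # Start every item from a uniform label distribution, then clamp seeds.
    F = clamp([[1.0 / n_classes] * n_classes for _ in range(n)])
    for _ in range(n_iter):
        # Each item takes the weighted average of its neighbours' distributions.
        F = [[sum(T[i][j] * F[j][c] for j in range(n)) for c in range(n_classes)]
             for i in range(n)]
        F = clamp(F)
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]


# Toy data: two well-separated clusters with one labeled seed in each.
toy_points = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [5.0, 5.0], [5.1, 4.9], [4.8, 5.2]]
toy_labels = [0, -1, -1, 1, -1, -1]
predicted = label_propagation(toy_points, toy_labels, n_classes=2)
print(predicted)
```

In this toy setting each unlabeled point takes the label of the seed in its own cluster. A real speech act task would replace the 2-D points with tweet feature vectors and use one class index per speech act type.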

@InProceedings{zhang-gao-li:2012:SASM2012,
  author    = {Zhang, Renxian  and  Gao, Dehong  and  Li, Wenjie},
  title     = {Towards Scalable Speech Act Recognition in Twitter: Tackling Insufficient Training Data},
  booktitle = {Proceedings of the Workshop on Semantic Analysis in Social Media},
  month     = {April},
  year      = {2012},
  address   = {Avignon, France},
  publisher = {Association for Computational Linguistics},
  pages     = {18--27},
  url       = {http://www.aclweb.org/anthology/W12-0603}
}