

TutorialBank: Using a Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation


The field of Natural Language Processing (NLP) is growing rapidly, with new research published daily along with an abundance of tutorials, codebases and other online resources. To learn this dynamic field or stay up-to-date on the latest research, students, educators and researchers must constantly sift through multiple sources to find valuable, relevant information. To address this situation, we introduce TutorialBank, a new, publicly available dataset which aims to facilitate NLP education and research. We have manually collected and categorized over 5,600 resources on NLP as well as the related fields of Artificial Intelligence (AI), Machine Learning (ML) and Information Retrieval (IR). Our dataset is notably the largest manually-picked corpus of resources intended for NLP education that is not limited to academic papers. Additionally, we have created both a search engine and a command-line tool for the resources and have annotated the corpus to include lists of research topics, relevant resources for each topic, prerequisite relations among topics, and relevant sub-parts of individual resources, among other annotations. We are releasing the dataset and present several avenues for further research.



Suggested Topics (up to Top 50)

Full Matches (full topic name in abstract)

Partial Matches (at least half of the words in the topic name appear in the abstract)
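
The two matching rules named above can be illustrated with a minimal sketch. This is not the site's actual implementation: the tokenization, the function name `match_type`, and the reading of "full match" as a contiguous word sequence are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens (assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match_type(topic, abstract):
    """Classify a topic name against an abstract as 'full', 'partial', or None.

    Hypothetical helper: 'full' means the whole topic name occurs as a
    contiguous word sequence; 'partial' means at least half of the topic's
    words occur somewhere in the abstract.
    """
    topic_words = tokenize(topic)
    if not topic_words:
        return None
    abstract_words = tokenize(abstract)
    n = len(topic_words)

    # Full match: the topic name appears word for word in the abstract.
    for i in range(len(abstract_words) - n + 1):
        if abstract_words[i:i + n] == topic_words:
            return "full"

    # Partial match: at least half of the topic's words appear anywhere.
    vocabulary = set(abstract_words)
    hits = sum(1 for word in topic_words if word in vocabulary)
    if 2 * hits >= n:
        return "partial"
    return None


if __name__ == "__main__":
    sample = "We collected resources on Information Retrieval and machine learning."
    for topic in ["Information Retrieval", "Machine Translation", "Speech Recognition"]:
        print(topic, "->", match_type(topic, sample))
```

Under these assumptions, a topic whose name appears verbatim is listed as a full match, and a two-word topic with only one word present still qualifies as a partial match.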

Suggested Resources

Uses the abstract to search the content of resources available in Topics. Results are sorted by relevance.

| # | Title | Author | Topic | Medium | Score |
|---|---|---|---|---|---|
| 1 | Opinion mining and sentiment analysis | Bo Pang and Lillian Lee | 1122 | | |
| 2 | Scholarly Data Mining: Making Sense of Scientific Literature | Horacio Saggion & Francesco Ronzano | 1151 | | |
| 3 | Natural Language Processing for Intelligent Access to Scientific Information | Francesco Ronzano, Horacio Saggion | 1089 | | |
| 4 | Question Answering Techniques for the World Wide Web | Jimmy Lin, Boris Katz | 1125 | | |
| 5 | Endangered Languages | Richard Littauer | 999 | | |
| 6 | Natural Language Data Management and Interfaces | Yunyao Li, Davood Rafiei | 1151 | | |
| 7 | Automatic Summarization | Ani Nenkova and Kathleen McKeown | 1129 | | |
| 8 | Content-based citation analysis: The next generation of citation analysis | Ying Ding, Guo Zhang, Tamy Chambers, Min Song, Xiaolong Wang, Cheng... | 9999 | | |
| 9 | Sentiment Analysis and Opinion Mining | Bing Liu | 1122 | | |
| 10 | Text Data Management and Analysis | M. Tamer Özsu | 1581 | | |
| 11 | Artificial Intelligence and Games | Georgios N. Yannakakis and Julian Togelius | 1307 | | |
| 12 | Rapid Understanding of Scientific Paper Collections: Integrating Statistics, Text Analytics, and Visualization | Cody Dunne, Ben Shneiderman, Robert Gove, Judith Klavans, Bonnie Dorr | 9999 | | |
| 13 | Jumping NLP Curves: A Review of Natural Language Processing Research | Erik Cambria, Bebo White | 1053 | | |
| 14 | Applications of Social Media Text Analysis | Atefeh Farzindar, Diana Inkpen | 1231 | | |
| 15 | Recent Advances in Document Summarization | Jin-ge Yao, Xiaojun Wan, Jianguo Xiao | 1129 | | |
| 16 | Gimli: open source and high-performance biomedical name recognition | David Campos, Sergio Matos, Jose Oliveira | 9999 | | |
| 17 | Named Entity Recognition and Classification | David Nadeau, Satoshi Sekine | 1089 | | |
| 18 | Information Extraction | Sunita Sarawagi | 1089 | | |
| 19 | Spoken Content Retrieval: A Survey of Techniques and Technologies | Martha Larson and Gareth J. F. Jones | 1252 | | |
| 20 | Computational Analysis of Affect and Emotion in Language | Saif M. Mohammad, Cecilia Ovesdotter Alm | 1122 | | |
| 21 | Automated Question Answering: Review of the Main Approaches | Andrea Andrenucci, Eriks Sneiders | 1125 | | |
| 22 | Natural language based financial forecasting: a survey | Frank Z. Xing, Erik Cambria, Roy E. Welsch | 9999 | | |
| 23 | Word Sense Disambiguation: A Survey | Roberto Navigli | 1124 | | |
| 24 | Test Collection Based Evaluation of Information Retrieval Systems | Mark Sanderson | 1171 | | |
| 25 | The Creation and Analysis of a Website Privacy Policy Corpus | Shomir Wilson, Florian Schaub, Aswarth Abhilash Dara, Frederick Liu... | 9999 | | |
| 26 | A Survey of Text Summarization Techniques | Ani Nenkova, Kathleen McKeown | 1129 | | |
| 27 | Evaluative language beyond bags of words: Linguistic insights and computational applications | Benamara, Farah and Taboada, Maite and Mathieu, Yannick | 1265 | | |
| 28 | Opinion Mining: Exploiting the Sentiment of the Crowd | Diana Maynard, Adam Funk, Kalina Bontcheva | 1122 | | |
| 29 | Tutorial on BioText Mining | Martin Krallinge | 1150 | | |
| 30 | Multiword Expression Processing: A Survey | Mathieu Constant, Gülşen Eryiğit, Johanna Monti, Lonneke van der Plas | 1122 | | |
| 31 | Introduction to Text Summarization and Other Information Access Technologies | Horacio Saggion | 1129 | | |
| 32 | Earlier Web usage statistics as predictors of later citation impact | Tim Brody, Stevan Harnad, Leslie Carr | 9999 | | |
| 33 | Semantic Parsing 2, Question Answering | Mohit Bansal | 1125 | | |
| 34 | Natural language processing: an introduction | Prakash M Nadkarni, Lucila Ohno-Machado, Wendy W Chapman | 1052 | | |
| 35 | Deep Learning for Sentiment Analysis: A Survey | Lei Zhang and Shuai Wang and Bing Liu | 1045 | | |
| 36 | A Survey on Automatic Text Summarization | Dipanjan Das, Andre F.T. Martins | 1125 | | |
| 37 | Information Extraction | Katharina Kaiser and Silvia Miksch | 1089 | | |
| 38 | Open-Domain Question Answering | John Prager | 1125 | | |
| 39 | TabMCQ: A Dataset of General Knowledge Tables and Multiple-choice Questions | Sujay Kumar Jauhar, Peter Turney, Eduard Hovy | 9999 | | |
| 40 | Neural Information Retrieval: At the End of the Early Years | Kezban Dilek Onal, Ye Zhang, Ismail Sengor Altingovde, Md Mustafizu... | 1183 | | |
| 41 | Characterizing Interdisciplinarity of Researchers and Research Topics Using Web Search Engines | Hiroki Sayama, Jin Akaishi | 9999 | | |
| 42 | Highlights of EMNLP 2017: Exciting Datasets, Return of the Clusters, and More! | Sebastian Ruder | 1180 | | |
| 43 | Coarse Discourse: A Dataset for Understanding Online Discussions | Praveen Paritosh, Ka Wong | 1251 | | |
| 44 | Libraries | N/A | 9999 | | |
| 45 | All Fingers are not Equal: Intensity of References in Scientific Articles | Tanmoy Chakraborty, Ramasuri Narayanam | 9999 | | |
| 46 | Multiword Expression Processing: A Survey | Mathieu Constant, Gülşen Eryiğit, Johanna Monti, Lonneke van der Pl... | 1254 | | |
| 47 | Multiword expression processing: A survey | Constant, Mathieu and Eryiğit, Gülşen and Monti, Joh... | 1556 | | |
| 48 | Multilingual Sentiment and Subjectivity Analysis | Rada Mihalcea, Carmen Banea, Janyce Wiebe | 1122 | | |
| 49 | Neural Approaches to Conversational AI: Question Answering, Task-Oriented Dialogue and Chatbots: A Unified View | Jianfeng Gao, Michel Galley, Lihong Li | 1141 | | |
| 50 | Search Engines: Information Retrieval in Practice | W. Bruce Croft, Donald Metzler, Trevor Strohman | 1170 | | |