TutorialBank: Using a Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation


The field of Natural Language Processing (NLP) is growing rapidly, with new research published daily along with an abundance of tutorials, codebases and other online resources. To learn this dynamic field or stay up to date on the latest research, students, educators and researchers must constantly sift through multiple sources to find valuable, relevant information. To address this situation, we introduce TutorialBank, a new, publicly available dataset that aims to facilitate NLP education and research. We have manually collected and categorized over 5,600 resources on NLP as well as the related fields of Artificial Intelligence (AI), Machine Learning (ML) and Information Retrieval (IR). Our dataset is notably the largest manually picked corpus of resources intended for NLP education that is not limited to academic papers. Additionally, we have created both a search engine and a command-line tool for the resources and have annotated the corpus with lists of research topics, relevant resources for each topic, prerequisite relations among topics and relevant sub-parts of individual resources, among other annotations. We are releasing the dataset and present several avenues for further research.


Suggested Topics (up to Top 50)

Full Matches (full topic name in abstract)

Partial Matches (at least half of the words in the topic name appear in the abstract)

Suggested Resources

Uses the abstract to search the content of the resources available in Topics. Results are sorted by relevance.

| # | Title | Author | Topic |
|---|-------|--------|-------|
| 1 | Opinion mining and sentiment analysis | Bo Pang and Lillian Lee | 1122 |
| 2 | A survey of available corpora for building data-driven dialogue systems | Iulian Vlad Serban, Ryan Lowe, Laurent Charlin, Joelle Pineau | 9999 |
| 3 | Scholarly Data Mining: Making Sense of Scientific Literature | Horacio Saggion & Francesco Ronzano | 1151 |
| 4 | Natural Language Processing for Information Extraction | Sonit Singh | 1089 |
| 5 | Query Expansion Techniques for Information Retrieval: a Survey | Hiteshwar Kumar Azad, Akshay Deepak | 1174 |
| 6 | Natural Language Processing for Intelligent Access to Scientific Information | Francesco Ronzano, Horacio Saggion | 1089 |
| 7 | Neural Approaches to Conversational AI | Jianfeng Gao, Michel Galley, Lihong Li | 1200 |
| 8 | Natural Language Data Management and Interfaces | Yunyao Li, Davood Rafiei | 1151 |
| 9 | Endangered Languages | Richard Littauer | 999 |
| 10 | Automatic Summarization | Ani Nenkova and Kathleen McKeown | 1129 |
| 11 | Question Answering Techniques for the World Wide Web | Jimmy Lin, Boris Katz | 1125 |
| 12 | Neural Information Retrieval: A Literature Review | Ye Zhang, Md Mustafizur Rahman, Alex Braylan, Brandon Dang, Heng-Lu... | 1169 |
| 13 | Sentiment Analysis and Opinion Mining | Bing Liu | 1122 |
| 14 | Artificial Intelligence and Games | Georgios N. Yannakakis and Julian Togelius | 1307 |
| 15 | Open-Domain Question Answering | Mark Andrew Greenwood | 1125 |
| 16 | Content-based citation analysis: The next generation of citation analysis | Ying Ding, Guo Zhang, Tamy Chambers, Min Song, Xiaolong Wang, Cheng... | 9999 |
| 17 | Computational Sociolinguistics: A Survey | Dong Nguyen, A. Seza Dogruöz, Carolyn Penstein Rosé, Franciska de Jong | 9999 |
| 18 | Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation | Albert Gatt, Emiel Krahmer | 1136 |
| 19 | Text Data Management and Analysis | M. Tamer Özsu | 1581 |
| 20 | Rapid Understanding of Scientific Paper Collections: Integrating Statistics, Text Analytics, and Visualization | Cody Dunne, Ben Shneiderman, Robert Gove, Judith Klavans, Bonnie Dorr | 9999 |
| 21 | Jumping NLP Curves: A Review of Natural Language Processing Research | Erik Cambria, Bebo White | 1053 |
| 22 | Recent Advances in Document Summarization | Jin-ge Yao, Xiaojun Wan, Jianguo Xiao | 1129 |
| 23 | From Word to Sense Embeddings: A Survey on Vector Representations of Meaning | Jose Camacho-Collados, Mohammad Taher Pilehvar | 1186 |
| 24 | Applications of Social Media Text Analysis | Atefeh Farzindar, Diana Inkpen | 1231 |
| 25 | Gimli: open source and high-performance biomedical name recognition | David Campos, Sergio Matos, Jose Oliveira | 9999 |
| 26 | Natural language based financial forecasting: a survey | Frank Z. Xing, Erik Cambria, Roy E. Welsch | 9999 |
| 27 | The Handbook of Computational Linguistics and Natural Language Processing | Alexander Clark, Chris Fox, and Shalom Lappin | 1557 |
| 28 | Named Entity Recognition and Classification | David Nadeau, Satoshi Sekine | 1089 |
| 29 | Information Extraction | Sunita Sarawagi | 1089 |
| 30 | A Survey of Automated Text Simplification | Matthew Shardlow | 1134 |
| 31 | Computational Analysis of Affect and Emotion in Language | Saif M. Mohammad, Cecilia Ovesdotter Alm | 1122 |
| 32 | Spoken Content Retrieval: A Survey of Techniques and Technologies | Martha Larson and Gareth J. F. Jones | 1252 |
| 33 | Citation Classification for Behavioral Analysis of a Scientific Field | David Jurgens, Srijan Kumar, Raine Hoover, Dan McFarland, Dan Jurafsky | 9999 |
| 34 | A Survey on Multi-output Learning | Donna Xu, Yaxin Shi, Ivor W. Tsang, Yew-Soon Ong, Chen Gong, and X... | 1563 |
| 35 | A Review of 40 Years of Cognitive Architecture Research: Focus on Perception, Attention, Learning and Applications | Iuliia Kotseruba, Oscar J Avella Gonzalez, John K Tsotsos | 9999 |
| 36 | A Survey on Lexical Simplification | Gustavo H. Paetzold, Lucia Specia | 1134 |
| 37 | A Benchmark Comparison of State-of-the-Practice Sentiment Analysis Methods | Filipe Nunes Ribeiro, Matheus Araújo, Pollyanna Gonçalves, Fabrício... | 9999 |
| 38 | Word Sense Disambiguation: A Survey | Roberto Navigli | 1124 |
| 39 | Test Collection Based Evaluation of Information Retrieval Systems | Mark Sanderson | 1171 |
| 40 | Reflections on Sentiment/Opinion Analysis | Jiwei Li, Eduard H. Hovy | 9999 |
| 41 | A Survey of Text Summarization Techniques | Ani Nenkova, Kathleen McKeown | 1129 |
| 42 | Survey on Evaluation Methods for Dialogue Systems | Jan Deriu | 1126 |
| 43 | A Survey of Code-switched Speech and Language Processing | Sunayana Sitaram | 1252 |
| 44 | Multiword Expression Processing: A Survey | Mathieu Constant, Gülşen Eryiğit, Johanna Monti, Lonneke van der Plas | 1122 |
| 45 | Evaluative language beyond bags of words: Linguistic insights and computational applications | Benamara, Farah and Taboada, Maite and Mathieu, Yannick | 1265 |
| 46 | Automated Question Answering: Review of the Main Approaches | Andrea Andrenucci, Eriks Sneiders | 1125 |
| 47 | Opinion Mining: Exploiting the Sentiment of the Crowd | Diana Maynard, Adam Funk, Kalina Bontcheva | 1122 |
| 48 | Neural Models for Information Retrieval | Bhaskar Mitra, Nick Craswell | 1089 |
| 49 | The Creation and Analysis of a Website Privacy Policy Corpus | Shomir Wilson, Florian Schaub, Aswarth Abhilash Dara, Frederick Liu... | 9999 |
| 50 | Earlier Web usage statistics as predictors of later citation impact | Tim Brody, Stevan Harnad, Leslie Carr | 9999 |