

TutorialBank: Using a Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation


The field of Natural Language Processing (NLP) is growing rapidly, with new research published daily along with an abundance of tutorials, codebases and other online resources. In order to learn this dynamic field or stay up-to-date on the latest research, students as well as educators and researchers must constantly sift through multiple sources to find valuable, relevant information. To address this situation, we introduce TutorialBank, a new, publicly available dataset which aims to facilitate NLP education and research. We have manually collected and categorized over 5,600 resources on NLP as well as the related fields of Artificial Intelligence (AI), Machine Learning (ML) and Information Retrieval (IR). Our dataset is notably the largest manually-picked corpus of resources intended for NLP education which does not include only academic papers. Additionally, we have created both a search engine and a command-line tool for the resources and have annotated the corpus to include lists of research topics, relevant resources for each topic, prerequisite relations among topics, relevant sub-parts of individual resources, among other annotations. We are releasing the dataset and present several avenues for further research.



Suggested Topics

Full Matches (full topic name in abstract)

Partial Matches (at least half of the words in the topic name appear in the abstract)
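
The two matching rules above can be sketched as follows. This is a minimal illustration, not the site's actual implementation: the function names, the tokenization (lowercased alphanumeric words) and the example topic names are all assumptions made for the sketch.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match_topics(abstract, topics):
    """Split topic names into full and partial matches against an abstract.

    Full match: the whole topic name occurs in the abstract as a word sequence.
    Partial match: at least half of the topic's words occur in the abstract.
    """
    abstract_tokens = tokenize(abstract)
    abstract_words = set(abstract_tokens)
    # Pad with spaces so that only whole-word sequences can match.
    padded = " " + " ".join(abstract_tokens) + " "
    full, partial = [], []
    for topic in topics:
        words = tokenize(topic)
        if not words:
            continue
        if " " + " ".join(words) + " " in padded:
            full.append(topic)
        elif 2 * sum(w in abstract_words for w in words) >= len(words):
            partial.append(topic)
    return full, partial


abstract = ("We have manually collected and categorized over 5,600 resources "
            "on NLP as well as the related fields of Machine Learning (ML) "
            "and Information Retrieval (IR).")
# Hypothetical topic names, for illustration only.
topics = ["Machine Learning", "Information Retrieval",
          "Machine Translation", "Dependency Parsing"]
full, partial = match_topics(abstract, topics)
print(full)     # ['Machine Learning', 'Information Retrieval']
print(partial)  # ['Machine Translation']
```

A topic that qualifies as a full match is not also reported as a partial match, which keeps the two lists disjoint; whether the site does the same is not stated on the page.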

Suggested Resources

Uses the abstract to search the content of the resources available in Topics. Results are sorted by relevance score.

# | Title | Author | Topic | Medium | Score
1 | Highlights of EMNLP 2017: Exciting Datasets, Return of the Clusters, and More! | Sebastian Ruder | 641 | resource | 270.23
2 | Diversity and network coherence as indicators of interdisciplinarity: case studies in bionanoscience | Ismael Rafols, Martin Meyer | 999 | paper | 250.63
3 | Do Altmetrics Work? Twitter and Ten Other Social Web Services | Mike Thelwall, Stefanie Haustein, Vincent Larivière, Cassidy R. Sugimoto | 999 | paper | 239.78
4 | Codra: A Novel Discriminative Framework for Rhetorical Analysis | Shafiq Joty, Giuseppe Carenini, Raymond T. Ng | 999 | paper | 238.33
5 | NLP’s generalization problem, and how researchers are tackling it | Ana Marasovic | 711 | resource | 232.51
7 | An index to quantify an individual’s scientific research output that takes into account the effect of multiple coauthorship | J. E. Hirsch | 999 | paper | 228.30
8 | State-of-the-art neural coreference resolution for chatbots | Thomas Wolf | 756 | tutorial | 227.76
9 | The data that transformed AI research—and possibly the world | Dave Gershgorn | 107 | resource | 227.21
10 | Similarity-driven Semantic Role Induction via Graph Partitioning | Joel Lang, Mirella Lapata | 999 | paper | 225.96
11 | Clustering cliques for graph-based summarization of the biomedical research literature | Han Zhang, Marcelo Fiszman, Dongwook Shin, Bartłomiej Wilkowski, Thomas Rindflesch | 999 | paper | 222.55
12 | The history and meaning of the journal impact factor | Eugene Garfield | 999 | paper | 220.53
13 | Wordnet, getting your hands dirty | bogdani | 315 | resource | 218.03
14 | Train Neural Machine Translation Models with Sockeye | Felix Hieber, Tobias Domhan | 753 | tutorial | 217.92
15 | DEEP LEARNING FOR CHATBOTS, PART 1 - INTRODUCTION | Denny Britz | 445 | tutorial | 213.20
16 | NLP's ImageNet moment has arrived | Sebastian Ruder | 862 | resource | 211.15
17 | Discriminative Syntax-based Word Ordering for Text Generation | Yue Zhang, Stephen Clark | 999 | paper | 208.58
18 | Neural Information Retrieval: At the End of the Early Years | Kezban Dilek Onal, Ye Zhang, Ismail Sengor Altingovde, Md Mustafizur Rahman, Pinar Karagoz, Alex Braylan, Brandon Dang, Heng-Lu Chang, Henna Kim, Quinten McNamara, Aaron Angert, Edward Banner, Vivek K | 713 | resource | 208.10
19 | Recurrent Neural Networks Tutorial, Part 2 - Implementing a RNN with Python, Numpy, and Theano | Denny Britz | 741 | tutorial | 207.33
20 | A general framework for analysing diversity in science, technology and society | Andy Stirling | 999 | paper | 206.40
21 | Rapid Understanding of Scientific Paper Collections: Integrating Statistics, Text Analytics, and Visualization | Cody Dunne, Ben Shneiderman, Robert Gove, Judith Klavans, Bonnie Dorr | 999 | paper | 200.74
22 | Lexicalization and Generative Power in CCG | Marco Kuhlmann, Alexander Koller, Giorgio Satta | 999 | paper | 198.95
23 | How do we capture structure in relational data? | Matthew Das Sarma | 711 | resource | 197.96
24 | The h’-Index, Effectively Improving the h-Index Based on the Citation Distribution | Chun-Ting Zhang | 999 | paper | 195.03
25 | Earlier Web usage statistics as predictors of later citation impact | Tim Brody, Stevan Harnad, Leslie Carr | 999 | paper | 192.87
26 | A survey of transfer learning | Karl Weiss, Taghi M. Khoshgoftaar, DingDing Wang | 978 | resource | 191.36
27 | DeepMind has a bigger plan for its newest Go-playing AI | Dave Gershgorn | 811 | resource | 191.16
28 | A social network's changing statistical properties and the quality of human innovation | Brian Uzzi | 999 | paper | 189.95
29 | Introducing Gluon - An Easy-to-Use Programming Interface for Flexible Deep Learning | Vikram Madan | 731 | resource | 188.42
30 | Recurrent Neural Network Tutorial, Part 4 – Implementing a GRU/LSTM RNN with Python and Theano | Denny Britz | 742 | tutorial | 188.28
31 | An Intuitive Guide to Linear Algebra | Kalid Azad | 121 | tutorial | 185.88
32 | Recent Advances in Document Summarization | Jin-ge Yao, Xiaojun Wan, Jianguo Xiao | 421 | survey | 185.33
33 | Machine Learning for Humans | Vishal Maini, Samer Sabri | 134 | tutorial | 183.39
34 | Future impact: Predicting scientific success | Daniel E. Acuna, Stefano Allesina, Konrad P. Kording | 999 | paper | 183.23
35 | RNNs in Tensorflow, a Practical Guide and Undocumented Features | Denny Britz | 741 | tutorial | 177.61
36 | Bayesian Statistics explained to Beginners in Simple English | NSS | 102 | tutorial | 176.94
37 | Relevance of Unsupervised Metrics in Task-Oriented Dialogue for Evaluating Natural Language Generation | Author Unknown | 952 | resource | 176.90
38 | Gimli: open source and high-performance biomedical name recognition | David Campos, Sergio Matos, Jose Oliveira | 999 | paper | 175.90
39 | Learning Reinforcement Learning (with Code, Exercises and Solutions) | Denny Britz | 713 | tutorial | 175.08
40 | The e-Index, Complementing the h-Index for Excess Citations | Chun-Ting Zhang | 999 | paper | 174.84
41 | Recurrent Neural Networks Tutorial, Part 1 - Introduction to RNNs | Denny Britz | 741 | tutorial | 174.04
42 | Natural Language Processing in Artificial Intelligence is almost human-level accurate. Worse yet, it gets smart! | Rafal | 133 | tutorial | 174.01
43 | The Future (and Present) of Artificial Intelligence AMA | Various Authors | 811 | resource | 173.09
44 | Rohan #2: Artificial intelligence, ΔProgress/ΔTime | Rohan Kapur | 811 | tutorial | 172.76
45 | A Practitioner's Guide to Natural Language Processing (Part I) — Processing & Understanding Text | Dipanjan (DJ) Sarker | 112 | resource | 172.34
46 | Recurrent Neural Networks Tutorial, Part 3 - Backpropagation Through Time and Vanishing Gradients | Denny Britz | 741 | tutorial | 171.99
47 | Citation Analysis and Discourse Analysis Revisited | Howard D. White | 999 | paper | 170.23
48 | Negated bio-events: analysis and identification | Raheel Nawaz, Paul Thompson, Sophia Ananiadou | 999 | paper | 169.99
49 | Is science becoming more interdisciplinary? Measuring and mapping six research fields over time | Alan L. Porter, Ismael Rafols | 999 | paper | 169.43
50 | Deep Learning from first principles in Python, R and Octave – Part 3 | Tinniam V Ganesh | 711 | resource | 168.13
51 | Little Science, Big Science...and Beyond | Derek J. Price | 999 | paper | 167.94
52 | Quora Duplicate Questions Corpus | Quora | 151 | corpus | 167.20
53 | K-Means & Other Clustering Algorithms: A Quick Intro with Python | Nikos Koufos | 571 | tutorial | 166.77
54 | A Complete Tutorial to Learn Data Science with Python from Scratch | Kunal Jain | 131 | tutorial | 165.63
55 | 25 Open Datasets for Deep Learning Every Data Scientist Must Work With | Pranav Dar | 731 | resource | 165.34
56 | Natural Language Processing (NLP) for Computational Social Science | Cristian Danescu-Niculescu-Mizil, Lillian Lee | 133 | tutorial | 165.01
57 | Transfer Learning - Machine Learning's Next Frontier | Sebastian Ruder | 978 | tutorial | 164.77
58 | Automatic Labeling of Semantic Roles | Daniel Gildea, Daniel Jurafsky | 999 | paper | 164.46
59 | The Definitive Guide to Natural Language Processing | Javier Couto | 133 | tutorial | 161.68
60 | The 7 NLP Techniques That Will Change How You Communicate in the Future (Part I) | James Le | 112 | resource | 161.44
61 | From Natural Language Processing to Artificial Intelligence | Jonathan Mugan | 133 | tutorial | 160.93
62 | A Hirsch-type index for journals | Tibor Braun, Wolfgang Glänzel, András Schubert | 999 | paper | 160.83
63 | Introduction to Neural Machine Translation with GPUs (part 3) | Kyunghyun Cho | 753 | tutorial | 160.54
64 | Neural networks: training with backpropagation. | Jeremy Jordan | 711 | resource | 160.52
65 | Quadratic entropy and analysis of diversity | C. R. Rao | 999 | paper | 160.18
66 | A Comparative Analysis of ChatBots APIs | Author Unknown | 921 | resource | 159.90
67 | The 7 NLP Techniques That Will Change How You Communicate in the Future (Part II) | James Le | 133 | resource | 159.15
68 | Introductory Guide to Artificial Intelligence | Egor Dezhic | 811 | resource | 158.63
69 | Introduction to Natural Language Processing (NLP) 2016 | Matt Kiser | 133 | tutorial | 157.62
70 | Webcrow: A Web-Based Crosswords Solver | Giovanni Angelini, Marco Ernandes, Marco Gori | 999 | paper | 157.24
71 | The Alignment Template Approach to Statistical Machine Translation | Franz Josef Och, Hermann Ney | 999 | paper | 156.98
72 | The Evolution and Core Concepts of Deep Learning & Neural Networks | Guest Blog | 711 | tutorial | 156.79
73 | Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences | Hongyuan Mei, Mohit Bansal, Matthew R. Walter | 999 | paper | 156.67
74 | Ultimate Guide to Understand & Implement Natural Language Processing (with codes in Python) | Shivam Bansal | 131 | tutorial | 156.17
75 | Modelling, visualising and summarising documents with a single convolutional neural network | Misha Denil, Alban Demiraj, Nal Kalchbrenner, Phil Blunsom, Nando de Freitas | 744 | paper | 156.07
76 | 100+ Interesting Data Sets for Statistics | Robb Seaton | 999 | corpus | 155.35
77 | News Article Wikipedia Dataset | Author Unknown | 999 | library | 154.89
78 | Image-to-Image Translation in Tensorflow | Christopher Hesse | 731 | tutorial | 154.88
79 | Hierarchical Phrase-Based Translation | David Chiang | 999 | paper | 153.62
80 | Web Scraping in Python using Scrapy (with multiple examples) | Mohd Sanad Zaki Rizvi | 999 | resource | 153.60
81 | Doc2vec tutorial | Radim Rehurek | 721 | tutorial | 153.12
82 | A Brief Introduction to Graphical Models and Bayesian Networks | Kevin Murphy | 967 | resource | 152.08
83 | Complete guide to build your own Named Entity Recognizer with Python | bogdani | 232 | resource | 151.75
84 | Learning about the world through video | Moritz Mueller-Freitag | 811 | resource | 151.27
85 | Comparing Top Deep Learning Frameworks: Deeplearning4j, Torch, Theano, TensorFlow, Caffe, Paddle, MxNet, Keras & CNTK | Skymind | 731 | resource | 151.18
86 | ICML+ACL’18: Structure Back in Play, Translation Wants More Context | Andre Martins | 956 | resource | 151.10
87 | Models for predicting and explaining citation count of biomedical articles | Lawrence D. Fu, Constantin Aliferis | 999 | paper | 151.09
88 | Deep Learning in NLP | Vered Shwartz | 711 | resource | 151.07
89 | Machine Learning | Morteza Shahriari Nia | 107 | tutorial | 151.06
90 | A Beginner’s Guide to Deep Reinforcement Learning | Adam Gibson, Chris Nicholson, Josh Patterson | 857 | library | 150.36
91 | Tombones Computer Vision Blog | Tomasz Malisiewicz | 958 | resource | 150.33
92 | Four deep learning trends from ACL 2017: Part 1 | Abigail See | 713 | resource | 149.75
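
The page does not say how the Score column is computed. As a rough illustration of ranking resources against an abstract, the sketch below uses a simple TF-IDF overlap score. The weighting scheme, the function names and the toy resource texts are all assumptions chosen for the example, so its scores are not comparable to the values listed above.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_resources(query, resources):
    """Return (title, score) pairs sorted by descending TF-IDF overlap score.

    `resources` maps a resource title to the text content of that resource.
    """
    docs = {title: Counter(tokenize(body)) for title, body in resources.items()}
    n_docs = len(docs)
    # Document frequency: number of resources that contain each term.
    df = Counter(term for counts in docs.values() for term in counts)
    query_terms = set(tokenize(query))
    scored = []
    for title, counts in docs.items():
        # Sum term frequency times a smoothed inverse document frequency
        # over the query terms that occur in this resource.
        score = sum(counts[t] * math.log(1 + n_docs / df[t])
                    for t in query_terms if t in counts)
        scored.append((title, round(score, 2)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical resources, for illustration only.
resources = {
    "A": "neural machine translation tutorial with attention",
    "B": "a guide to linear algebra",
    "C": "nlp resources and tutorials for machine learning education",
}
ranked = rank_resources("resources for NLP education and machine learning",
                        resources)
for title, score in ranked:
    print(title, score)
```

In this toy run, resource C shares the most terms with the query and ranks first, while B shares none and receives a score of zero.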