LectureBank: A Collection of Lecture Notes and Topics

Our LectureBank dataset contains English lecture files collected from university courses, mainly in the field of Natural Language Processing (NLP). Each file is manually classified according to an existing taxonomy. Together with the dataset, we include 322 topics with manually labeled prerequisite relations. Check out our most recent paper, "R-VGAE: Relational-variational Graph Autoencoder for Unsupervised Prerequisite Chain Learning" (see GitHub), by Irene Li, Alexander Fabbri, Swapnil Hingmire, and Dragomir Radev.

TutorialBank: Learning NLP Made Easier

NLP is growing rapidly, and as a result, advancing in the field can seem daunting to students and researchers alike. To help the growing NLP community and advance research on NLP for educational applications, we introduced a new corpus in our ACL 2018 paper, "TutorialBank: Using a Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation" (see GitHub), by Alexander Fabbri, Irene Li, Prawat Trairatvorakul, Yijiao He, Wei Tai Ting, Robert Tung, Caitlin Westerfield, and Dragomir Radev.


May 26, 2022

We release a new paper "CLICKER: A Computational LInguistics Classification Scheme for Educational Resources" by the LILY group at Yale!

May 25, 2022

We are pleased to release the new, 2022 version of the AAN database with over 24,000 resources and over 7,000 lecture notes! Please check out our blog post here for more information.


Number of papers: 24,622
Number of authors: 18,904
Number of venues: 374
Number of citations: 124,884
Number of resources: 24,766