Resources
More links will be added to this page. If you have ones you'd like to add, please post on Campuswire.
Books
- Recommended: Jurafsky and Martin: Speech and Language Processing (2nd edition), Prentice Hall (2008) Amazon, 3rd edition (w/ draft chapters)
- Yoav Goldberg: Neural Network Methods for NLP, Morgan and Claypool (2017) Online version (free if on campus)
- Manning and Schuetze, Foundations of Statistical Natural Language Processing, MIT Press (1999) Website, Amazon, Online version (free if on campus)
- Jacob Eisenstein, Natural Language Processing, MIT Press (forthcoming) Draft
- Noah Smith, Linguistic Structure Prediction, Morgan and Claypool (2011) Website, Online version (free if on campus)
- Emily Bender, Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Morphology and Syntax, Morgan and Claypool (2013) Online version (free if on campus)
- Bird, Klein, and Loper, Natural Language Processing with Python, O’Reilly (2009) Amazon
Papers
The following are not the most representative, influential, or "best" papers in NLP; they are a somewhat diverse selection of recent papers.
- J. Howard and S. Ruder. Universal Language Model Fine-tuning for Text Classification. Association for Computational Linguistics (ACL). 2018
- M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, L. Zettlemoyer. Deep contextualized word representations. North American Chapter of the Association for Computational Linguistics (NAACL). 2018
- J. Devlin, M.-W. Chang, K. Lee, K. Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv. 2018
- J. Weston, A. Bordes, S. Chopra, A. M. Rush, B. van Merrienboer, A. Joulin and T. Mikolov. Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks. International Conference on Learning Representations (ICLR). 2016
- X. Ling, S. Singh and D. Weld. Design Challenges for Entity Linking. Transactions of the Association for Computational Linguistics (TACL). 2015
- K. Raghunathan, H. Lee, S. Rangarajan, N. Chambers, M. Surdeanu, D. Jurafsky and C. Manning. A Multi-Pass Sieve for Coreference Resolution. Empirical Methods in Natural Language Processing (EMNLP). 2010
- G. Durrett and D. Klein. Easy Victories and Uphill Battles in Coreference Resolution. Empirical Methods in Natural Language Processing (EMNLP). 2013
- I. Sutskever, O. Vinyals and Q. V. Le. Sequence to Sequence Learning with Neural Networks. Neural Information Processing Systems (NIPS). 2014
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin. Attention Is All You Need. Neural Information Processing Systems (NIPS). 2017
- M. Seo, A. Kembhavi, A. Farhadi, H. Hajishirzi. Bidirectional Attention Flow for Machine Comprehension. International Conference on Learning Representations (ICLR). 2017
Software Resources
- AllenNLP: Library for NLP research, built on top of PyTorch
- PyTorch: library for neural networks
- TensorFlow: for neural networks and deep learning
- NLTK: Python NLP toolkit
- Stanford CoreNLP, online demo
- spaCy.io: NLP toolkit
- Gensim: for learning word embeddings
- Twitter API, tutorial
- Wikipedia Extractor: for extracting text from Wikipedia.
Datasets and Other Resources
- NewsQA
- DeepMindQA
- TrumpTwitter, to get the data, use something like this, years 2009-2016, a related page
- Topic Models datasets
- Wikipedia Text
- Project Gutenberg
- LOB corpus
- British National Corpus
- NLTK Data (including Brown Corpus)
- 1 Billion Word Language Model, Pretrained Google model
- AlphaGo commentaries
- Open data sets
- Kaggle
- Amazon AWS
- Reddit
- Word-emotion lexicon
- StanfordQA dataset (SQuAD)
- Fake News Challenge
- Allen Institute for AI (AI2) Datasets
- Open AI Resources