| Lecture Times | WF 12:45-2pm (first meeting Wed 9 Jan 12:30pm) |
| Lecture Location | AP&M 4301 |
| Class webpage | http://grammar.ucsd.edu/courses/lign256/ |
| Instructor | Roger Levy (rlevy@ling.ucsd.edu) |
| Instructor's office | AP&M 4220 |
| Instructor's office hours | WF 2-3pm |
The goal of this course is to train you, the students, to do research in natural language processing — work that can potentially be published in the leading conferences and journals of the field. In addition to helping you succeed academically in this field (and related fields including AI, machine learning, and psycholinguistics), this is also great training if you are interested in doing NLP work in industry, either in a research lab (Google, Microsoft, Powerset, Yahoo, etc.) or in a startup.
This course is intended for graduate students in linguistics, computer science, engineering, cognitive science, psychology, and any other discipline who are interested in how to process natural language by computer. Highly motivated undergraduates are also welcome, but please talk to the instructor before enrolling.
We're going to be using the two premier textbooks in the field for this course:
There will also be at least one reading from the following new textbook:
We may occasionally read recent papers from the literature as well.
Working in natural language processing requires putting several different types of skills together:
You may not have all of these skills yet, but hopefully you have a substantial subset of them. You may need to do some extra work to strengthen your background in any area where you're deficient; the class itself will focus on how to put these skills together.
The mailing list for the class is ligncse256@ling.ucsd.edu. Sign up for the mailing list here.
| Week | Day | Topic | Readings | Materials | Homework Assignments |
|---|---|---|---|---|---|
| Week 1 | 9 Jan | Class Introduction | M&S 1, 2 | Lecture 1 [PDF] | |
| | 11 Jan | Language Modeling I | M&S Chapter 6, J&M Chapter 4 | Lecture 2 [PDF] | |
| Week 2 | 16 Jan | Language Modeling II | Chen and Goodman 1998 (an absolute classic) | Lecture 3 [PDF]; Kneser-Ney mini-example | Programming Assignment 1 (due 1 Feb) |
| | 18 Jan | Text Categorization | MRS 2008, Chapter 13 | Lecture 4 [PDF] | |
| Week 3 | 23 Jan | Word-sense Disambiguation | M&S Chapter 7, J&M Chapter 20 | Lecture 5 [PDF] | |
| | 25 Jan | Part-of-speech Tagging I | M&S Chapter 9, J&M Chapter 6 | Lecture 6 [PDF]; HMM Viterbi inference mini-example | |
| Week 4 | 30 Jan | Part-of-speech Tagging II | M&S Chapter 10 | | Programming Assignment 2 (due 15 Feb) |
| | 1 Feb | Syntax | M&S Chapter 10, J&M Chapter 12 | Lecture 8 | |
| Week 5 | 6 Feb | Roger out of town: no class | | | |
| | 8 Feb | Syntactic disambiguation | none (catch up!) | Lecture 9 [PDF] | |
| Week 6 | 13 Feb | Parsing I | M&S Chapter 11 | Lecture 10 [PDF] | Final project guidelines go out |
| | 15 Feb | Parsing II | M&S Chapter 12 | Lecture 11 [PDF] | Programming Assignment 3 (due 29 Feb) |
| Week 7 | 20 Feb | Frame Semantics/Semantic Roles | J&M Chapter 19 | Lecture 12 [PDF] | |
| | 22 Feb | Compositional Semantics | J&M Chapter 18, handout | [PDF] | You should show me a draft final project proposal by this point |
| Week 8 | 27 Feb | Class cancelled: Roger ill | | | |
| | 29 Feb | Discourse Processing | J&M Chapter 21 | Lecture 14 [PDF] | Final Project Guidelines; Programming Assignment 3 (either-or!) |
| Week 9 | 5 Mar | Computational Psycholinguistics | Hale 2001 | Lecture 15 [PDF] | |
| | 7 Mar | Unsupervised Learning I: Word segmentation, POS clustering | Goldwater et al. 2006, Clark 2000 | Lecture 16 [PDF] | |
| Week 10 | 12 Mar | Unsupervised Learning II: Syntactic acquisition | Klein & Manning 2002, 2004 | | |
| | 14 Mar | No class (Roger out of town); work on final projects! | | | |
| Finals | 20 Mar | Final Projects Due | | | |
Your grade will be based on the following criteria:
Collaboration is encouraged for homework assignments and final projects, but you must be explicit about who you collaborated with and what the division of labor was.
Computational linguistics/NLP is a very conference-oriented field; many of the classic articles in the literature never wind up getting published in journals. The top conferences include:
There are also some excellent workshops and conferences run regularly by "special interest groups", including
and others. Finally, excellent work in computational linguistics/NLP also appears in machine learning, artificial intelligence, and other conferences, notably the Conference on Neural Information Processing Systems (NIPS).
The flagship and leading journal of the field is Computational Linguistics. Other excellent journals in the field include: