Advanced Probabilistic Modeling in R (2015 LSA Summer Institute second-session course)

1 Course information

Lecture Dates July 20, 23, 27, and 30 (Mondays and Thursdays)
Lecture Times 10:30am-12:20pm
Lecture Location Harper 140
Office Hours Tuesday 22 July 3:15-4:45pm; Friday 24 July 10:30am-noon; Tuesday 28 July 3:15-4:45pm (subject to change)
Office Hours Location Plein Air Cafe
Class webpage

2 Instructor information

Instructor Roger Levy
Instructor Title Associate Professor, Department of Linguistics, University of California at San Diego

3 Course Description

Probabilistic models have thoroughly reshaped computational linguistics and continue to profoundly change other areas in the scientific study of language, ranging from psycholinguistics to syntax and phonology and even pragmatics and sociolinguistics. This change has included (a) qualitative improvements in our ability to analyze complex linguistic datasets and (b) new conceptualizations of language knowledge, acquisition, and use. For the most part, these changes have occurred in parallel, but the same theoretical toolkit underlies both advances. This course gives a concise introduction to this theoretical toolkit, covering the fundamentals of contemporary probabilistic models in the study of language. Examples from both data analysis and state-of-the-art probabilistic modeling of linguistic cognition are given, with key conceptual connections repeatedly drawn between the two. I also give pointers to publicly available software implementations, and students will see simple examples of use that will allow them to replicate case studies covered in class.

The course will for the most part be taught out of a textbook-in-progress I am writing, Probabilistic Models in the Study of Language. You can always access the latest version here.

4 Course organization

Each lecture of the 4-day course will involve a combination of slides and boardwork. I strongly encourage question-asking and discussion in my lectures; please raise your hand and I'll call on you.

I use a mailing list to communicate with class participants; please sign up for it.

5 Intended Audience

Researchers, postgraduate students, and highly motivated undergraduate students interested in probabilistic approaches to language. No prior exposure to probability theory or statistics is assumed, but we'll be using some high school calculus. For some parts of some lectures, participants will find basic familiarity with syntactic theory (e.g., context-free grammars) useful.

6 Syllabus (subject to modification)

I'd appreciate it if you filled out the beginning-of-class survey, so that I can learn more about the backgrounds of class participants.

| Day | Topic | Slides | Readings | Homework |
|-----+-------+--------+----------+----------|
| Mon 20 July | Essentials: Bayes nets, parameter estimation, hypothesis testing, confidence intervals. | Lecture 1 | PMSL Chapter 4; PMSL Chapter 5 | |
| Thu 23 July | Brief review of linear regression. Repeated-measures ANOVA. Introduction to mixed-effects models. | Lecture 2 (with builds) | PMSL Chapter 8 | Homework 1 (solutions) |
| Mon 27 July | Mixed-effects models practicum I. How to keep it maximal. | Lecture 3 (with builds) | Barr et al., 2013 | Homework 2 (solutions) |
| Thu 30 July | Mixed-effects models practicum II. R formula arcana. Beta-binomial regression. | | Levy, 2014; Morgan & Levy, 2015 | |

Author: Roger Levy

Created: 2015-08-04 Tue 09:18
