Probabilistic Models in the Study of Language

I'm writing a textbook on the use of probabilistic models in scientific work on language, ranging from experimental data analysis to corpus work to cognitive modeling. The intended audience is graduate students in linguistics, psychology, cognitive science, and computer science who are interested in using probabilistic models to study language. Feedback (both comments on existing drafts and requests for additional material to include!) is more than welcome -- send it to

Note that if you access these chapters repeatedly, you may need to clear the cache of your web browser to ensure that you're getting the latest version.

A current (partial) draft of the complete text is available here.

Here are drafts of those individual chapters that are already available:

  1. Introduction
  2. Univariate Probability, with R code
  3. Multivariate Probability, with R code
  4. Parameter Estimation, with R code
  5. Confidence Intervals and Hypothesis Testing, with R code
  6. Generalized Linear Models, with R code
  7. Interlude chapter (contents TBD)
  8. Hierarchical Models (a.k.a. multi-level, mixed-effects models), with R code
  9. Latent-Variable Models (partial draft), with R code
  10. Nonparametric Models
  11. Probabilistic Grammars


  1. Appendix: Mathematical notation and review
  2. Appendix: More probability distributions
  3. Appendix: A brief introduction to directed graphical models
  4. A brief introduction to sampling techniques

Last modified: Wed Oct 3 12:03:18 PDT 2012