Rebecca S. Colavin

I was born in the UK but grew up near Paris. I have a European degree in nursing and have translated a number of texts from English to French on a variety of subjects, not all of which are particularly respectable. I obtained a BA in linguistics from SDSU and completed coursework for the MA in computational linguistics at the same institution under the direction of Rob Malouf and Marc Gawron. I was accepted into the doctoral program at UCSD for fall 2005.

Research Interests

Current work: It is well established that speakers prefer some invented words over others. My doctoral research investigates the relationship between the statistics of the lexicon and those speaker ratings. For example, we would predict, all else being equal, that because more words in the English lexicon begin with “k” than with “th”, English speakers would prefer invented words that begin with “k” over those that begin with “th”. I am particularly interested in how speakers rate words that contain sound sequences that do not occur in the lexicon (and for which they therefore have no direct evidence). I work primarily on Amharic, a Semitic language spoken in Ethiopia that presents a number of interesting challenges. My work encompasses both computational modeling, using the excellent Maximum Entropy Phonotactic Learner (Hayes and Wilson 2008), and collecting Amharic speaker judgment data (in collaboration with Sharon Rose).
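As a rough illustration of the “k” versus “th” prediction, here is a minimal Python sketch that counts how many words in a lexicon begin with each onset and returns the one with more attested words. The word list, the function names and the use of spelling as a stand-in for sounds are all assumptions made for the example. It uses simple type counts and is not the Hayes and Wilson learner.

```python
from collections import Counter

# Hypothetical toy lexicon, for illustration only (not real frequency data).
TOY_LEXICON = ["kit", "keep", "kind", "king", "thin", "thumb", "tap"]


def initial_counts(lexicon, onsets):
    """Count how many words begin with each onset.

    Longer onsets are tried first, so a word like "thin" would be counted
    under "th" rather than "t" if both were listed.
    """
    ordered = sorted(onsets, key=len, reverse=True)
    counts = Counter({onset: 0 for onset in onsets})
    for word in lexicon:
        for onset in ordered:
            if word.startswith(onset):
                counts[onset] += 1
                break  # each word is counted under at most one onset
    return counts


def predicted_preference(lexicon, a, b):
    """Return the onset with more attested words, or None on a tie."""
    counts = initial_counts(lexicon, [a, b])
    if counts[a] == counts[b]:
        return None
    return a if counts[a] > counts[b] else b


if __name__ == "__main__":
    print(initial_counts(TOY_LEXICON, ["k", "th"]))
    print(predicted_preference(TOY_LEXICON, "k", "th"))
```

A real study would work from a phonemically transcribed lexicon, and counts like these say nothing about sequences that never occur, which is exactly the case I find most interesting.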

I am a member of a number of research interest groups: SaDPhiG, the phonology group (directed by Sharon Rose and Eric Bakovic); the psycho-linguistics lab (director Roger Levy); the Language Evolution Complex Systems group (led by Alex del Giudice); and the computational linguistics and discourse group (director Andy Kehler).

Previous work: My first comps paper (under the direction of Amalia Arvaniti) was an investigation of the production of epenthetic stops in sonorant/fricative clusters (that's when the word “prince” ends up sounding like “prints”). Subjects (Southern Californians) produced sentences in three different styles: casual speech, list style and forced contrast. We measured the length of the stop in two different ways: absolute length and the stop/word ratio. Our results showed that speakers reliably produced longer stops in words with an underlying stop (prints) than in words with an epenthetic stop (prince). Here is the abstract (pdf) and you can read the complete paper here (pdf). Be kind, it was my first paper!

Back burner: There are also a bunch of “I'll get to it sometime” projects. I have an abiding interest in statistical classifiers (in particular maxent models) and supervised learning. I would like to extend that experience by doing more unsupervised learning. I also have some experience with data mining and I would welcome the opportunity to do more. More generally, I am interested in proto-roles, French back-slang, and object deletion in how-to genres. I may get around to doing something about these in the next century.

Presentations

Teaching

I have extensive teaching and TA experience. At SDSU, I taught Introduction to Linguistics. At UCSD, I have taught French in the LLP program and I have TAed undergraduate syntax, phonology, phonetics, morphology, Introduction to Linguistics, Sociolinguistics and Languages of the Americas. Umm, and maybe a couple that I have forgotten.