Human language processing relies on many kinds of linguistic knowledge and is sensitive to their frequencies, including lexical frequencies (Tyler, 1984; Salasoo & Pisoni, 1985; Marslen-Wilson, 1990; Zwitserlood, 1989; Simpson & Burgess, 1985), idiom frequencies (d'Arcais, 1993), phonological neighborhood frequencies (Luce, Pisoni, & Goldfinger, 1990), subcategorization frequencies (Trueswell, Tanenhaus, & Kello, 1993), and thematic role frequencies (Trueswell, Tanenhaus, & Garnsey, 1994; Garnsey, Pearlmutter, Myers, & Lotocky, 1997).
But while we know that each of these knowledge sources must be probabilistic, we know very little about exactly how these probabilistic knowledge sources are combined. This paper proposes the use of Bayesian decision trees in modeling the probabilistic, evidential nature of human sentence processing. Our method reifies conditional independence assertions implicit in sign-based linguistic theories and describes interactions among features without requiring additional assumptions about modularity. We show that our Bayesian approach successfully models psycholinguistic results on evidence combination in human lexical, idiomatic, and syntactic/semantic processing.