
Tuesday, October 1, 2013

Sereno, Brewer, and O'Donnell: "Context Effects in Word Recognition" (2003)

This paper is a mystery to me. It presents EEG evidence that ambiguous words are more difficult to process in a "biasing context," but it lumps together the figures for the sentences that primed the dominant meaning of the word and those for the sentences that primed the subordinate meaning.

I have no idea why anybody would ever want to do that, and it puzzles me even more since the two conditions seem to have been kept apart in the actual execution of the experiment.

Independent Variables

The materials used for the experiment are not reprinted in the paper except for the following twelve example sentences:

Word type      Context  Set  Example sentence
-------------  -------  ---  ------------------------------------
Ambiguous      Neutral  1    James peered over at the bank.
High-frequent  Neutral  1    She looked over the book.
Low-frequent   Neutral  1    To our surprise we saw a hawk.
Ambiguous      Neutral  2    They counted the number of feet.
High-frequent  Neutral  2    Sally knew about the drug.
Low-frequent   Neutral  2    They navigated through the cove.
Ambiguous      Biased   1    They measured in terms of feet.
High-frequent  Biased   1    The pharmacist distributed the drug.
Low-frequent   Biased   1    Pirates headed out to the cove.
Ambiguous      Biased   2    The mud was deep along the bank.
High-frequent  Biased   2    She read the new book.
Low-frequent   Biased   2    Flying to its nest was a hawk.

It would seem that the natural statistical tool for such a design would be a three-way analysis of variance with 12 = 3 × 2 × 2 data cells; but as mentioned above, the authors seem to just throw the "Set" variable out the window for no particular reason (even though it seems like the most important one). Instead, they perform a two-way analysis with 6 = 3 × 2 cells.
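
Just to make the contrast concrete, here is a minimal sketch of what the two analyses might look like in Python, assuming a hypothetical long-format data file with one amplitude measurement per trial (the file name and all column names here are my own invention, not the authors'):

```python
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Hypothetical long-format data: one row per trial, with the three
# independent variables and a single ERP amplitude per trial.
df = pd.read_csv("erp_trials.csv")  # columns: word_type, context, stim_set, amplitude

# The full 3 x 2 x 2 design, including all interactions.
full = ols("amplitude ~ C(word_type) * C(context) * C(stim_set)", data=df).fit()
print(anova_lm(full))

# The analysis the paper actually reports corresponds to dropping stim_set.
reduced = ols("amplitude ~ C(word_type) * C(context)", data=df).fit()
print(anova_lm(reduced))
```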

Dependent Variables

Even though the quantity the authors are ultimately interested in is the amount of electrical activity at the scalp, they actually consider two different dependent variables, both of them rather complicated. They come to the same conclusions in both cases.

The reason that they don't simply end up with a single number right away is that they used 129 electrodes and measured the electrical activity 256 times during each trial. This means that they start out with a huge amount of raw data, and they need some kind of dimensionality reduction to make sense of it.

To bring it down to digestible size, they thus performed a principal component analysis; this is a method for rotating the axes of the data space in such a way that all linear correlations between the dimensions disappear (as far as I understand). The problem of doing this turns out to be equivalent to finding the eigenvectors of the covariance matrix of the data; and once you have done that, the dominant eigenvector will tell you how to reduce the data set to a single dimension in the most informative way.
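
In numpy terms, that reduction might look roughly like the following; the data here are random stand-ins, since the actual recordings obviously aren't available:

```python
import numpy as np

# Stand-in data: 100 observations (e.g., time points) x 129 electrodes.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 129))

# Center each electrode and compute the covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

# The eigenvectors of the covariance matrix are the principal components;
# eigh returns the eigenvalues in ascending order, so the eigenvector
# with the largest eigenvalue (the direction of maximal variance) is last.
eigvals, eigvecs = np.linalg.eigh(cov)
dominant = eigvecs[:, -1]

# Projecting onto the dominant eigenvector reduces each observation
# to a single number while preserving as much variance as possible.
scores = Xc @ dominant
```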

They report that this principal component analysis was "spatial," and that they did it using a so-called quartimax rotation. I'm not quite sure what either of those entails exactly.

At any rate, the two dependent variables they consider are, if I'm not mistaken,
  1. the amount of variance that the first (most important) component accounted for, i.e., the amount of variance in the dimension of this eigenvector relative to the total variance of the data set (measured in percent);
  2. the mean value of the data when projected into this dimension (measured in microvolts).
How the former of these can be a negative number beats me (cf. their Fig. 1). But maybe I should just push some more big, red buttons and not worry so much.
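
Concretely, and again with made-up data, the two numbers could be computed from the eigendecomposition like this (whether this matches the authors' exact procedure, I can't say):

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(100, 129))          # made-up observations x electrodes

Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))

# 1. Percentage of total variance accounted for by the first component.
#    The eigenvalues are variances, so this is necessarily non-negative,
#    which is exactly why the negative values in their Fig. 1 are puzzling.
pct_var = 100 * eigvals[-1] / eigvals.sum()

# 2. Mean of the data projected onto the first component, in microvolts.
#    (Projecting the *uncentered* data, since projections of centered
#    data average to zero by construction.)
mean_uv = (X @ eigvecs[:, -1]).mean()
```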

Mud On the Bank

Here's how the authors sum up their results in the discussion section:
First, we found significant frequency effects in both neutral and biasing sentence contexts […] Second, although there was no context effect for HF [= high-frequent] words, LF [= low-frequent] words were (marginally) facilitated in a biasing context […] Finally, and critically, we examined context effects on ambiguous words. In a neutral context, ambiguous words behaved like HF words; in a biasing context, they behaved like LF words. A neutral context neither facilitated nor inhibited emergence of the dominant (HF) meaning, but a subordinate-biasing context selectively activated the subordinate (LF) meaning (the fate of the dominant meaning is less certain). We believe the pattern of results unambiguously establishes the existence of context effects very early on in the ERP record. (p. 331)
The fact that the ambiguous words behaved like the high-frequent unambiguous words should not be too surprising: the frequencies of the high-frequent words were calibrated so as to match the dominant meanings of the ambiguous words as closely as possible. So the key observation is that the ambiguous words "behaved like LF words" in the biased context.

This means that their ambiguous words required about as much mental effort to process, on average, as an unambiguous word with the same frequency as the least common meaning of the ambiguous word. But it seems to me that this piece of information is almost completely useless as long as these numbers are averaged over both the sentences that were biased towards the dominant meaning and the sentences that were biased towards the subordinate one.

Thursday, March 14, 2013

Sereno, O'Donnell, and Rayner: "Eye Movements and Lexical Ambiguity Resolution" (2006)

In the literature on word comprehension, some studies have found that people usually take quite a long time looking at an ambiguous word if it occurs in a context that strongly favors one of its less frequent meanings.

This paper raises the issue of whether this is mainly because of a clash between the high contextual fit and the low frequency, or mainly because of the low frequency itself.

The Needle-in-a-Haystack Effect

A context preceding a word can either be neutral or biased, and a meaning of an ambiguous word can either be dominant (more frequent) or subordinate (less frequent). When a biased context favors the subordinate meaning, it is called a subordinate-biasing context.

The subordinate-bias effect is the phenomenon that people spend more time looking at an ambiguous word in a subordinate-biasing context than they take looking at an unambiguous word in the same context — given that the two words have the same frequency.

For instance, the word port can mean either "harbor" or "sweet wine," but the former is much more frequent than the latter. In this case, the subordinate-bias effect is that people take longer to read the sentence
  • I decided to drink a glass of port
than the sentence
  • I decided to drink a glass of beer
This is true even though the words port and beer have almost equal frequencies (in the BNC, there are 3691 vs. 3179 occurrences of port vs. beer, respectively).

Balanced Meaning Frequencies = Balanced Reading Time

The question is whether these absolute word frequencies are the right thing to count, and Sereno, O'Donnell, and Rayner argue that they aren't. Instead, they suggest that it would be more fair to compare the sentence
  • I decided to drink a glass of port
to the sentence
  • I decided to drink a glass of rum
This is because port occurs in the meaning "sweet wine" approximately as often as the word rum occurs in absolute terms, i.e., much more rarely than beer. (A casual inspection of the frequencies of the phrases drink port/rum and a glass of port/rum seems to confirm the close match.)
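
For what it's worth, this kind of frequency check is easy to approximate with whatever corpus is at hand. Here is a sketch using NLTK's copy of the Brown corpus as a stand-in for the BNC (so the counts won't match the ones quoted above):

```python
from collections import Counter

import nltk
from nltk.corpus import brown  # stand-in corpus; the paper's counts are from the BNC

nltk.download("brown", quiet=True)

words = [w.lower() for w in brown.words()]
unigrams = Counter(words)
bigrams = Counter(nltk.bigrams(words))

# Absolute word frequencies, as in the port-vs-beer comparison.
for w in ("port", "beer", "rum"):
    print(w, unigrams[w])

# Phrase frequencies, as in the drink port/rum comparison.
for phrase in (("drink", "port"), ("drink", "rum")):
    print(" ".join(phrase), bigrams[phrase])
```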

What the Measurements Say

This means that you get three relevant conditions:
  1. one in which the target word is ambiguous, and in which its intended meaning is not the most frequent one;
  2. one in which the target word has the same absolute frequency as the ambiguous word;
  3. and one in which the target word has the same absolute frequency as the intended meaning of the ambiguous word.
Each of these is then associated with an average reading time:

[Table of average reading times for the three conditions, from the paper, not reproduced here.]
It's not like the effect is overwhelming, but here's what you see: The easiest thing to read is a high-frequent word with only a single meaning (middle row); the most difficult thing to read is a low-frequent word with only a single meaning (top row).

In between these two, in terms of reading time, you find the ambiguous word whose intended meaning was consistent with the context, but whose absolute frequency was higher than the frequency of that meaning.

Why are Ambiguous Words Easier?

In the conclusion of the paper, Sereno, O'Donnell, and Rayner speculate a bit about the possible causes of this "reverse subordinate-biasing effect," but they don't seem to find an explanation they are happy about (p. 345).

It seems to me that one would have to look closer at the sentences to find the correct answer. For instance, consider the following incomplete sentence:
  • She spent hours organizing the information on the computer into a _________
If you had to bet, how much money would you put on table, paper, and graph, respectively? If you would put more money on table than on graph, that probably also means that you were already anticipating seeing the word table in its "figure" meaning when your eyes reached the blank at the end of the sentence.

If people in general have such informed expectations, then that would explain why they are faster at retrieving the correct meaning of the anticipated word than they are at comprehending an unexpected word. But checking whether this is in fact the case would require a more careful information-theoretic study of the materials used in the experiment.
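
One crude way of making that precise would be to estimate the surprisal of each candidate completion given its context, say with a simple bigram model over a corpus; a serious study would of course need something much stronger than this sketch:

```python
import math
from collections import Counter

import nltk
from nltk.corpus import brown  # stand-in corpus for illustration

nltk.download("brown", quiet=True)

words = [w.lower() for w in brown.words()]
unigrams = Counter(words)
bigrams = Counter(nltk.bigrams(words))

def surprisal(prev, word):
    """Surprisal -log2 P(word | prev) under an unsmoothed bigram model."""
    if bigrams[(prev, word)] == 0:
        return float("inf")  # unseen bigram; a real model would smooth
    return -math.log2(bigrams[(prev, word)] / unigrams[prev])

# How predictable are the candidate completions after "a"?
# ("...organizing the information on the computer into a ___")
for candidate in ("table", "paper", "graph"):
    print(candidate, surprisal("a", candidate))
```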