Showing posts with label disambiguation. Show all posts

Tuesday, October 1, 2013

Sereno, Brewer, and O'Donnell: "Context Effects in Word Recognition" (2003)

This paper is a mystery to me. It presents EEG evidence that ambiguous words are more difficult to process in a "biasing context," but it lumps together the figures for the sentences that primed the dominant meaning of the word and those for the sentences that primed the subordinate meaning.

I have no idea why anybody would ever want to do that, and it puzzles me even more since the two conditions seem to have been kept apart in the actual execution of the experiment.

Independent Variables

The materials used for the experiment are not reprinted in the paper except for the following twelve example sentences:

Word type      Context  Set  Example sentence
Ambiguous      Neutral  1    James peered over at the bank.
High-frequent  Neutral  1    She looked over the book.
Low-frequent   Neutral  1    To our surprise we saw a hawk.
Ambiguous      Neutral  2    They counted the number of feet.
High-frequent  Neutral  2    Sally knew about the drug.
Low-frequent   Neutral  2    They navigated through the cove.
Ambiguous      Biased   1    They measured in terms of feet.
High-frequent  Biased   1    The pharmacist distributed the drug.
Low-frequent   Biased   1    Pirates headed out to the cove.
Ambiguous      Biased   2    The mud was deep along the bank.
High-frequent  Biased   2    She read the new book.
Low-frequent   Biased   2    Flying to its nest was a hawk.

It would seem that the natural statistical tool for such a design would be a three-dimensional analysis of variance with 12 = 3 x 2 x 2 data cells; but as mentioned above, the authors seem to just throw the "Set" variable out the window for no particular reason (even though it seems like the most important one). Instead, they perform a two-dimensional analysis with 6 = 3 x 2 cells.
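For what it's worth, the two designs differ only in which cells the observations are sorted into; a quick stdlib sketch of the bookkeeping, with factor names taken from the table above:

```python
from itertools import product

# Factor levels taken from the table of materials above.
word_types = ["ambiguous", "high-frequent", "low-frequent"]
contexts = ["neutral", "biased"]
sets = [1, 2]

# The full 3 x 2 x 2 design that the materials support: 12 cells.
full_design = list(product(word_types, contexts, sets))

# The 3 x 2 design the authors actually analyze, with "Set" dropped.
collapsed_design = list(product(word_types, contexts))
```

Collapsing over "Set" means every statistic is averaged over two materials sets that, by construction, bias towards different meanings.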

Dependent Variables

Even though the quantity the authors are actually interested in is the amount of electrical activity at the scalp, they consider two different dependent variables, both of them rather complicated. They come to the same conclusions in both cases.

The reason that they don't simply end up with a single number right away is that they used 129 electrodes and measured the electrical activity 256 times during each trial. This means that they start out with a huge amount of raw data, and they need some kind of dimensionality reduction to make sense of it.

To bring it down to a manageable size, they thus performed a principal component analysis; this is a method for bending and stretching the axes of the data space in such a way that all linear correlations between the dimensions disappear (as far as I understand). The problem of doing this turns out to be equivalent to finding the eigenvectors of a certain matrix; and once you have done it, the dominant eigenvector will tell you how to reduce the data set to a single dimension in the most informative way.

They report that this principal component analysis was "spatial," and that they did it using a so-called quartimax rotation. I'm not quite sure what either of those entails exactly.

At any rate, the two dependent variables they consider are, if I'm not mistaken,
  1. the amount of variance that the first (most important) component accounted for, i.e., the amount of variance in the dimension of this eigenvector relative to the total variance of the data set (measured in percent);
  2. the mean value of the data when projected into this dimension (measured in microvolts).
How the former of these can be a negative number beats me (cf. their Fig. 1). But maybe I should just push some more big, red buttons and not worry so much.
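As far as the bare mechanics go, the reduction can be sketched in a few lines of numpy. This is plain PCA by eigendecomposition, without the "spatial" and quartimax refinements the paper mentions, and the data below is random, standing in for the real recordings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-in for the EEG data: 100 trials x 5 channels with
# strongly correlated channels (the real data: 129 electrodes and
# 256 samples per trial).
latent = rng.normal(size=(100, 1))
X = latent @ rng.normal(size=(1, 5)) + 0.1 * rng.normal(size=(100, 5))

# PCA by eigendecomposition of the channel covariance matrix.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Rough analogues of the two dependent variables:
pct_variance = 100 * eigvals[0] / eigvals.sum()  # % variance of PC1
scores = Xc @ eigvecs[:, 0]                      # trial values on PC1
```

Computed this way, the percentage necessarily falls between 0 and 100, which only deepens the puzzle about the negative values in their Fig. 1.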

Mud On the Bank

Here's how the authors sum up their results in the discussion section:
First, we found significant frequency effects in both neutral and biasing sentence contexts […] Second, although there was no context effect for HF [= high-frequent] words, LF [= low-frequent] words were (marginally) facilitated in a biasing context […] Finally, and critically, we examined context effects on ambiguous words. In a neutral context, ambiguous words behaved like HF words; in a biasing context, they behaved like LF words. A neutral context neither facilitated nor inhibited emergence of the dominant (HF) meaning, but a subordinate-biasing context selectively activated the subordinate (LF) meaning (the fate of the dominant meaning is less certain). We believe the pattern of results unambiguously establishes the existence of context effects very early on in the ERP record. (p. 331)
The fact that the ambiguous words behaved like the high-frequent unambiguous words should not be too surprising: the frequency of the high-frequent words was calibrated to match the dominant meaning of the ambiguous words as exactly as possible. So the key observation is that the ambiguous words "behaved like LF words" in the biased context.

This means that their ambiguous words required about as much mental effort to process, on average, as an unambiguous word with the same frequency as the least common meaning of the ambiguous word. But it seems to me that this piece of information is almost completely useless as long as these numbers are based on an average over both the sentences that were biased towards the dominant meaning and those that were biased towards the subordinate one.

Tuesday, September 17, 2013

Gaskell and Marslen-Wilson: "Lexical Ambiguity Resolution and Spoken Word Recognition" (2001)

If you pronounce the phrase
  • worn building
and you do this relatively quickly, there is a good chance that it comes out as something close to
  • ['wɔɹm'bɪldɪŋ]
instead of
  • ['wɔɹn'bɪldɪŋ].
This phenomenon is known as consonant assimilation. In this particular case, it happens because the peripheral consonant /b/ at the beginning of building makes it difficult to pronounce a coronal consonant such as the /n/ in worn, compared to another peripheral consonant such as the /m/ in warm. Your lips simply have to move less to say things like em-beh than to say things like en-beh.

The lip position used for an [m] is close to that used for a [b]
(picture from an online book by Michael Gasser)

In principle, this means that the sound string ['wɔɹm'bɪldɪŋ] involves an ambiguity for the hearer: Was the intended original message worn building or warm building? Both are plausible given the observed signal because they are both occasionally pronounced in the same way.

Assimilation effects can thus — in certain, relatively rare, cases — add another decoding problem to the already quite substantial ambiguity of words like warm.

Noisy Channel Hearing

This raises a question about the psycholinguistics of hearing: What would happen if we plugged a sound ambiguity like this into an experimental paradigm designed to track the process of word sense selection? Would people perhaps show traces of an active inference from sound to word, and competition between various hypotheses?

This is the question investigated by this paper by Gareth Gaskell and William Marslen-Wilson. Their idea is to play back an ambiguous sound string to their subjects, and then check if they are faster at recognizing a word on a computer screen if that word could have been the source of the sound string before consonant assimilation. For instance:
  • Voice: The ceremony was held in June and the sunny weather added to the air of celebration. An article about the bribe made the [Screen: bride] local paper.
  • Voice: The conditions in the outback were difficult for driving. In the intense heat, the mug cracked up [Screen: mud] completely.
  • Voice: We were impressed by her stylish delivery and intonation. Jane finished off the seam beautifully. [Screen: scene]
These test sentences are then compared to another condition in which the phonetics of the sentences do not warrant any backwards inference to a different sound form:
  • Voice: The ceremony was held in June and the sunny weather added to the air of celebration. An article about the bribe turned up [Screen: bride] in the local paper.
  • Voice: The conditions in the outback were difficult for driving. In the intense heat, the mug turned to [Screen: mud] dust.
  • Voice: We were impressed by her stylish delivery and intonation. Jane finished off the seam deftly. [Screen: scene]
These sentences cannot have come about by assimilation effects, so there is no basis for a reconstructive inference. For instance, bribe turned is not easier to pronounce than bride turned, so there is no reason to hypothesize that bribe is a distorted form of bride.

In the Face of Overwhelming Evidence

The main result of the whole paper is that there is indeed a significant difference between the cases where the phonological context supports an inference (e.g., mug cracked) and the cases where it doesn't (e.g., mug turned). This is, however, only the case if the discursive context also strongly suggests the same reconstructive inference (Experiment 3).

It's also worth noting that the effects are tiny. On average, subjects took 522 milliseconds to recognize the phonologically warranted form (e.g., mud from mug cracked) and 537 milliseconds to recognize the phonologically unwarranted (e.g. mud from mug turned). This is a difference of 15 milliseconds, or a drop of 2.8% in decision time. It's statistically significant, but it's not big.
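For the record, the arithmetic behind that percentage:

```python
# Mean lexical decision times reported in the paper, in milliseconds.
warranted = 522     # recognizing "mud" after hearing "mug cracked"
unwarranted = 537   # recognizing "mud" after hearing "mug turned"

difference = unwarranted - warranted      # 15 ms
relative_drop = difference / unwarranted  # about 2.8 per cent
```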

They also found that if you remove the discursive bias from the materials, this effect disappears (Experiments 1 and 2). There is, for instance, no priming effect in the following sentence:
  • Voice: An article about the bribe turned up [Screen: bride] in the local paper.
It is thus only when both discursive and phonological context supports the inference that it leaves a measurable trace.

It is conceivable that there is an activation effect in the other case as well, but that it simply is so minuscule that we can't see it. But at any rate, this finding makes sense if we think about the inference as a kind of naive Bayes collection of evidence in favor of a hypothesis.
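To make that picture concrete, here is a toy sketch in which each contextual cue contributes an independent likelihood ratio that is multiplied onto the prior odds of the reconstructive hypothesis. All the numbers are invented purely for illustration:

```python
def posterior(prior, likelihood_ratios):
    """Naive-Bayes evidence combination: multiply the independent
    likelihood ratios onto the prior odds of the hypothesis."""
    odds = prior / (1 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)

# Hypothetical values for the hypothesis "the speaker meant 'mud'":
prior = 0.1  # prior probability of the reconstruction
phon = 3.0   # phonological context licenses the assimilation
disc = 3.0   # discourse context favors 'mud'

p_both = posterior(prior, [phon, disc])  # both cues present
p_phon = posterior(prior, [phon])        # phonological cue only
```

Only when both cues are in place does the posterior climb far enough above the prior, which is at least qualitatively the pattern in the data.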

Friday, September 13, 2013

Klepousniotou: "The Processing of Lexical Ambiguity" (2001)

Following up on a related experiment by Lyn Frazier and Keith Rayner, this paper by Ekaterini Klepousniotou presents some evidence that not all types of ambiguity are equally hard to process.

Specifically, she reports a significant difference between, on one hand, the mass/count ambiguity of foodstuff words like potato, and, on the other hand, "deeper" ambiguities like that of fan ("enthusiast" vs. "air-blower").

Filed Down

These differences are found using a priming paradigm in which the subject must make a decision as to whether a string (e.g., prock) is a real English word or not. This task is posed immediately after the subject has read a sentence which primes one specific meaning of the term.

In the section of the experiment concerned with homonyms, for instance, you might get one of the following prime–target pairs:
  • The carpenter smoothed the wood. — file
  • I have the papers in my office. — file
These two sentences differ as to whether they prime the less frequent or the more frequent meaning of the word file (i.e., "hand tool" or "dossier").

I am afraid Klepousniotou's frequency calibration of the experiment was done on the basis of data from 1967. This may have had a quite important effect on the outcome of the experiment.

Somewhat surprisingly, Klepousniotou reports that there was no significant difference between the priming of more frequent meanings and less frequent meanings (p. 213). This contradicts data from other experiments which were specifically designed to show that subordinate meanings are more difficult to access.

A Typology of Lemons

She does, however, find an on-average difference between ambiguities of the type above and ambiguities based around foodstuff metonymies (the result of what Langacker calls "grinding").

The following pair of prime–target pairs would thus both be quicker to process than the file examples above:
  • I baked one for lunch. — potato
  • I ate some mashed with gravy. — potato
And sure enough, the phrases a potato and some potato hardly seem to invoke different senses of the word potato. It would seem a bit weird to insist that the bare word, in the absence of any article, necessarily would have to be forced into the mold of a "mass noun" or a "count noun."

Mean reaction times for two ambiguity types and two target frequencies (cf. p. 213).

But anyway, now we have some empirical evidence that confirms this hypothesis: Subordinate meanings of strongly ambiguous words (e.g., spring = "source," punch = "drink," racket = "noise," etc.) are more difficult to settle on than mass readings of food words (e.g., some olive, some lemon, some rabbit, some cabbage, some apple, etc.).

Wednesday, September 11, 2013

Gunter, Wagner, and Friederici: "Working Memory and Lexical Ambiguity Resolution as Revealed by ERPs" (2003)

When a word is unexpected in its context, it often elicits a measurable N400 response. This is the case even if the unexpected word is not impossible in the context, only surprising. An example of such a sentence might possibly be
  • The plants flourished in the spring water.
The final word of this sentence forces a switch away from the dominant reading of spring ("season") and over to the subordinate one ("water source"). But such shifts between hypotheses are difficult, and it would be reasonable to expect individual differences in how good people are at making such retrospective reinterpretations.

Somewhat surprisingly, this paper by Thomas C. Gunter, Susanne Wagner, and Angela Friederici presents some evidence that seems to suggest that people who score high on working memory tests tend to be worse at this kind of switching than people who score low.

Half Full or Half Empty?

The authors interpret this finding as evidence that the process of disambiguation is a matter of inhibition as opposed to activation.

But this seems rather nonsensical; there is no real-world difference between having too many feet and having too few shoes. Activating meaning A could just as well be described as inhibiting meaning B, unless some more precise anatomical theory is intended, and that doesn't seem to be the case in this paper.

But it seems fair enough to say that the test subjects who find the switching difficult must in some sense have been more "successful" in selecting a single hypothesis and getting rid of all the others. As far as I can see, the data in the paper tells us nothing about what the process behind this polarization is at the neurological level, only that it correlates positively with large working memory spans.

Materials

The materials that Gunter, Wagner, and Friederici used consisted of sentences in which the evidence first favored one meaning of an ambiguous word (the dominant or the subordinate) and then was disambiguated towards one of the two, giving four kinds of meaning oscillation in all.
The most difficult switch is the one where the evidence first favors the dominant (more frequent) meaning, and then afterwards the subordinate (less frequent). Those are the cases that provoke larger N400 responses in people with long working memory spans.

A graph showing the averaged ERPs of the low-memory group (top) and the high-memory group (bottom) immediately after reading the disambiguating final verb (from experiment 1, p. 647).


This is not a matter of timing, the authors argue. Even when they added several intervening words, the effect persisted (experiments 2 and 3). It is thus not that the subjects with short working memory spans were slower at making a decision, but rather that they never made up their mind quite as forcefully and irreversibly as the subjects with long memory spans.

The authors report having used 88 different items in the experiment (p. 653), but they only quote two: The clay/tone example above, and the sentence
  • Der Ball wurde vom Spieler/Tänzer geworfen/eröffnet ("The ball was thrown by the player" / "The ball was opened by the dancer" — Ball meaning either a toy or a dance)
which they use to explain their experimental paradigm throughout the paper.

Wednesday, July 24, 2013

Swinney: "Lexical Access during Sentence Comprehension" (1979)

In 1975, David Swinney and David Hakes published a study which provided some evidence that irrelevant meanings of ambiguous words are never retrieved from memory if the context is strongly biased against them.

This prompted a reinterpretation of previous results showing the opposite, pointing to the strength of the context as the relevant independent variable.

The "On Hold" Paradigm

The paradigm they used in the 1975 paper was a phoneme recognition task. They played tape recordings of sentences to their subjects, asking them to push a button as soon as they heard a word beginning with a specific sound (say, /k/ as in cat).

This is more difficult and takes more time if the target comes immediately after an ambiguous word:
  • … he found several bugs in the corner of his room. (ambiguous)
  • … he found several insects in the corner of his room. (unambiguous)
However, the crucial manipulation Swinney and Hakes performed was to see whether this effect still held up when the context was strongly biased towards one of the two meanings of the word:
  • … he found several spiders, roaches, and other bugs in the corner of his room.
In the 1975 experiment, they found that it did not: Including a strongly disambiguating context effectively brought the reaction time down to the level of unambiguous words. This supported the hypothesis that meaning on the sentence level could affect lexical access.

The Multitasking Paradigm

In the 1979 paper, however, Swinney argued that this effect might only have occurred because of the relatively large time lag between prime and target (p. 647). He was thus interested in devising an experimental paradigm that could manipulate the width of this gap more directly.

The solution to this problem is a cross-modal priming design: The subject listens to the sentence being read aloud, but simultaneously has to solve a lexical decision task on a screen. This way, the target and prime can be timed relative to each other in any way you like.

So for example, you might be exposed to the following stimulus:
  • Voice: … he found several spiders, roaches, and other bugs [Screen: SPY] in the …
Your task is then to decide, as quickly as possible, whether the target word is an actual English word or not. After the experiment, you are also quizzed on the sentence in the headphone to make sure that you were listening (and not just focusing on the visual task).

The Vanishing Priming Effect

The results of the experiment can roughly be summarized as follows:
Delay            Appropriate   Inappropriate    Unrelated
None             Facilitation  Facilitation     No facilitation
Three syllables  Facilitation  No facilitation  No facilitation
An "appropriate" target is here one that fits the meaning of the word as used in the sentence (e.g., roaches and bugs – ANT). An "inappropriate" one is a word that fits a different sense of the word (e.g., roaches and bugs – SPY). An "unrelated" target is a real English word that doesn't have any specific relation to the prime (e.g., SEW).

The thing to notice about this table is the top middle cell: When there is no delay, even contextually inappropriate meanings are primed; however, after less than half a second, this effect has decayed to an insignificant level.

I remember reading in other texts that the exact time frame in which the priming effect is present is about 200 milliseconds. I don't remember where I picked up that number, though.

Tuesday, July 23, 2013

Hogaboam and Perfetti: "Lexical Ambiguity and Sentence Comprehension" (1975)

Along with the 1979 paper by David Swinney, this paper by Thomas Hogaboam and Charles Perfetti seems to be one of the most cited early papers on the psychology of ambiguity resolution. It is notable for pointing out the prominent function of word sense frequency.

The Language Machine

In a distinctly 1950s cognitive psychology style, the paper contrasts a number of hypotheses about ambiguity resolution, each formulated as a little block of virtual computer code:
[According to the prior decision model] the processes that provide access to lexical items may operate in such a way as to provide access only to the contextually correct meaning. (p. 265)
In [the exhaustive computation model] both meanings of an ambiguous word are accessed and further processed to determine which meaning is appropriate to the context. (p. 265)
[The one meaning hypothesis] holds that one meaning is accessed and checked against the context. If a match occurs the other meanings are not accessed, but if a match does not occur the process is repeated until a match is found. (p. 265)
When the "one meaning hypothesis" makes the additional assumption that the meanings are looked up in decreasing order of frequency (rather than randomly), it is called the "ordered search hypothesis":
In the ordered search model when an ambiguous word occurs in a sentence, an ordered lexical search takes place. The order of the search is determined by frequency of usage of the lexical entries, the most frequent being first. The search is self-terminating, so that as soon as an acceptable match occurs, no other lower entries will be checked, and all higher entries will have already been checked. (p. 266)
Since the prior decision model and the random-search version of the one meaning hypothesis are quickly dispatched, this leaves only the exhaustive computation hypothesis and the ordered search hypothesis in the field.
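The ordered search hypothesis is concrete enough to be written down directly as code. Here is a minimal sketch, with invented frequency counts for the two senses of arms (an example word from the experiment discussed below):

```python
def ordered_search(senses, fits_context):
    """Self-terminating search: try the senses in decreasing order of
    frequency and stop as soon as one matches the context."""
    tried = []
    for sense, freq in sorted(senses, key=lambda s: s[1], reverse=True):
        tried.append(sense)
        if fits_context(sense):
            return sense, tried
    return None, tried

# Hypothetical frequencies for the two senses of "arms":
senses = [("limbs", 800), ("weapons", 70)]

# Context: "The gun collector displayed the arms."
chosen, tried = ordered_search(senses, lambda s: s == "weapons")
# chosen == "weapons", but only after "limbs" was tried and rejected
```

The self-terminating loop is the whole content of the hypothesis: dominant senses are found after one step, while subordinate senses are reached only after the more frequent ones have been checked and rejected.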

The Reduced Field

Hogaboam and Perfetti do, however, consider the option that exhaustive computation may occur differentiated in time. This leads in practice to the following set of competing hypotheses (p. 272):


The two leftmost drawings represent an exhaustive and an ordered search, respectively. The rightmost panel represents a modified version of the exhaustive search:
According to this model a search initiated at a token node activates both senses, but the primary sense becomes activated prior to the secondary sense. […] That is, both senses could be processed in parallel, but the primary sense may take less time to process and would thus be available for other processes sooner than the secondary sense. (p. 272)
So while parallel processing and time differences are in principle conceivable in the algorithmic world of Hogaboam and Perfetti, differences in activation are not: You either move stuff from the disk drive to the working memory, or you don't.

Serial or Skewed Parallel?

Interestingly, they do raise the concern that all of this speculation about mental algorithms may multiply entities beyond necessity. Specifically, there is a risk that the models only differ as to whether they hypothesize a computation in the working memory or in the long-term storage:
That is, under the conceptualization represented by the panel on the right, an ordering effect would be expected at a working memory level. This possibility indicates that contrasting the various models of the disambiguation process in effect may be setting up straw men. (p. 272)
However, their experiment (discussed below) does show observable differences between words, and these have to be ascribed some cause or other. From the perspective of 1975 cognitive science, the safest bet seems to be word senses queuing up for serial processing:
For the present it is most parsimonious to propose, as a hypothesis, that the effect is a true order-of-processing effect and not an artifact of parallel processing. (p. 272)
So while they do briefly flirt with the idea of differentiated activation in a parallel network, they quickly return home to the comfortable world of Turing machines plodding from discrete state to discrete state.

The Big Deal

The experiment itself is set up as follows: A tape recording of a sentence is played to you, and you then have to decide whether its last word is ambiguous or not. If it is ambiguous, you must come up with an example of a different sense the word can be used in.

Here are some example sentences that can illustrate the task:
  1. The antique typewriter was missing a letter.
  2. The anti-pollution campaign created interest.
  3. The gun collector displayed the arms.
  4. The tired hiker rested his feet.
If you're like the average test subject, you should find this task more difficult for sentences 2 and 4, but easier for sentences 1 and 3.

This is because the words in the difficult sentences are used in their dominant meaning; you thus have to retrieve a relatively rare word sense in order to come up with a response. For the easy sentences, however, the word is used in a less frequent sense, and you can cite the frequent and readily available word sense as a response.

As indicated above, there are a number of ways that one can interpret this result. However, it should be clear that at the very least, it demonstrates that frequency plays a key role in word comprehension.

Sereno, Pacht, and Rayner: "The Effect of Meaning Frequency on Processing Lexically Ambiguous Words" (1992)

This experiment measured how long people took to read target words of the following kind:
  • The dinner party was proceeding smoothly when, just as Mary was serving the port, one of the guests had a heart attack. (Ambiguous word used in low-frequent sense)
  • The dinner party was proceeding smoothly when, just as Mary was serving the soup, one of the guests had a heart attack. (Unambiguous high-frequent word)
  • The dinner party was proceeding smoothly when, just as Mary was serving the veal, one of the guests had a heart attack. (Unambiguous low-frequent word)
The idea here is that the two control words (soup and veal) are matched in frequency to the senses of the ambiguous word ("harbor" and "wine"). The word soup thus has approximately the same frequency as the "harbor" meaning of port, and the word veal approximately the same as the "wine" meaning.

Skipping to the conclusion, it then turns out that the ambiguous word (here, port) takes more time to process than either of the unambiguous words. That seems fairly natural, since a reader would not only have to remember the (infrequent) meaning of the word, but also subsequently resolve an ambiguity issue. This may take some time.

Four Stories About Comprehension

One can imagine the psychological process of reading and understanding an ambiguous snippet of text in several different ways. Sereno, Pacht, and Rayner cite four models in particular:
  1. One model claims that "only the contextually appropriate meaning is activated in the lexicon," that is, context dictatorially determines access (p. 296).
  2. Another claims that "all meanings are accessed automatically," that is, context has no influence at all (p. 296).
  3. A third model liberally accepts that "access of the alternative meanings is influenced by the frequency of each meaning and also by the context" (p. 296).
  4. Lastly, a fourth model claims that "the language processing mechanism automatically attempts to access all meanings of an ambiguous word in order of their frequency. […] Incomplete access procedures are terminated when one or more meanings that have been accessed are successfully integrated with prior context." (p. 296–97)
I have deliberately stripped the names off these hypotheses to avoid the implication that there is some precise, complex, and quantitative machinery hidden behind them. There are, in fact, just these verbal descriptions, and that's it.

The hypotheses are neither exhaustive nor exclusive. There are some textual clues that the authors regard models 3 and 4 as subspecies of model 2, but also some indications of the opposite.

The Stories and The Evidence

However, even at this level of resolution, the first hypothesis does not hold up to the evidence: If we were capable of discarding irrelevant word senses — instantly, before we had even looked them up — then the first model would predict equal reading times for the ambiguous and the low-frequent words. This is not the case.

The remaining models may or may not do better, depending on more specific assumptions.

The authors argue that the third model is consistent with their findings, since the longer fixations on ambiguous words can be explained by "competition between the dominant meaning and the subordinate" (p. 299). However, since we are now in the business of invoking auxiliary hypotheses, I don't see why this new concept of competition could not have been invoked to defend the first theory, too.

The fourth model also passes the test on the grounds that "the dominant sense is accessed first but is not successfully integrated because context supports the subordinate sense" (p. 299). This failed attempt at integration of word sense and context could then plausibly explain why the whole process takes longer than a simple look-up of the correct word sense.

Conclusions?

As I see it, the main lesson we can draw from this experiment is that ambiguity is costly. We can rephrase this message in terms of various informal "models" of the disambiguation process, but that doesn't really add much, to my mind. Only the models that were grotesque caricatures anyway can be excluded with any confidence.

But perhaps the first of the models cited above — the authors identify it as the "selective access model" but do not pin it to anybody in particular — had some problems to begin with. Specifically, how exactly would a person be able to discard the meaning of a word before retrieving it from memory? Without recall, there can't be any conflict, and hence no discarding.

I thus think that arguing against the selective access model is a bit of a windmill fight. On the most rigid reading of the slogan "only the contextually appropriate meaning is activated in the lexicon," the brain would literally need to be capable of time travel; on a more charitable reading, the theory is not necessarily inconsistent with the data at hand.

Rodd, Davis, and Johnsrude: "The Neural Mechanisms of Speech Comprehension" (2005)

This paper reports on two fMRI experiments which contrast ambiguous speech to unambiguous speech, and unambiguous speech to speech-like noise.

Blob One and Blob Two

By comparing pictures taken in each of these conditions and extracting significant differences, this design gives an indication of where in the head ambiguity is sorted out. The conclusion is that two areas in particular seemed to be disproportionately active when the experimental subjects listened to ambiguous speech:
The results of two fMRI experiments show that when volunteers listen to sentences that contain semantically ambiguous words, activity increases in both temporal and frontal brain regions. This confirms the involvement of these regions in the semantic aspects of sentence comprehension (i.e. activating, selecting or integrating word meanings). (p. 1266)
Roughly speaking, the areas in question were the bit of the brain behind the ears (on both sides of the head), and the bit behind the eyebrow (on the left side only).


Anatomical drawing of a brain from a 1918 textbook.
The parts of the brain discussed in the text are roughly located behind
the lower part of the temple and the lower left side of the forehead.


Decoding Efforts vs. Selection Efforts

The study did not include any distinctions more fine-grained than ambiguous/unambiguous. In particular, it did not contrast skewed and balanced ambiguity; this is significant, since reading a word used in one of its less frequent meanings involves an inhibition of the dominant meaning which may require cognitive effort.

Consider for instance the following example sentence from the paper:
  • the cymbals/symbols were making a racket/racquet
There are 67 results for cymbal(s) in the BNC; there are about 3000 for symbol(s). If I read this sentence aloud for you and took a picture of your brain while you listened, I would see a lot of activity in some regions; but this might be best interpreted as a trace of the force you exert in order to suppress the dominant but irrelevant meaning of the sound /ˈsɪmbəɫ/.

This process of suppressing a loud noise may or may not be different from choosing between two competing alternatives, but we can't say on the basis of this experiment.

Wednesday, July 17, 2013

Martin et al.: "Strength of Discourse Context as a Determinant of the Subordinate Bias Effect" (1999)

As an argument in favor of selective-access over multiple-access views of ambiguity resolution, this paper provides evidence that the dominant meaning of a word may be eliminated from consideration if prior discourse is biased strongly enough.

In this way, Martin, Vu, Kellas, and Metcalf add yet another chapter to their already somewhat protracted and repetitive discussion with Binder and Rayner.

Tasks and Materials

The authors used two different methods to make their point: Self-paced reading and a naming task.

The naming task consists of reading a word aloud as fast as possible after having read a sentence. It thus allows one to measure priming effects from the sentence to the probe word.

The materials consisted of four different types of sentences, categorized on the basis of human offline judgments of bias. The categories, with examples, are:
  1. Strongly favors the dominant meaning of a term:
    The navigator dropped the compass. He searched the deck beneath his life boat.
  2. Strongly favors the subordinate meaning of a term:
    The gambler wanted an ace. He searched the deck for the marked cards.
  3. Weakly favors the dominant meaning of a term:
    The mother was in a hurry. She jammed the key while opening the door.
  4. Weakly favors the subordinate meaning of a term:
    The author was clumsy. She jammed the key while finishing the document.
Remember that the judges only took context preceding the target word into account for categorization. Disambiguating cues appearing later in the sentence were ignored.

As the examples indicate, the authors used two sentences per word, both in the same strength category. I don't know why they didn't include four different sentences for each word; that would seem to be a safer methodological bet.

Results

As indicated above, the result of the self-paced reading experiment was that the subjects spent the most time looking at words used in their subordinate meaning when these occurred in weakly biased sentences; in all other conditions, they were faster.

In absolute terms, the differences are not large. In the "easy" conditions, the subjects looked at the target words for about 345 ms on average. In the "hard" conditions, they looked at them for about 370 ms.

The difference is thus on the order of 25 ms or 7% additional looking time. Given the large number of subjects and trials, these effects are significant, but we're not talking about days and weeks here.

Some Quotes

Martin et al. present their own "context-sensitive model" as follows:
According to the context-sensitive model of ambiguity resolution (cf. Kellas, Paul, Martin, & Simpson, 1991; Paul et al., 1992; Simpson, 1994; Vu et al., 1998a, b) either meaning frequency or biasing context can dominate the resolution process dependent upon a third critical variable of contextual strength (i.e. the degree of constraint that context places on an ambiguous word). The bias of a context towards an ambiguous word can vary continuously, from weakly through strongly biased, as a function of the strength of constraints (e.g. syntax, semantics, pragmatics) that converge on the ambiguity. On the weak end of the continuum, word frequency information will dominate meaning computation, but at the opposite end strong contextual constraints will drive the computation process. For example, in the sentence Yesterday, the BANK [was eroded by the heavy rain], the context preceding bank does not sufficiently bias either sense of the homonym (i.e. financial institution or river). Consequently, meaning frequency dominates and the money sense of bank is the preferred interpretation. Consider, however, the sentence The heavy rain eroded the BANK yesterday. In this example, the context preceding bank strongly biases the river sense of the ambiguous word. (p. 815; emphases in original)
This is to be contrasted with a "reordered-access model" in which "all meanings are accessed in all contexts in order of meaning frequency," but in which the subordinate meanings can be moved up the ladder towards the dominant meaning, although not above it (cf. pp. 814–815).
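To make that description concrete, here is a toy sketch of reordered access, assuming meanings are ranked by a frequency score that context can boost but never push past the dominant meaning's score. The word, numbers, and function name are all invented for illustration, not taken from the paper.

```python
# Toy sketch of the reordered-access idea: meanings are tried in frequency
# order, but a contextual boost can promote a subordinate meaning part of
# the way up -- never past the dominant one. All numbers are invented.

def reordered_access(meanings, context_boost):
    """meanings: dict meaning -> relative frequency.
    context_boost: dict meaning -> additive boost from context."""
    top = max(meanings.values())
    scored = {}
    for m, freq in meanings.items():
        if freq < top:
            # A boost may raise a subordinate meaning's rank, but we cap it
            # just below the dominant meaning, per the "not above it" claim.
            scored[m] = min(freq + context_boost.get(m, 0.0), top - 1e-9)
        else:
            scored[m] = freq
    return sorted(scored, key=scored.get, reverse=True)

deck = {"ship": 0.6, "cards": 0.3, "porch": 0.1}   # invented frequencies
print(reordered_access(deck, {}))                  # plain frequency order
print(reordered_access(deck, {"porch": 0.4}))      # "porch" promoted past "cards"
print(reordered_access(deck, {"porch": 5.0}))      # "ship" still comes first
```

Even a huge boost only reorders the subordinate meanings among themselves, which is exactly the search-strategy reading of the model discussed below.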

What's the Difference?

Both Binder and Rayner and Martin et al. seem to agree that the "reordered-access model" awards a higher weight to meaning frequencies than to contextual fit. They also seem to agree that the "context-sensitive model" does the opposite, or perhaps that it gives equal weight to these two statistics.

I don't see why that's necessarily the case; as described above, the reordered-access model amounts to nothing more than a search strategy — in particular, it does not specify a scoring function.

On the other hand, the context-sensitive model seems to be an informal description of a Bayesian inference. As such, it is perfectly consistent with the greedy search strategy postulated by the reordered-access model, or with any other search strategy you desire.
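Read this way, the model fits in a few lines: the prior is meaning frequency, the likelihood is contextual fit, and "contextual strength" weights the likelihood. This is a toy sketch of my Bayesian reading, not anything from the paper; all probabilities, the strength parameter, and the function name are invented.

```python
# Toy Bayesian reading of the context-sensitive model: posterior over
# meanings = prior (meaning frequency) times likelihood (contextual fit)
# raised to a "contextual strength" exponent. All numbers are invented.

def posterior(prior, likelihood, strength):
    """Return the normalized posterior over meanings.
    strength = 0 ignores context; larger values let context dominate."""
    unnorm = {m: prior[m] * likelihood[m] ** strength for m in prior}
    z = sum(unnorm.values())
    return {m: p / z for m, p in unnorm.items()}

prior = {"financial": 0.8, "river": 0.2}   # invented meaning frequencies
fit = {"financial": 0.1, "river": 0.9}     # fit with "The heavy rain eroded..."

weak = posterior(prior, fit, strength=0.5)   # weak context: frequency wins
strong = posterior(prior, fit, strength=3.0) # strong context: context wins
print(max(weak, key=weak.get), max(strong, key=strong.get))
```

The point of the sketch is that nothing in it says in which order the meanings are tried; that is why I say the Bayesian scoring is compatible with any search strategy.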

I thus find it a little difficult to get my pulse up over this discussion. As long as the two models are as mathematically underspecified as they currently are, it seems to me that any prediction could be consistent with, or inconsistent with, either model. If the competing parties really wanted to flesh out their claims about the relative weights of priors and likelihoods, they should start picking some numbers.

Another way of putting the same point is that if these researchers really thought that they postulated different weights on priors and likelihoods, then they should also be able to agree on a sentence for which the two models would predict not only different reading times, but also different interpretations.

If no such sentence exists, the difference must solely pertain to the search strategy, and the context-sensitive model does not seem to specify any particular algorithm for this purpose.

Simpson: "Context and the Processing of Ambiguous Words" (1994)

In his chapter of the 1994 edition of the Handbook of Psycholinguistics, Greg B. Simpson discusses a large number of ambiguity resolution studies. His conclusion from reviewing these studies is that it is not clear whether sentence-level comprehension can feed back into lexical recall or not.

After stating this pessimistic conclusion, he writes:
In an earlier review, (Simpson, 1984), I tried to argue that the constellation of results at that time could be explained by positing a system whereby all meanings are activated, but with the degree of activation being sensitive to influence by the relative frequencies of the meaning and by the context in which the ambiguous word occurs. There does not seem to be any compelling reason to change that position now. (p. 367)
Later in the article, he illustrates the need for methodological caution with a contrast between two studies, that of Paul et al. (1992) and that of Till et al. (1988).

Both of these studies are about priming, and both of them show that a sentence as a whole can prime for things that its individual components cannot.

The two examples that Simpson discusses (p. 368ff) are:
  • Paul et al.: The boy dropped the plant (primes spill)
  • Till et al.: The old man sat with his head down and did not hear a word of the sermon during mass (primes sleep)
That seems reasonable — in fact, the first time I read that the Till et al. sentence primes sleep, I was for a second completely sure that the word had actually occurred explicitly in the sentence.

So far the studies are consistent. They differ in their story about the time course of these priming effects, however: Paul et al. found that the holistic prime quickly decayed and had vanished after a delay of 500 ms. Till et al., on the other hand, found that the holistic priming effect only appeared after "long intervals" (but Simpson doesn't write how long; p. 369).

Monday, July 15, 2013

Hart and Perfetti: "Learning Words in Zekkish" (2008)

Because of the interesting quotes and references from Gerard Steen, I decided to take a look at some of the papers at Charles Perfetti's homepage. He has generously put many of them online and even scanned some book chapters.

Perfetti is mainly a scholar of the cognitive and developmental psychology of reading, so not everything he writes can be directly translated into the debates about semantics that I'm interested in. However, this book chapter, written with Lesley Hart, discusses disambiguation along with resolution of other input ambiguities (including the resolution of phonetically ambiguous words like content).

The upshot of the discussion is that disambiguation is a horse race process: When a phonologically unambiguous but semantically ambiguous word is read, a number of candidate meanings are activated, and a winner is then found by mutual inhibition:
With ambiguous words there is no phonological competition; however, the context and the extent of bias in the context in which the word appears (Vu, Kellas, Petersen & Metcalf, 2003), word-specific qualities such as word structure (Almeida & Libben, 2005), and the relative frequency of use of each of the meanings (Collins, 2002) remains necessary for choosing among multiple activated lexical entries. (p. 112)
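The horse-race metaphor can be simulated in a few lines. The following is a minimal winner-take-all sketch of my own, not Hart and Perfetti's model: candidate meanings start with activations set by frequency plus context, then mutually inhibit each other until one crosses a threshold. All parameters and starting values are invented.

```python
# Minimal horse-race sketch: each meaning excites itself and is inhibited
# in proportion to its rivals' total activation, until one meaning crosses
# a threshold. Parameters and starting activations are invented.

def race(activation, inhibition=0.1, growth=0.1, threshold=1.0, max_steps=100):
    """activation: dict meaning -> starting activation (frequency + context)."""
    act = dict(activation)
    for step in range(max_steps):
        total = sum(act.values())
        # Self-excitation plus inhibition from the rivals, floored at zero.
        act = {m: max(a + growth * a - inhibition * (total - a), 0.0)
               for m, a in act.items()}
        winner = max(act, key=act.get)
        if act[winner] >= threshold:
            return winner, step
    return None, max_steps

# "bank" after "crocodiles sunning themselves on the ...": assume context
# has already nudged the river sense slightly ahead of the more frequent
# financial sense; the race then amplifies that head start.
winner, steps = race({"river": 0.55, "financial": 0.45})
print(winner)  # "river" wins
```

Whichever meaning starts ahead is driven to the threshold while its rivals are suppressed, which is what makes the starting activations (and hence context and frequency) decisive.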
They also briefly mention the surprisingly heated debate around the question of when in the process contextually irrelevant meanings are killed off:
There is some debate as to the power of biasing contexts to speed the response time for subordinate meanings. For example, can reading "bank" in a sentence like "There were crocodiles sunning themselves on the bank" improve response times to the river meaning of bank more than sentences like "We took a picture of the bank," for which either meaning of "bank" can be appropriate? Martin, Vu, Kellas, and Metcalf (1999) claim that strongly biasing context can override word meaning biases from frequency, whereas weakly biasing context cannot. Binder & Rayner disagree with the power of strongly biasing context; they find that strongly biasing context does not have enough strength to increase the activation rate of lower frequency word meanings. (p. 113)
As a side note, I'm not sure the first reference here is to the right paper: George Kellas and Hoang Vu wrote a similarly-titled paper which was a direct response to Katherine S. Binder and Keith Rayner's paper, so maybe that's what Hart and Perfetti in fact wanted to cite.

Thursday, March 14, 2013

Sereno, O'Donnell, and Rayner: "Eye Movements and Lexical Ambiguity Resolution" (2006)

In the literature on word comprehension, some studies have found that people usually take quite a long time looking at an ambiguous word if it occurs in a context that strongly favors one of its less frequent meanings.

This paper raises the issue of whether this is mainly because of a clash between the high contextual fit and the low meaning frequency, or mainly because of the low frequency itself.

The Needle-in-a-Haystack Effect

A context preceding a word can either be neutral or biased, and a meaning of an ambiguous word can either be dominant (more frequent) or subordinate (less frequent). When a biased context favors the subordinate meaning, it is called a subordinate-biasing context.

The subordinate-bias effect is the phenomenon that people spend more time looking at an ambiguous word in a subordinate-biasing context than they take looking at an unambiguous word in the same context — given that the two words have the same frequency.

For instance, the word port can mean either "harbor" or "sweet wine," but the former is much more frequent than the latter. In this case, the subordinate-biasing effect is that people take longer to read the sentence
  • I decided to drink a glass of port
than the sentence
  • I decided to drink a glass of beer
This is true even though the words port and beer have almost equal frequencies (in the BNC, there are 3691 vs. 3179 occurrences of port vs. beer, respectively).

Balanced Meaning Frequencies = Balanced Reading Time

The question is whether these absolute word frequencies are the right thing to count, and Sereno, O'Donnell, and Rayner argue that they aren't. Instead, they suggest that it would be more fair to compare the sentence
  • I decided to drink a glass of port
to the sentence
  • I decided to drink a glass of rum
This is because port occurs in the meaning "sweet wine" approximately as often as the word rum occurs in absolute terms — i.e., much more rarely than beer. (A casual inspection of the frequencies of the phrases drink port/rum and a glass of port/rum seems to confirm the close match.)

What the Measurements Say

This means that you get three relevant conditions:
  1. one in which the target word is ambiguous, and in which its intended meaning is not the most frequent one;
  2. one in which the target word has the same absolute frequency as the ambiguous word;
  3. and one in which the target word has the same absolute frequency as the intended meaning of the ambiguous word.
Each of these is then associated with an average reading time:


It's not like the effect is overwhelming, but here's what you see: The easiest thing to read is a high-frequent word with only a single meaning (middle row); the most difficult thing to read is a low-frequent word with only a single meaning (top row).

Between these two in terms of reading time, you find the ambiguous word whose intended meaning was consistent with the context, and whose absolute frequency was high even though the frequency of that intended meaning was low.

Why are Ambiguous Words Easier?

In the conclusion of the paper, Sereno, O'Donnell, and Rayner speculate a bit about the possible causes of this "reverse subordinate-biasing effect," but they don't seem to find an explanation they are happy about (p. 345).

It seems to me that one would have to look closer at the sentences to find the correct answer. For instance, consider the following incomplete sentence:
  • She spent hours organizing the information on the computer into a _________
If you had to bet, how much money would you put on table, paper, and graph, respectively? If you would put more money on table than on graph, that probably also means that you were already anticipating seeing the word table in its "figure" meaning when your eyes reached the blank in the end of the sentence.

If people in general have such informed expectations, then that would explain why they are faster at retrieving the correct meaning of the anticipated word than they are at comprehending an unexpected word. But checking whether this is in fact the case would require a more careful information-theoretic study of the materials used in the experiment.
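One way to make the betting intuition concrete is to treat the bets as cloze probabilities and measure each completion's surprisal (in bits). This is only an illustrative sketch of the kind of check I have in mind; the probabilities below are invented, not taken from any norming study.

```python
import math

# Illustrative sketch: readers' bets on the blank define a cloze
# distribution, and the surprisal -log2 p of the word actually shown
# estimates how unexpected it is. All probabilities are invented.

cloze = {"table": 0.5, "graph": 0.3, "paper": 0.2}  # bets on the blank

def surprisal(word, dist):
    """Surprisal of a completion in bits under the cloze distribution."""
    return -math.log2(dist[word])

for word in cloze:
    print(f"{word}: {surprisal(word, cloze):.2f} bits")
```

If the ambiguous targets in the materials systematically had lower surprisal than the unambiguous controls, that anticipation effect alone could produce the faster reading times.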