Thursday, July 25, 2013

Nonce words from Klepousniotou (2001)

It's often difficult to get your hands on the experimental materials that psychologists use, especially when they're difficult to create. This goes especially for the lists of phonologically permissible non-words that are used in lexical decision tasks.

However, in a paper from 2001, Ekaterini Klepousniotou has reprinted the list of nonce words that she used in her experiment. For handy future reference, here's her list (pp. 219–221):
tanel, scling, kunch, sile, nacket, foach, zan, crant, puit, dace, nank, vap, maist, zail, iye, neg, reck, kip, mongue, ping, orive, rotato, nemon, labbit, rabbage, affle, sarrot, otion, omange, nomato, jotel, ardar, phogo, faln, ubit, leamber, stoge, cagon, ampel, calern, fike, piddar, trage, napion, paggern, mirgake, pelton, wame, wearon, reafon, tuge, stument, cory, prile, ripal, tave, roke, togic, ebergy, folbune, digorce, draze, linerty, snate, shabe, sagary, hode, clikate, cemper, lige, galben, zold, gort, ceal, brug, gatch, bap, nall, dafe, pake, feck, sote, douth, nofe, shourber, wiggow, zear, sig, frum, ablicot, reek, neach, arkond, cujumber, blail, glick, viodin, chidel, modey, spirach, cebery, drace, goice, blaffic, subar, yope, baple, proot, pladow, zipe, plice, loat, gud, vind, gricken, prock, bope, gair, nalt, pable.
From a completely unqualified impressionistic perspective, they seem to be quite qualitatively different.

For instance, roke and pable are plausible non-words, but do not strongly resemble any other English words in particular. On the other hand, rotato, nemon, jotel, sarrot, digorce, and sig clearly do.

I don't know why reek is on this list, either. That seems to be an obvious blunder.

Also, loat is only a graphical non-word: Phonetically, it is identical to load, which is a bit unfortunate.
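One way to make the impression of resemblance a bit more concrete is to look for real words within a single edit operation of each nonce word. Here is a minimal sketch in Python; the toy lexicon is an illustrative assumption, not a real word list:

```python
def edit_distance(a, b):
    # Classic single-row Levenshtein distance (insertions,
    # deletions, and substitutions all cost 1).
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution
    return dp[-1]

def neighbors(word, lexicon):
    # Real words reachable from `word` by exactly one edit.
    return [w for w in lexicon if edit_distance(word, w) == 1]

lexicon = ["potato", "lemon", "hotel", "carrot", "divorce", "pig"]
print(neighbors("rotato", lexicon))  # ['potato']
```

On this measure, rotato, nemon, jotel, sarrot, digorce, and sig all have an obvious one-edit neighbor, while roke and pable have no such single salient source word.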

Wednesday, July 24, 2013

Barclay et al.: "Comprehension and Semantic Flexibility" (1974)

This paper reports on a number of experiments which aim at showing that people highlight specific properties of a noun when they hear it in a sentence. Reading the sentences He tuned the piano and He lifted the piano, you thus form two different representations (or "encodings") of the concept "piano" in working memory.

The paper uses four different cued recall experiments to investigate this effect:
  1. Sentence objects are recalled based on feature cues: Something heavy cues He lifted the piano.
  2. Sentence objects are recalled based on feature cues, but now with control sentences that differ only on the objects: Something heavy cues He lifted the infant.
  3. Whole sentences are recalled based on object-feature cues: Pianos are heavy cues He lifted the piano.
  4. Whole sentences are recalled based on object-feature cues as well as object cues: Piano cues He lifted the piano.
The materials are not reprinted in the paper.

Swinney: "Lexical Access during Sentence Comprehension" (1979)

In 1975, David Swinney and David Hakes published a study which provided some evidence that irrelevant meanings of ambiguous words are never retrieved from memory if the context is strongly biased against them.

This prompted a reinterpretation of previous results showing the opposite, pointing to the strength of the context as the relevant independent variable.

The "On Hold" Paradigm

The paradigm they used in the 1975 paper was a phoneme recognition task. They played tape recordings of sentences to their subjects, asking them to push a button as soon as they heard a word beginning with a specific sound (say, /k/ as in cat).

This is more difficult and takes more time if the target comes immediately after an ambiguous word:
  • … he found several bugs in the corner of his room. (ambiguous)
  • … he found several insects in the corner of his room. (unambiguous)
However, the crucial manipulation Swinney and Hakes performed was to see whether this effect still held up when the context was strongly biased towards one of the two meanings of the word:
  • … he found several spiders, roaches, and other bugs in the corner of his room.
In the 1975 experiment, they found that it did not: Including a strongly disambiguating context effectively brought the reaction time down to the level of unambiguous words. This supported the hypothesis that meaning on the sentence level could affect lexical access.

The Multitasking Paradigm

In the 1979 paper, however, Swinney argued that this effect might only have occurred because of the relatively large time lag between prime and target (p. 647). He was thus interested in devising an experimental paradigm that could manipulate the width of this gap more directly.

The solution to this problem is a cross-modal priming design: The subject listens to the sentence being read aloud, but simultaneously has to solve a lexical decision task on a screen. This way, the target and prime can be timed relative to each other in any way you like.

So for example, you might be exposed to the following stimulus:
  • Voice: … he found several spiders, roaches, and other bugs [Screen: SPY] in the …
Your task is then to decide, as quickly as possible, whether the target word is an actual English word or not. After the experiment, you are also quizzed on the sentence in the headphones to make sure that you were listening (and not just focusing on the visual task).

The Vanishing Priming Effect

The results of the experiment can roughly be summarized as follows:
Delay            Appropriate     Inappropriate    Unrelated
None             Facilitation    Facilitation     No facilitation
Three syllables  Facilitation    No facilitation  No facilitation
An "appropriate" meaning is here one that fits the meaning of the word as used in the sentence (e.g. roaches and bugs – ANT). An "inappropriate" one is a word that fits a different sense of the word (e.g. roaches and bugs – SPY). An "unrelated" word is a real English word that doesn't have any specific relation to the prime (e.g., SEW).

The thing to notice about this table is the top middle cell: When there is no delay, even contextually inappropriate meanings are primed; however, after less than half a second, this effect has decayed to an insignificant level.

I remember reading in other texts that the exact time frame in which the priming effect is present is about 200 milliseconds. I don't remember where I picked up that number, though.

Tuesday, July 23, 2013

Hogaboam and Perfetti: "Lexical Ambiguity and Sentence Comprehension" (1975)

Along with the 1979 paper by David Swinney, this paper by Thomas Hogaboam and Charles Perfetti seems to be one of the most cited early papers on the psychology of ambiguity resolution. It is notable for pointing out the prominent function of word sense frequency.

The Language Machine

In a distinctly 1950s cognitive psychology style, the paper contrasts a number of hypotheses about ambiguity resolution, each formulated as a little block of virtual computer code:
[According to the prior decision model] the processes that provide access to lexical items may operate in such a way as to provide access only to the contextually correct meaning. (p. 265)
In [the exhaustive computation model] both meanings of an ambiguous word are accessed and further processed to determine which meaning is appropriate to the context. (p. 265)
[The one meaning hypothesis] holds that one meaning is accessed and checked against the context. If a match occurs the other meanings are not accessed, but if a match does not occur the process is repeated until a match is found. (p. 265)
When the "one meaning hypothesis" makes the additional assumption that the meanings are looked up in decreasing order of frequency (rather than randomly), it is called the "ordered search hypothesis":
In the ordered search model when an ambiguous word occurs in a sentence, an ordered lexical search takes place. The order of the search is determined by frequency of usage of the lexical entries, the most frequent being first. The search is self-terminating, so that as soon as an acceptable match occurs, no other lower entries will be checked, and all higher entries will have already been checked. (p. 266)
Since the prior decision model and the random-search version of the one meaning hypothesis are quickly dispatched, this leaves only the exhaustive computation hypothesis and the ordered search hypothesis in the field.
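The ordered search hypothesis is simple enough to state as code. Here is a minimal sketch; the sense frequencies and the contextual match check are invented for illustration:

```python
def ordered_search(senses, fits_context):
    # Self-terminating search: senses are tried in decreasing
    # order of frequency, and the first contextual match wins.
    for sense in sorted(senses, key=lambda s: s["freq"], reverse=True):
        if fits_context(sense):
            return sense["meaning"]
    return None

# Toy lexical entry for "deck"; the frequencies are made up.
deck = [
    {"meaning": "deck of cards", "freq": 0.3},
    {"meaning": "ship deck", "freq": 0.7},
]

# In a card-game context, the dominant "ship" sense is still
# checked (and rejected) first, which predicts a time cost.
print(ordered_search(deck, lambda s: "cards" in s["meaning"]))  # deck of cards
```

The self-terminating loop is the crucial property: a word used in its dominant sense never incurs the cost of checking subordinate entries.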

The Reduced Field

Hogaboam and Perfetti do, however, consider the option that exhaustive computation may be differentiated in time. This leads in practice to the following set of competing hypotheses (p. 272):

Three schematic drawings from p. 272, each depicting one of the competing search models.

The two leftmost drawings represent an exhaustive and an ordered search, respectively. The rightmost panel represents a modified version of the exhaustive search:
According to this model a search initiated at a token node activates both senses, but the primary sense becomes activated prior to the secondary sense. […] That is, both senses could be processed in parallel, but the primary sense may take less time to process and would thus be available for other processes sooner than the secondary sense. (p. 272)
So while parallel processing and time differences are in principle conceivable in the algorithmic world of Hogaboam and Perfetti, differences in activation are not: You either move stuff from the disk drive to the working memory, or you don't.

Serial or Skewed Parallel?

Interestingly, they do raise the concern that all of this speculation about mental algorithms may multiply entities beyond necessity. Specifically, there is a risk that the models only differ as to whether they hypothesize a computation in the working memory or in the long-term storage:
That is, under the conceptualization represented by the panel on the right, an ordering effect would be expected at a working memory level. This possibility indicates that contrasting the various models of the disambiguation process in effect may be setting up straw men. (p. 272)
However, their experiment (discussed below) does show observable differences between words, and these have to be ascribed some cause or other. From the perspective of 1975 cognitive science, the safest bet seems to be word senses queuing up for serial processing:
For the present it is most parsimonious to propose, as a hypothesis, that the effect is a true order-of-processing effect and not an artifact of parallel processing. (p. 272)
So while they do briefly flirt with the idea of differentiated activation in a parallel network, they quickly return home to the comfortable world of Turing machines plodding from discrete state to discrete state.

The Big Deal

The experiment itself is set up as follows: A tape recording of a sentence is played to you, and you then have to decide whether its last word is ambiguous or not. If it is ambiguous, you must come up with an example of a different sense the word can be used in.

Here are some example sentences that can illustrate the task:
  1. The antique typewriter was missing a letter.
  2. The anti-pollution campaign created interest.
  3. The gun collector displayed the arms.
  4. The tired hiker rested his feet.
If you're like the average test subject, you should find this task more difficult for sentences 2 and 4, but easier for sentences 1 and 3.

This is because the words in the difficult sentences are used in their dominant meaning; you thus have to retrieve a relatively rare word sense in order to come up with a response. For the easy sentences, however, the word is used in a less frequent sense, and you can cite the frequent and readily available word sense as a response.

As indicated above, there are a number of ways that one can interpret this result. However, it should be clear that at the very least, it demonstrates that frequency plays a key role in word comprehension.

Sereno, Pacht, and Rayner: "The Effect of Meaning Frequency on Processing Lexically Ambiguous Words" (1992)

This experiment measured how long people took to read target words of the following kind:
  • The dinner party was proceeding smoothly when, just as Mary was serving the port, one of the guests had a heart attack. (Ambiguous word used in low-frequent sense)
  • The dinner party was proceeding smoothly when, just as Mary was serving the soup, one of the guests had a heart attack. (Unambiguous high-frequent word)
  • The dinner party was proceeding smoothly when, just as Mary was serving the veal, one of the guests had a heart attack. (Unambiguous low-frequent word)
The idea here is that the two control words (soup and veal) are matched in frequency to the senses of the ambiguous word ("harbor" and "wine"). The word soup thus has approximately the same frequency as the "harbor" meaning of port, and the word veal approximately the same as the "wine" meaning.

Skipping to the conclusion, it then turns out that the ambiguous word (here, port) takes more time to process than either of the unambiguous words. That seems fairly natural, since a reader would not only have to remember the (infrequent) meaning of the word, but also subsequently resolve an ambiguity issue. This may take some time.

Four Stories About Comprehension

One can imagine the psychological process of reading and understanding an ambiguous snippet of text in several different ways. Sereno, Pacht, and Rayner cite four models in particular:
  1. One model claims that "only the contextually appropriate meaning is activated in the lexicon," that is, context dictatorially determines access (p. 296).
  2. Another claims that "all meanings are accessed automatically," that is, context has no influence at all (p. 296).
  3. A third model liberally accepts that "access of the alternative meanings is influenced by the frequency of each meaning and also by the context" (p. 296).
  4. Lastly, a fourth model claims that "the language processing mechanism automatically attempts to access all meanings of an ambiguous word in order of their frequency. […] Incomplete access procedures are terminated when one or more meanings that have been accessed are successfully integrated with prior context." (p. 296–97)
I have deliberately stripped the names off these hypotheses to avoid the implication that there is some precise, complex, and quantitative machinery hidden behind them. There are, in fact, just these verbal descriptions, and that's it.

The hypotheses are neither exhaustive nor exclusive. There are some textual clues that the authors regard models 3 and 4 as subspecies of model 2, but also some indications of the opposite.

The Stories and The Evidence

However, even at this level of resolution, the first hypothesis does not hold up to the evidence: If we were capable of discarding irrelevant word senses — instantly, before we had even looked them up — then the first model would predict equal reading times for the ambiguous and the low-frequent words. This is not the case.

The remaining models may or may not do better, depending on more specific assumptions.

The authors argue that the third model is consistent with their findings, since the longer fixations on ambiguous words can be explained by "competition between the dominant meaning and the subordinate" (p. 299). However, since we are now in the business of invoking auxiliary hypotheses, I don't see why this new concept of competition could not have been invoked to defend the first theory, too.

The fourth model also passes the test on the grounds that "the dominant sense is accessed first but is not successfully integrated because context supports the subordinate sense" (p. 299). This failed attempt at integration of word sense and context could then plausibly explain why the whole process takes longer than a simple look-up of the correct word sense.

Conclusions?

As I see it, the main lesson we can draw from this experiment is that ambiguity is costly. We can rephrase this message in terms of various informal "models" of the disambiguation process, but that doesn't really add much, to my mind. Only the models that were grotesque caricatures anyway can be excluded with any confidence.

But perhaps the first of the models cited above — the authors identify it as the "selective access model" but do not pin it to anybody in particular — had some problems to begin with. Specifically, how exactly would a person be able to discard the meaning of a word before retrieving it from memory? Without recall, there can't be any conflict, and hence no discarding.

I thus think that arguing against the selective access model is a bit of a windmill fight. On the most rigid reading of the slogan "only the contextually appropriate meaning is activated in the lexicon," the brain would literally need to be capable of time travel; on a more charitable reading, the theory is not necessarily inconsistent with the data at hand.

Rodd, Davis, and Johnsrude: "The Neural Mechanisms of Speech Comprehension" (2005)

This paper reports on two fMRI experiments which contrast ambiguous speech to unambiguous speech, and unambiguous speech to speech-like noise.

Blob One and Blob Two

By comparing pictures taken in each of these conditions and extracting the significant differences, this design gives an indication of where in the head ambiguity is sorted out. The conclusion is that two areas in particular seemed to be disproportionately active when the experimental subjects listened to ambiguous speech:
The results of two fMRI experiments show that when volunteers listen to sentences that contain semantically ambiguous words, activity increases in both temporal and frontal brain regions. This confirms the involvement of these regions in the semantic aspects of sentence comprehension (i.e. activating, selecting or integrating word meanings). (p. 1266)
Roughly speaking, the areas in question were the bit of the brain behind the ears (on both sides of the head), and the bit behind the eyebrow (on the left side only).


Anatomical drawing of a brain from a 1918 textbook.
The parts of the brain discussed in the text are roughly located behind
the lower part of the temple and the lower left side of the forehead.


Decoding Efforts vs. Selection Efforts

The study did not include any distinctions more fine-grained than ambiguous/unambiguous. In particular, it did not contrast skewed and balanced ambiguity; this is significant, since reading a word used in one of its less frequent meanings involves inhibition which may require cognitive effort.

Consider for instance the following example sentence from the paper:
  • the cymbals/symbols were making a racket/racquet
There are 67 results for cymbal(s) in the BNC; there are about 3000 for symbol(s). If I read this sentence aloud for you and took a picture of your brain while you listened, I would see a lot of activity in some regions; but this might be best interpreted as a trace of the force you exert in order to suppress the dominant but irrelevant meaning of the sound /ˈsɪmbəɫ/.
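For what it's worth, the two BNC counts quoted above translate into a very skewed dominance ratio (the symbol(s) figure is approximate):

```python
# BNC counts for the two readings of /ˈsɪmbəɫ/ cited in the text.
cymbal, symbol = 67, 3000

# Share of tokens belonging to the dominant reading.
dominance = symbol / (symbol + cymbal)
print(f"{dominance:.0%}")  # about 98% of tokens are the 'symbol' reading
```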

This process of suppressing a loud noise may or may not be different from choosing between two competing alternatives, but we can't say on the basis of this experiment.

Wednesday, July 17, 2013

Martin et al.: "Strength of Discourse Context as a Determinant of the Subordinate Bias Effect" (1999)

As an argument in favor of selective-access over multiple-access views of ambiguity resolution, this paper provides evidence that the dominant meaning of a word may be eliminated from consideration if prior discourse is biased strongly enough.

In this way, Martin, Vu, Kellas, and Metcalf add yet another chapter to their already somewhat protracted and repetitive discussion with Binder and Rayner.

Tasks and Materials

The authors used two different methods to make their point: Self-paced reading and a naming task.

The naming task consists in reading a word aloud as fast as possible after having read a sentence. It thus allows one to measure priming effects from the sentence to the probe word.

The materials consisted in four different types of sentence, categorized on the basis of human offline judgments of bias. The categories, with examples, are:
  1. Strongly favors the dominant meaning of a term:
    The navigator dropped the compass. He searched the deck beneath his life boat.
  2. Strongly favors the subordinate meaning of a term:
    The gambler wanted an ace. He searched the deck for the marked cards.
  3. Weakly favors the dominant meaning of a term:
    The mother was in a hurry. She jammed the key while opening the door.
  4. Weakly favors the subordinate meaning of a term:
    The author was clumsy. She jammed the key while finishing the document.
Remember that the judges only took context preceding the target word into account for categorization. Disambiguating cues appearing later in the sentence were ignored.

As the examples indicate, the authors used two sentences per word, both in the same strength category. I don't know why they didn't include four different sentences for each word; that would seem to be a safer methodological bet.

Results

As indicated above, the result of the self-paced reading experiment was that the subjects spent the most time looking at words used in their subordinate sense when these occurred in weakly biased sentences; in all other conditions, they were faster.

In absolute terms, the differences are not large. In the "easy" conditions, the subjects looked at the target words for about 345 ms on average. In the "hard" conditions, they looked at them for about 370 ms.

The difference is thus on the order of 25 ms or 7% additional looking time. Given the large number of subjects and trials, these effects are significant, but we're not talking about days and weeks here.

Some Quotes

Martin et al. present their own "context-sensitive model" as follows:
According to the context-sensitive model of ambiguity resolution (cf. Kellas, Paul, Martin, & Simpson, 1991; Paul et al., 1992; Simpson, 1994; Vu et al., 1998a, b) either meaning frequency or biasing context can dominate the resolution process dependent upon a third critical variable of contextual strength (i.e. the degree of constraint that context places on an ambiguous word). The bias of a context towards an ambiguous word can vary continuously, from weakly through strongly biased, as a function of the strength of constraints (e.g. syntax, semantics, pragmatics) that converge on the ambiguity. On the weak end of the continuum, word frequency information will dominate meaning computation, but at the opposite end strong contextual constraints will drive the computation process. For example, in the sentence Yesterday, the BANK [was eroded by the heavy rain], the context preceding bank does not sufficiently bias either sense of the homonym (i.e. financial institution or river). Consequently, meaning frequency dominates and the money sense of bank is the preferred interpretation. Consider, however, the sentence The heavy rain eroded the BANK yesterday. In this example, the context preceding bank strongly biases the river sense of the ambiguous word. (p. 815; emphases in original)
This is to be contrasted with a "reordered-access model" in which "all meanings are accessed in all contexts in order of meaning frequency," but in which the subordinate meanings can be moved up the ladder towards the dominant meaning, although not above it (cf. pp. 814–15).

What's the Difference?

Both Binder and Rayner and Martin et al. seem to agree that the "reordered-access model" awards a higher weight to meaning frequencies than to contextual fit. They also seem to agree that the "context-sensitive model" does the opposite, or perhaps that it gives equal weight to these two statistics.

I don't see why that's necessarily the case; as described above, the reordered-access model amounts to nothing more than a search strategy — in particular, it does not specify a scoring function.

On the other hand, the context-sensitive model seems to be an informal description of a Bayesian inference. As such, it is perfectly consistent with the greedy search strategy postulated by the "reordered-access model", or with any other search strategy you desire.
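Spelled out, the Bayesian reading is just a prior (sense frequency) multiplied by a likelihood (contextual fit), renormalized. A minimal sketch for the bank example; all of the numbers are invented for illustration:

```python
def posterior(senses):
    # posterior ∝ prior (sense frequency) × likelihood (contextual fit)
    scores = {name: p["prior"] * p["likelihood"] for name, p in senses.items()}
    z = sum(scores.values())
    return {name: s / z for name, s in scores.items()}

# "The heavy rain eroded the BANK": a strong enough context
# can outweigh the frequency advantage of the money sense.
bank = {
    "money": {"prior": 0.8, "likelihood": 0.05},
    "river": {"prior": 0.2, "likelihood": 0.90},
}
post = posterior(bank)
print(max(post, key=post.get))  # river
```

On this formulation, "strength of context" is just how peaked the likelihood is, and "meaning frequency" just how peaked the prior is; nothing in the arithmetic fixes their relative weight, which is exactly the underspecification complained about above.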

I thus find it a little difficult to get my pulse up over this discussion. As long as the two models are as mathematically underspecified as they currently are, it seems to me that any prediction could be consistent with, or inconsistent with, either model. If the competing parties really wanted to flesh out their claims about the relative weights of priors and likelihoods, they should start picking some numbers.

Another way of putting the same point is that if these researchers really thought that they postulated different weights on priors and likelihoods, then they should also be able to agree on a sentence for which the two models would predict not only different reading times, but also different interpretations.

If no such sentence exists, the difference must solely pertain to the search strategy, and the context-sensitive model does not seem to specify any particular algorithm for this purpose.

Simpson: "Context and the Processing of Ambiguous Words" (1994)

In his chapter of the 1994 edition of the Handbook of Psycholinguistics, Greg B. Simpson discusses a large number of ambiguity resolution studies. His conclusion from reviewing these studies is that it is not clear whether sentence-level comprehension can feed back into lexical recall or not.

After stating this pessimistic conclusion, he writes:
In an earlier review (Simpson, 1984), I tried to argue that the constellation of results at that time could be explained by positing a system whereby all meanings are activated, but with the degree of activation being sensitive to influence by the relative frequencies of the meaning and by the context in which the ambiguous word occurs. There does not seem to be any compelling reason to change that position now. (p. 367)
Later in the article, he illustrates the need for methodological caution with a contrast between two studies, that of Paul et al. (1992) and that of Till et al. (1988).

Both of these studies are about priming, and both of them show that a sentence as a whole can prime for things that its individual components cannot.

The two examples that Simpson discusses (p. 368ff) are:
  • Paul et al.: The boy dropped the plant (primes spill)
  • Till et al.: The old man sat with his head down and did not hear a word of the sermon during mass (primes sleep)
That seems reasonable — in fact, the first time I read that the Till et al. sentence primes sleep, I was for a second completely sure that the word had actually occurred explicitly in the sentence.

So far the studies are consistent. They differ on their story about the time course of these priming effects, however: Paul et al. found that the holistic prime quickly decayed and had vanished after a delay of 500 ms. Till et al., on the other hand, found that the holistic priming effect only appeared after "long intervals" (but Simpson doesn't write how long; p. 369).

Monday, July 15, 2013

Hart and Perfetti: "Learning Words in Zekkish" (2008)

Because of the interesting quotes and references from Gerard Steen, I decided to take a look at some of the papers at Charles Perfetti's homepage. He has generously put many of them online and even scanned some book chapters.

Perfetti is mainly a scholar of the cognitive and developmental psychology of reading, so not everything he writes can be directly translated into the debates about semantics that I'm interested in. However, this book chapter, written with Lesley Hart, discusses disambiguation along with resolution of other input ambiguities (including the resolution of phonetically ambiguous words like content).

The upshot of the discussion is that disambiguation is a horse race process: When a phonologically unambiguous but semantically ambiguous word is read, a number of candidate meanings are activated, and a winner is then found by mutual inhibition:
With ambiguous words there is no phonological competition; however, the context and the extent of bias in the context in which the word appears (Vu, Kellas, Petersen & Metcalf, 2003), word-specific qualities such as word structure (Almeida & Libben, 2005), and the relative frequency of use of each of the meanings (Collins, 2002) remains necessary for choosing among multiple activated lexical entries. (p. 112)
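The horse-race-by-mutual-inhibition idea can be sketched as a small iterative competition. The starting activations and the inhibition rate below are invented for illustration:

```python
def compete(activations, inhibition=0.3, steps=20):
    # Each candidate meaning suppresses the others in proportion
    # to their summed activation; the strongest starter wins.
    acts = dict(activations)
    for _ in range(steps):
        total = sum(acts.values())
        acts = {k: max(0.0, a - inhibition * (total - a) / len(acts))
                for k, a in acts.items()}
    return max(acts, key=acts.get)

# Context has boosted the river sense of "bank" above the money sense,
# so the river sense drives its competitor's activation to zero.
print(compete({"river": 0.9, "money": 0.6}))  # river
```

The point of the metaphor is that context, word structure, and frequency all enter only through the initial activation levels; the competition itself is blind to where those levels came from.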
They also briefly mention the surprisingly heated debate around the question of when in the process contextually irrelevant meanings are killed off:
There is some debate as to the power of biasing contexts to speed the response time for subordinate meanings. For example, can reading "bank" in a sentence like "There were crocodiles sunning themselves on the bank" improve response times to the river meaning of bank more than sentences like "We took a picture of the bank," for which either meaning of "bank" can be appropriate? Martin, Vu, Kellas, and Metcalf (1999) claim that strongly biasing context can override word meaning biases from frequency, whereas weakly biasing context cannot. Binder & Rayner disagree with the power of strongly biasing context; they find that strongly biasing context does not have enough strength to increase the activation rate of lower frequency word meanings. (p. 113)
As a side note, I'm not sure the first reference here is to the right paper: George Kellas and Hoang Vu wrote a similarly-titled paper which was a direct response to Katherine S. Binder and Keith Rayner's paper, so maybe that's what Hart and Perfetti in fact wanted to cite.