Harold R. Keables

This website is for those doing a postgraduate course in Applied Linguistics and TESOL. It is completely independent and has no affiliation with any university.

Check out the Resources Section, which offers:

* Links to articles on all aspects of the MA.
* A Video section offering lectures by Dörnyei, Crystal, Nunan, Larsen-Freeman, Krashen, Scott Thornbury (who??) and many others.
* Suggested useful blogs and web pages.
* Presentations

Academics work in universities. Their job is to teach and to do research. Most academics prefer research to teaching, and they are not taught how to teach. So, if you study in any good university, you'll be taught by experts who haven't been taught how to teach. Nevertheless, if you're a good student, you'll get an excellent education. This leads to the suggestion that in tertiary education, teaching methodology matters little: it's the student who counts.

The students who go to the best universities are carefully selected, and a key criterion in the selection process is the student's ability to study without spoon-feeding. A good student does her own studying and knows how to draw on the resources offered. When you sign up for a postgraduate course, know that you are in charge and that you, and you alone, will determine the outcome. Your tutor is an expert, not, usually, a teacher. Your job is to use your tutor's expertise, which means asking the right questions. Don't ask "What should I do?" or "Please suggest a topic". Ask for comments on your own drafts, ask for guidance on reading, ask for clarification. Get into a dialogue with your tutor; shoot the breeze; get familiar; build a relationship. But remember: your tutor is your mentor in the Greek sense of the word, not your teacher.

A Closer Look at Task-Based Language Teaching. Part 1.


In the last post, I gave a brief summary of Doughty and Long’s 10 “Methodological Principles”, which form the rationale for their particular approach to TBLT. In this and following posts I’d like to look a bit more closely at the Methodological Principles (MPs) in order to both clarify and evaluate what is being proposed. So, for example, when Doughty and Long say

“Building lessons around texts (as in most content-based language teaching) means studying language as object, not learning language as a living entity through using it and experiencing its use during task completion. Learners need to learn how to do a task themselves”,

one needs to ask what's wrong with "studying language as object" and, if task completion is the aim, why students don't just do it in the L1. I'm being glib, of course, but there are serious issues at stake here. We need to look closely at questions such as: What's wrong with studying a text? Why is doing tasks the best way to learn? Why is negative feedback better than positive feedback? Why is an "analytic" syllabus better than a "synthetic" syllabus? What is elaborated input? And how does one judge the two key constructs of task complexity and task difficulty?

But, in recognition of Russ Mayne’s urgent need to know about the so-called order of acquisition in SLA, I want to start by looking at Doughty and Long’s MP8: Respect Developmental Processes and “Learner Syllabuses”. Recall that in their 2003 article the authors say

“There is strong evidence for various kinds of developmental sequences and stages in interlanguage development, such as the well known four-stage sequence for ESL negation (Pica, 1983; Schumann, 1979), the six-stage sequence for English relative clauses (Doughty, 1991; Eckman, Bell, & Nelson, 1988; Gass, 1982), and sequences in many other grammatical domains in a variety of L2s (Johnston, 1985, 1997). The sequences are impervious to instruction, in the sense that it is impossible to alter stage order or to make learners skip stages altogether (e.g., R. Ellis, 1989; Lightbown, 1983). Acquisition sequences do not reflect instructional sequences, and teachability is constrained by learnability (Pienemann, 1984). The idea that what you teach is what they learn, and when you teach it is when they learn it, is not just simplistic, but wrong… The question, then, is how to harmonize instruction with the learner’s internal syllabus, with so-called “natural” developmental processes.”

Let’s take a look at the “strong evidence” referred to. We may begin with Pit Corder and error analysis.

Pit Corder: Error Analysis

Corder (1967) argued that errors were neither random nor simply the result of L1 transfer; they were indications of learners’ attempts to work out an underlying rule-governed system. Corder distinguished between errors and mistakes: mistakes are slips of the tongue, whereas errors are indications of an as yet non-native-like, but nevertheless systematic, rule-based grammar. Interesting and provocative as this was, error analysis failed to capture the full picture of a learner’s linguistic behaviour. Schachter (1974) compared the compositions of Persian, Arabic, Chinese and Japanese learners of English, focusing on their use of relative clauses. She found that the Persian and Arabic speakers made far more errors, but she went on to look at the total production of relative clauses and found that the Chinese and Japanese students produced only half as many relative clauses as the Persian and Arabic students did. Schachter then looked at the students’ L1s and found that Persian and Arabic relative clauses are similar to English in that the relative clause is placed after the noun it modifies, whereas in Chinese and Japanese the relative clause comes before the noun. She concluded that Chinese and Japanese speakers of English use relative clauses cautiously but accurately because of the distance between the way their L1 and the L2 (English) form relative clauses. So, it seems, things are not so straightforward: one needs to look at what learners get right as well as what they get wrong.

The Morpheme Studies

Next came the morpheme order studies. Dulay and Burt (1975) claimed that fewer than 5% of errors were due to native language interference, and that errors were, as Corder suggested, in some sense systematic; that there was something akin to a Language Acquisition Device at work not just in first language acquisition, but also in SLA. Brown’s (1973) L1 morpheme studies led to L2 studies by Dulay & Burt (1973, 1974a, 1974b, 1975) and Bailey, Madden & Krashen (1974), all of which suggested that there was a natural order in the acquisition of English morphemes, regardless of L1. This became known as the L1 = L2 Hypothesis, and further studies by Ravem (1974), Cazden, Cancino, Rosansky & Schumann (1975), Hakuta (1976), and Wode (1978) all pointed to systematic staged development in SLA.

Some of these studies, particularly those of Dulay and Burt, and of Bailey, Madden and Krashen, were soon challenged. But, as Larsen-Freeman and Long (1991) point out, since the original studies, over fifty L2 morpheme studies have been carried out using more sophisticated data collection and analysis procedures, and the results of these studies went some way to restoring confidence in the earlier findings.
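The cross-group comparisons in these morpheme studies were typically made with rank-order correlations: if learners from different L1 backgrounds show highly correlated accuracy orders, a common acquisition order is inferred. As a rough illustration (the rank orders below are invented for the sketch, not data from any of the studies cited), a minimal version in Python:

```python
# Hypothetical illustration: acquisition ranks of six English morphemes
# (1 = acquired earliest) for two invented L1 groups. Not real study data.
group_a_ranks = [1, 2, 3, 4, 5, 6]
group_b_ranks = [1, 3, 2, 4, 6, 5]

def spearman_rho(xs, ys):
    """Spearman rank correlation for two equal-length rank lists with no ties."""
    n = len(xs)
    d_squared = sum((x - y) ** 2 for x, y in zip(xs, ys))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

print(round(spearman_rho(group_a_ranks, group_b_ranks), 2))  # 0.89
```

A rho near 1 would be read as support for a shared order across the two groups; the real studies, of course, also had to defend their scoring criteria, a point taken up below.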


Selinker’s Interlanguage.

The third big step was Selinker’s (1972) paper, which argued that L2 learners have their own autonomous mental grammar (which came to be known as interlanguage grammar), a grammatical system with its own internal organising principles. One of the first stages of this interlanguage to be identified was that for ESL questions. In a study of six Spanish students over a 10-month period, Cazden, Cancino, Rosansky and Schumann (1975) found that the subjects produced interrogative forms in a predictable sequence:

1. Rising intonation (e.g., He works today?),
2. Uninverted WH (e.g., What he (is) saying?),
3. Overinversion (e.g., Do you know where is it?),
4. Differentiation (e.g., Does she like where she lives?).

Then there was Pica’s (1983) study, which suggested that learners from a variety of different L1 backgrounds go through the same four stages in acquiring English negation:

1. External (e.g., No this one./No you playing here),
2. Internal, pre-verbal (e.g., Juana no/don’t have job),
3. Auxiliary + negative (e.g., I can’t play the guitar),
4. Analysed don’t (e.g., She doesn’t drink alcohol.)

Apart from these two examples, we may cite the six-stage sequence for English relative clauses (see Doughty, 1991 for a summary) and sequences in many other grammatical domains in a variety of L2s (see Johnston, 1997).

Pienemann’s 5-stage Sequence.

Perhaps the most extensive and best-known work in this area has been done by Pienemann, whose work on Processability Theory started out as the Multidimensional Model, formulated by the ZISA group, based mainly at the University of Hamburg, in the late seventies. One of the first findings of the group was that all the child and adult learners of German as a second language in the study adhered to the five-stage developmental sequence shown below:

Stage X – Canonical order (SVO)
die kinder spielen mim ball //// the children play with the ball
(Romance learners’ initial SVO hypothesis for GSL WO is correct in most German sentences with simple verbs.)

Stage X + 1 - Adverb preposing (ADV)
da kinder spielen //// there children play
(Since German has a verb-second rule, requiring subject-verb inversion following a preposed adverb (there play children), all sentences of this form are deviant. The verb-second (or ‘inversion’) rule is only acquired at stage X + 3, however. The adverb-preposing rule itself is optional.)

Stage X + 2- Verb separation (SEP)
alle kinder muss die pause machen //// all children must the break have
(Verb separation is obligatory in standard German.)

Stage X+3- Inversion (INV)
dann hat sie wieder die knoch gebringt //// then has she again the bone brought
(Subject and inflected verb forms must be inverted after preposing of elements.)

Stage X+4- Verb-end (V-END)
er sagte, dass er nach hause kommt //// he said that he home comes
(In subordinate clauses, the finite verb moves to final position.)
Learners did not abandon one interlanguage rule for the next as they progressed; they added new ones while retaining the old, and thus the presence of one rule implies the presence of earlier rules.

A few words about the evidence. There is the issue of what it means to say that a structure has been acquired, and I’ll just mention three objections that have been raised. In the L1 acquisition of morphemes, a structure was assumed to be acquired when it occurred three times in a row in an obligatory context at a rate of 90%. The problem with such a measurement is, first, how one defines an “obligatory” context, and second, that by dealing only with obligatory contexts, it fails to look at how the morphemes might occur in incorrect contexts. The second objection is that Pienemann takes acquisition of a structure to be the point at which it emerges in the interlanguage, its first “non-imitative use”, which many say is hard to operationalise. A third objection concerns work reported by Johnson, in which statistical measures are used with an experimental group of L2 learners and a control group of native speakers: the performance of both groups is measured, and if the L2 group’s performance is not significantly different from the control group’s, the L2 group can be said to have acquired the structure under examination. Again, one might well question this measure.
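The first objection can be made concrete. Below is a minimal sketch (all counts invented) of the 90% suppliance-in-obligatory-contexts criterion, alongside a target-like-use score of the kind Pica proposed, which also counts suppliance in non-obligatory contexts:

```python
# Illustrative sketch of the acquisition criteria discussed above.
# All counts are invented; "soc" = suppliance in obligatory contexts.

def soc(correct, obligatory):
    """Proportion of obligatory contexts in which the morpheme was supplied."""
    return correct / obligatory

def tlu(correct, obligatory, oversupplied):
    """Target-like use: also counts suppliance in non-obligatory contexts,
    which the plain 90% criterion ignores (the first objection above)."""
    return correct / (obligatory + oversupplied)

correct, obligatory, oversupplied = 27, 30, 10
print(soc(correct, obligatory))                # 0.9   "acquired" by the 90% cut-off
print(tlu(correct, obligatory, oversupplied))  # 0.675 well short of it
```

The same learner data can thus pass one criterion and fail the other, which is exactly why the choice of measure matters.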


To return to developmental sequences, by the end of the nineties, the acquisition of morphemes and of various structures – negation, questions, word order, embedded clauses and pronouns being the most important areas – had been sufficiently studied to lend very persuasive support to the view that L2 learners follow a fairly rigid developmental route. Moreover, it was seen that this developmental route sometimes bore little resemblance to either the L1 of the learner, or the L2 being learnt. For example, Hernández-Chávez (1972) showed that although the plural is realised in almost exactly the same way in Spanish and in English, Spanish children learning English still went through a phase of omitting plural marking. It had been assumed prior to this that second language learners’ productions were a mixture of both L1 and L2, with the L1 either helping or hindering the process depending on whether structures are similar or different in the two languages. This was clearly shown not to be the case. All of which was taken to suggest that SLA involves the development of interlanguages in learners, and that these interlanguages are linguistic systems in their own right, with their own sets of rules.

The big question, of course, is: what do we make of this evidence? To take a well-known example, Krashen’s Natural Order Hypothesis states that

“The rules of language are acquired in a predictable way, some rules coming early and others late. The order is not determined solely by formal simplicity, and it is independent of the order in which rules are taught in language classes”.

Such a claim implies that classroom ELT should give L2 learners, regardless of age, the same kind of environment that children enjoy when they acquire their L1. You learn the L2 the same way as you learn the L1 – by picking it up through use. It follows from the Natural Order Hypothesis that most learning is implicit and that teachers are wasting their time when they base their instruction on grammar teaching and/or the use of coursebooks which sequence material from simple to complex grammar rules. Pienemann’s Teachability Hypothesis makes the constraint explicit:

“An L2 structure can be learnt from instruction only if the learner’s interlanguage is close to the point when this structure is acquired in a natural setting”.

In other words, there’s no point teaching the third person present form to early learners since they’re not ready to acquire it. Or, as we saw above, in Doughty and Long’s words

“The idea that what you teach is what they learn, and when you teach it is when they learn it, is not just simplistic, but wrong”.


“Well, that’s great!”, teachers might say, “You’ve just done us out of a job!”. They have a point. While all this says what teachers shouldn’t do, it’s not very good when it comes to helpful classroom teaching tips. We still know very little about the stages that L2 learners are said to go through, and, concerning the bits we do know something about, all those working on acquisition sequences agree that the sequences are neither linear nor uniform. The dynamic nature of SLA means that differentiating between different stages is difficult, the stages overlap, and there are variations within stages. Thus the simplistic view of a “Natural Order” where a learner starts from Structure 1 and reaches, let’s say, Structure 549 is absurd. Imagine trying to organise stages such as those identified by Pienemann into ordered sets. As Gregg (1984) points out:

“If the structures of English are divided into varying numbers of ordered sets, the number of sets varying according to the individual, then it makes little sense to talk about a ‘natural order’. If the number of sets varies from individual to individual, then the membership of any given set will also vary, which makes it very difficult to compare individuals, especially since the content of these sets is virtually completely unknown”.

So the “fact” that it is impossible to alter the order of acquisition in certain studied cases, or to make learners skip one of these stages, while implying that “teachability is constrained by learnability”, doesn’t actually help teachers much. In syllabus design, for example, we can’t design a syllabus which coincides with the natural order because nobody knows what this natural order is.

Doughty and Long say that the only way one can respect the learners’ internal syllabus and their developmental sequence is “by employing an analytic, not synthetic, syllabus, thereby avoiding futile attempts to impose an external linguistic syllabus on learners (e.g., the third conditional because it is the third Wednesday in November), and instead, providing input that is at least roughly tuned to learners’ current processing capacity by virtue of having been negotiated by them during collaborative work on pedagogic tasks”. It seems, then, that in order to teach effectively, we need to concentrate more on facilitating implicit learning than on explicit teaching, give more carefully tuned input, and abandon synthetic syllabuses in favour of analytic ones. But how much more implicit learning is required, and how little explicit teaching? How do we tune input? And what is the difference between analytic and synthetic syllabuses? All this, plus what Swan really said to Widdowson, in Part 2.

For References, see the list at the end of Doughty and Long (2003) “Optimal Psycholinguistic Environments for Distance Foreign Language Learning” here: http://llt.msu.edu/vol7num3/doughty/default.html

Task-Based Language Teaching

I’ve recently suggested that one of the weaknesses of Michael Lewis’ “Lexical Approach” is that it provides no clear account of how it should be implemented in a syllabus. As a kind of antidote, I here present a very brief summary of an article by Doughty and Long (2003). The article outlines how their “Task-Based Language Teaching” (TBLT) approach might be used for distance learning, but I’ve ignored that part of the article and concentrated on the methodological principles which inform TBLT. I should explain that this post was triggered by watching Mike Griffin interview Scott Thornbury at the KOTESOL conference last week. Scott mentioned Mike Long’s plenary, where Long talked about the 10 methodological principles which lie behind TBLT, and to which Scott reckons Dogme is also faithful. I agree that Dogme does broadly follow TBLT principles, most importantly perhaps in being process-driven and in respecting “Learner Syllabuses”. So, below, the summary.

Doughty and Long start by saying that Task-Based Language Teaching constitutes a coherent, theoretically motivated approach to all six components of the design, implementation, and evaluation of a genuinely task-based language teaching program: (a) needs and means analysis, (b) syllabus design, (c) materials design, (d) methodology and pedagogy, (e) testing, and (f) evaluation. I should say that I haven’t gone into all six: needs analysis, for example, which is absolutely vital to TBLT, is not discussed. A distinction is made in TBLT between methodological principles (MPs) and pedagogic procedures. Methodological principles (MPs) are putatively universally desirable instructional design features, motivated by theory and research findings in SLA, educational psychology, and elsewhere. Whereas these MPs are language teaching universals, pedagogic procedures comprise the potentially infinite range of local options for realizing the principles at the classroom level. By way of illustration, Doughty and Long cite MP7: “Provide negative feedback,” which has the status of a methodological principle in TBLT. But how that feedback is best provided in any particular classroom is a matter of local circumstance. Options range from overt and explicit procedures (e.g., use of a rule or explanation delivered in oral, manual, or written mode, in the L1 or L2, or repetition of the correct response), through less intrusive ones (e.g., teacher “clarification requests”) to covert and implicit ones (e.g., manipulation of input frequency to increase perceptual salience, or the use of corrective recasts, of which students, and even teachers themselves, may sometimes barely be aware).

MP1: Use Task, Not Text, as the Unit of Analysis

Doughty and Long make clear that the focus in TBLT lessons is on task completion, not study of a decontextualized linguistic structure or a list of vocabulary items, or a text. Spoken or written texts, they insist, are static records of someone else’s task accomplishment. Building lessons around texts means studying “language as object”, not learning language as a living entity through using it and experiencing its use during task completion. Learners need to learn how to do a task themselves. Doughty and Long contrast two classroom activities: learning to make a particular kind of social, business, or emergency medical telephone call through acting one out, as in a role play and/or making a real one to given specifications, on the one hand, and on the other, in a text-based activity where the learners listen to or read a “dead” script of someone else’s effort.

One problem which arises when task is selected as the unit of analysis is the sequencing of course material. This problem is hardly even addressed in most materials, being left to “some intuition-based and question-begging notion of linguistic complexity (e.g., teach the “simplest” structures first)”. In contrast, Doughty and Long believe that “the ultimate solution, which is an important component of TBLT, will lie in the development of series of pedagogic tasks sequenced in terms of (inherent, unchanging, and objectively measurable) task complexity, with task difficulty (which varies for specific learners according to such factors as their L2 proficiency) modifiable as needed by alterations to task conditions (the circumstances under which the tasks are carried out). By working through the series of pedagogic tasks, learners can build up the abilities needed eventually to perform the target tasks identified by the learner needs analysis at the levels required”.
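The distinction between complexity and difficulty can be sketched in a few lines of code. Everything here is my own invention for illustration (the task names, the integer complexity scale, and the condition parameters), not anything proposed in the article:

```python
from dataclasses import dataclass

@dataclass
class PedagogicTask:
    name: str
    complexity: int   # inherent and fixed (assumption: a simple integer scale)
    conditions: dict  # circumstances of performance, modifiable per learner

def sequence_tasks(tasks):
    """Order tasks by inherent complexity; the ordering is the same for everyone."""
    return sorted(tasks, key=lambda t: t.complexity)

def adjust_difficulty(task, proficiency):
    """Modify task conditions, never the complexity ranking, for a given learner."""
    conditions = dict(task.conditions)
    if proficiency == "low":
        conditions.update(planning_time_secs=120, interlocutor="teacher")
    return PedagogicTask(task.name, task.complexity, conditions)

tasks = [
    PedagogicTask("book a hotel room", 3, {"planning_time_secs": 30}),
    PedagogicTask("ask for directions", 1, {"planning_time_secs": 30}),
    PedagogicTask("negotiate a refund", 5, {"planning_time_secs": 30}),
]
for task in sequence_tasks(tasks):
    print(adjust_difficulty(task, "low").name)
```

The point of the sketch is the separation: complexity fixes the sequence, while difficulty is handled by changing the conditions under which each task is performed, not the order.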

MP2: Promote Learning by Doing

The basic idea is that practical hands-on experience with real-world tasks brings abstract concepts and theories to life and makes them more understandable. New knowledge is better integrated into long-term memory and more easily retrieved if tied to real-world events and activities. TBLT is an example of learning by doing at several levels. It aims to equip learners to meet their present or future real-world communicative needs, as identified through a task-based learner needs analysis, the first step in course design. Then, inside the classroom, instead of studying the new language as object in order to use it to communicate at some later date, students learn language through doing pedagogic tasks. Pedagogic tasks combine language learning and action at various levels. Almost all pedagogic tasks have a hands-on, problem-solving quality designed to arouse learners’ interest and hold their attention.

MP3: Elaborate Input

Doughty and Long insist that both genuine and simplified texts are psycholinguistically inappropriate for learners. Genuine (popularly known as “authentic”) texts, originally written by and for native speakers, are usually too complex for all but very advanced learners. As a result, they typically require explicit metalinguistic study to render them comprehensible, which leads, in turn, to the study of language as object rather than the development of a functional ability to use language. The traditional language teaching alternative, simplified texts, is unnatural and unrealistic in its tendency to be self-contained, with little or none of the usual implicitness, open-endedness, and intertextuality that characterize authentic discourse. Also, while simplified texts are (in most cases) easier to understand than genuine texts, the improved comprehensibility comes at the cost of much of their value for language learning. How are learners to acquire items that have been removed from the input, and how are they to learn real NS use of new items if presented with something far poorer and unrepresentative?

The alternative to genuine and simplified texts recommended is elaborated input. “Elaboration is the term given to the myriad ways NSs modify discourse (i.e., in language use for non-native speakers [NNSs] to make meaning comprehensible, as revealed by studies of foreigner talk discourse). Most of the modifications occur during negotiation for meaning (i.e., when NS and NNS are focused on achieving communication while working cooperatively on a task). They include partial and complete, exact and semantic, self- and other-repetition; confirmation checks, comprehension checks, and clarification requests; rearrangement of utterances so that order of events and order of mention are iconic; paraphrase; lexical switch; decomposition; a preference for intonation and yes/no questions over WH questions; use of redundancy of various kinds; and many other “scaffolding” devices”. Elaborated input can be provided in advance (e.g., in the pre-scripted materials sources for pedagogic tasks), but also occurs naturally in teacher speech and in learner-learner discourse, as long as participants are focused on task completion and, therefore, on communication.

MP4: Provide Rich Input

Linguistically simplified input, which goes hand in hand with synthetic (especially structural, or grammatical) syllabuses, also tends to be impoverished input. Controlling grammar, vocabulary and sentence length results, intentionally and by definition, in a more limited source of target-language use upon which learners must rely in order to learn the code. The often tiny samples are worked and reworked in class, whether practiced until rote-memorized, milked meta-linguistically, or both, and learners are expected to learn the full language on the basis of access to such limited data. Elaborated texts go a long way towards remedying the situation. They alone are insufficient, however. Adult foreign language learners require not just linguistically complex input, but rich input (i.e., realistic samples of discourse use surrounding NS and NS-NNS accomplishment of target tasks).

This will usually mean task-specific and domain-specific target-language use not typically found in commercially published language teaching materials, not even those allegedly designed for language-for-specific-purposes programs. Commercial materials writers and publishers generally aim for the least context-, domain-, and task-specific texts possible, in order to boost the potential market for a book. This is the opposite of what is needed, especially if advanced, functional proficiency is the goal. Numerous studies have shown large discrepancies between the models presented in “general” textbooks and genuine NS use on real tasks in particular domains, even when those domains are relatively ordinary and “non-technical”. Learners need (a) elaborated texts, (b) plenty of them, (c) texts derived from a far greater range of target tasks and discourse domains than is currently typical in commercial language teaching materials, and most important of all, (d) texts motivated by tasks of the specific kinds a needs analysis has shown to be relevant. The examples will usually need to be based upon “field work” of various sorts by course designers (e.g., in situ audio or video recordings of NSs performing target tasks, and the gathering of authentic written documents relevant to those target tasks). Rich input, in sum, is not just a matter of linguistic complexity, but of quality, quantity, variety, genuineness, and relevance.

MP5: Encourage Inductive (“Chunk”) Learning

“If adult foreign language learners are to sound like natives, they need to be exposed to realistic (genuine or elaborated) samples of target language use, for example, as input components of pedagogic tasks, and then helped to incorporate, store and retrieve whole chunks of that input as whole chunks”.

I suggest reading the article to get the authors’ point here, but basically they propose that adults will abstract the language chunks that they need during the course of learning to perform the task at hand. They cite various studies on implicit learning of complex systems (Berry, 1997; Berry & Dienes, 1993; and Stadler & Frensch, 1998) where subjects are given input values and are told to arrive at particular output values by attempting to manage the system through trial and error. They are provided no information whatsoever concerning the underlying structure of the system, but are usually given feedback as to the effect of their input to the system. Some examples of complex systems often cited are the management of a sugar factory (variables are workers and amount of production), city traffic flow management (variables are bus schedules and parking lot fees), and interaction with a computer “personality” (where the computer person’s mood is dependent upon the input from the subject). The basic and consistent finding of this research is that subjects become highly skilled at managing complex systems long before they are able to explain the rules underlying those systems. Given enough time, they can be made to verbalize the rules that guide their own performance, but the ability to express the rules always develops after that, and, crucially, is not necessary for the improvements in performance evidenced in doing the tasks (see Doughty, 2003, for further discussion). “To date most published materials promote the explicit analysis of foreign languages. This approach results in declarative knowledge when what is needed is the development of language ability that is deployable during spontaneous interaction”.

MP6: Focus on Form

This is Long’s best-known contribution to pedagogy. Given that research has shown that a focus on meaning alone is insufficient to achieve full native-like competence (after as much as 12 years of classroom immersion, Canadian French immersion students’ productive skills remain far from native-like, particularly with respect to grammatical competence (Lapkin, Hart, & Swain, 1991)), we can conclude that comprehensible L2 input is necessary, but not sufficient. “A focus on meaning, moreover, can be improved upon, in terms of both rate and ultimate attainment, by periodic attention to language as object (Long, 1988). This is best achieved not by a return to discrete-point grammar teaching, or focus on forms, where students spend much of their time working on isolated linguistic structures in a sequence predetermined externally and imposed on them by a syllabus designer or textbook writer, in conflict with the learner’s internal syllabus. Rather, during an otherwise meaning-focused lesson, and using a variety of pedagogic procedures, learners’ attention is briefly shifted to linguistic code features, in context, to induce “noticing” (Schmidt, 1990, and elsewhere), when students experience problems as they work on communicative tasks (i.e., in a sequence determined by their own internal syllabuses, current processing capacity, and learnability constraints). This is called focus on form (Doughty & Williams, 1998a; Long, 1988, 1991, 1997, 2000a; Long & Robinson, 1998)”.

Examples of focus-on-form techniques, ranging from less to more explicit, include (a) input flood, where texts are saturated with L2 models; (b) input elaboration, as described in MP3; (c) input enhancement, where learner attention is drawn to the target through visual highlighting or auditory stress; (d) corrective feedback on error, such as recasting; and (e) input processing, where learners are given practice in using L2 rather than L1 cues. The most difficult practical aspect of focus on form is that, to be psycholinguistically relevant, it should be employed only when a learner need arises, thus presenting a difficulty for the novice teacher, who may not have relevant materials to provide. Where face-to-face interaction is the norm, as in L2 classrooms, recasting is an obvious potential pedagogical procedure. Once an L2 problem has been diagnosed for a learner, then pedagogical procedures may be decided upon and materials developed for use when the need next arises.
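Of the techniques listed, option (c), input enhancement, is the easiest to mock up. Here is a toy sketch; the passage, the choice of target forms, and the use of **bold** markers are my own invented example, not a procedure from the article:

```python
import re

def enhance(text, target_pattern):
    """Wrap every match of target_pattern in **...** to raise its visual salience."""
    return re.sub(target_pattern, lambda m: f"**{m.group(0)}**", text)

# Highlight two third-person singular verb forms in a short passage.
passage = "She walks to work. He watches films. They walk together."
print(enhance(passage, r"\bwalks\b|\bwatches\b"))
# She **walks** to work. He **watches** films. They walk together.
```

In a printed text the markers would become bold or underlined type; the psycholinguistic bet, as the paragraph above notes, is that the extra salience induces noticing without interrupting meaning-focused work.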

MP7: Provide Negative Feedback

While argument persists as to the necessity of negative evidence in language learning, recent work on both traditional explicit teacher “error correction” and implicit negative feedback in the form of corrective recasts (see, e.g., DeKeyser, 1993; Long, 2004) suggests strongly that negative feedback can be facilitative, at the very least, with certain classes of L2 structures.

Since the value of negative feedback lies in drawing learner attention to some problematic aspect of their interlanguage (i.e., inducing “noticing,” Schmidt, 2001), then the timing of that feedback is critical. Where corrective recasts are concerned, the information must be provided within some as-yet-little-understood cognitive processing window (for instance, but not necessarily, in working memory), such that learners can make some sort of comparison between the information provided in the feedback and their own preceding utterance (Doughty, 2001a). Recasts are proposed as an ideal (but not the only) form of negative feedback in TBLT for some classes of grammatical and lexical problems, at least, because they are not intrusive on the processing of meaning during task accomplishment and do not depend upon metalinguistic discussion of a language problem. Recasts are pervasive in child-adult discourse and in L2 classroom discourse. The psycholinguistic mechanism by which they are believed to work depends upon the juxtaposition of the learner utterance and the recast. It is claimed that learners have sufficient working memory to hold both utterances, thereby enabling the comparison to take place.

MP8: Respect Developmental Processes and “Learner Syllabuses”

Doughty and Long point to the strong evidence for various kinds of developmental sequences and stages in interlanguage development, such as the well-known four-stage sequence for ESL negation (Pica, 1983; Schumann, 1979), the six-stage sequence for English relative clauses (Doughty, 1991; Eckman, Bell, & Nelson, 1988; Gass, 1982), and sequences in many other grammatical domains in a variety of L2s (Johnston, 1985, 1997). The sequences are impervious to instruction, in the sense that it is impossible to alter stage order or to make learners skip stages altogether (e.g., R. Ellis, 1989; Lightbown, 1983). Acquisition sequences do not reflect instructional sequences, and teachability is constrained by learnability (Pienemann, 1984). The idea that what you teach is what they learn, and when you teach it is when they learn it, is not just simplistic, but wrong.

Equally well attested are the beneficial effects of instruction in such areas as accelerating passage through the sequences and extending the scope of application of grammatical rules (Pienemann, 1989), in dealing with areas of the L2 grammar supposedly unlearnable from positive evidence alone (White, 1991), and in generally improving accuracy, rate of learning, and level of ultimate attainment (Doughty, 2003; Long, 1988). The question, then, is how to harmonize instruction with the learner’s internal syllabus, with so-called “natural” developmental processes. TBLT does this in a variety of ways, first and foremost by employing an analytic, not synthetic, syllabus, thereby avoiding futile attempts to impose an external linguistic syllabus on learners (e.g., the third conditional because it is the third Wednesday in November), and instead providing input that is at least roughly tuned to learners’ current processing capacity by virtue of having been negotiated by them during collaborative work on pedagogic tasks. The learner syllabus is also respected through use of (by definition, reactive) focus on form and a preference for recasts where the results are comparable with more overt forms of “error correction”, as their use implies learner direction of at least some classroom communication. In other words, TBLT is radically learner-centered, not only in that course content is determined by student needs, but also in this psycholinguistic sense. Universal developmental processes and the learner’s internal syllabus are clearly and consciously allowed to guide and mediate instruction.

MP9: Promote Co-Operative/Collaborative Learning

Research findings in both child L1A (Ochs & Schieffelin, 1979) and child and adult L2A (Gass, 2003; Hatch, 1978; Long, 1983) reveal a facilitative role in language development for collaborative, “scaffolded” discourse across utterances and speakers. Research in general education (e.g., Barnes, 1976; Holt, 1993; Webb, 1991) has documented the positive effects of co-operative, collaborative group work on attainment. Research on cooperative learning and small group work in second language learning provides similar findings (Jacobs, 1998; Liang, Mohan, & Early, 1998; Long & Porter, 1985; Oxford, 1997; Pica et al., 1996).

MP10: Individualize Instruction

Work by numerous scholars in general education and in foreign language classrooms has long shown the benefits of tailoring instruction to cater to individual differences in goals, interests, motivation, cognitive style, and learning strategies (Altman & James, 1980; Harlow, 1987; Logan, 1973; Sawyer & Ranta, 2001; Wesche, 1981). Improvements in the measurement of these and other individual difference variables, such as language learning aptitude and short-term memory (see, e.g., Ehrman & Leaver, 2001; N. Ellis, 2001; Grigorenko, Sternberg, & Ehrman, 2000; Miyake & Friedman, 2001), further justify the individualization of instruction in any language teaching program. In TBLT, individualization occurs in the selection of syllabus content, in respect for individual internal syllabuses, and in modifications of the pace at which and manner in which instruction is delivered, as suggested by diagnostic information gathered on individual differences.


I hope this has given those who only got Mike Long’s quick run through TBLT in his KOTESOL plenary a bit of additional information. I hope it also indicates the areas which any well-developed approach to ELT should cover. Whatever one thinks about Doughty and Long’s account, one must, I think, acknowledge that it is well-argued, thorough, and well-supported by research findings. It is also, of course, very critical of synthetic, product-orientated syllabuses and the materials they use, making it clear that such methods and materials fly in the face of research findings.

Please refer to the original article (link in red at start of this post) for references.

Lexical Priming and the Competition Model

This is an attempt to answer questions NickW 211 asked in a comment on my previous post, “Hugh Dellar and Lexical Priming Part 2”. First, thanks Nick for taking the time to share your thoughts.

Question 1: You have criticized Lexical Priming, saying that “the problem is that Hoey nowhere operationalises his term “noticing” in any way that allows us to test the theory.” Do you happen to know if this same criticism of Hoey also applies to the theories of others with broadly similar theories of language? e.g. Sinclair or Hunston and Francis?

My Answer: No, I don’t think it does. None of them, as far as I know, claims that sub-conscious noticing is the key construct in language learning.

Question 2: You’ve also noted that Chomsky has successfully argued that language use is “stimulus independent” and “historically unbound”. How has Chomsky tested these two notions of stimulus independence and historical unboundedness? I was under the impression that these had not been tested, or at least that, if they had, it was under highly specialized conditions that bear little or no relation to language as it is used by 99.9% of speakers.

My Answer: Skinner’s (1957) theory claims that “Language is stimulus dependent”. All the examples of language use which are not stimulus dependent therefore combine to form a convincing body of evidence that Skinner’s theory is false. Similarly, the claim that language is historically bound is refuted by examples of new, creative language use. So Chomsky’s claims that language use is “stimulus independent” and “historically unbound”, are part of his (1959) refutation of Skinner’s theory of language. In the 1950s, Skinner’s behaviourism was the paradigm theory of learning not only language but everything else, and claimed that any instance of human behaviour can be explained as a response to a stimulus. This view is based on a strict empiricist epistemology which regards all talk of mental states as so much unscientific mumbo jumbo. Chomsky was responsible for the fastest and most dramatic paradigm shift of his time, ushering in the new “mentalist” or “nativist” paradigm for linguistics, and cognitive science for psychology.

In a way, you’re right to say that Chomsky’s theory has nothing to do with the language used by 99.9% of speakers. Chomsky’s model of language distinguishes between competence and performance, between language knowledge and the use of language, influenced as the latter is by limits in the availability of computational resources, stress, tiredness, alcohol, etc. Chomsky says he’s concerned with “the rules that specify the well-formed strings of minimal syntactically functioning units” and with “an ideal speaker-listener, in a completely homogenous speech-community, who knows his language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of the language in actual performance” (Chomsky, 1965: 3).

Only by dealing with the abstract knowledge he so carefully defines can Chomsky make the claims he does for his theory of UG, and this is its great strength as a theory. To put it another way, Chomsky is interested in I-Language, not E-Language. I-Language refers to internalised language, the linguistic knowledge in the mind of the speaker. E-Language refers to linguistic output: shouts, poems, sentences, songs, texts of every description. The important thing is that, in Chomsky’s view, E-Language is epiphenomenal; it is the result of I-Language. Chomsky sees I-Language as the phenomenon and E-Language as performance data. All good theories are based on a careful distinction between phenomena and data. When you look at Hoey’s theory, it tries to base a case on patterns detected in corpora – the bigger the better (i.e., the more raw data, the better). Patterns are the phenomena which we notice in the data, but what’s the explanation for these patterns?


Question 3: How does Chomsky then account for the fact that, while it may be possible to “say things that we have not been trained to say and that we have never heard anybody else say”, we nevertheless only seldom do so?

My Answer: I don’t think it’s the case that we rarely say things we have not been trained to say and that we have never heard anybody else say; I think it happens all the time. I agree that novel utterances can be seen as reformulations, hybrids, syncretisms of previous utterances, and that these novel utterances will contain patterns of text of the type Hoey goes on about. But the list of variations on the patterns is literally countless and as an explanation of language use, the construct of “lexical chunks” amounts to very little indeed. I think Hoey is right to say that the grammatical category we assign to a word is “a convenient label we give the combination of (some of) the word’s most characteristic and genre-independent primings”, but he seems to miss the point of the categories, which is to act as organising principles. If we abandon these categories we’re left with a mess: “a cluster of other primings”!

You say Hoey’s theory of Lexical Priming seems to point toward an explanation of why each of the following ‘novel’ utterances would be more or less acceptable and appear to carry some kind of meaning (even if it is one we can’t quite access):

Colorless green ideas sleep furiously.

a) The greening of colorlessly furious sleep ideas.
b) To colorless ideas with furious sleeping.
c) Furiously greening colorless idea sleeps.

I’m afraid I think these examples, far from supporting Hoey’s theory of Lexical Priming, actually give strong support to Chomsky’s claim about grammaticality judgements!

To return to lexical priming, it isn’t clear to me how sub-consciously noticing lexical patterns in input “explains” anything. Unlike Chomsky’s theory, which says that we are hard-wired with knowledge about the principles underlying language, so that learning a particular language is a question of setting parameters as a result of exposure to it and then subsequently acquiring more lexis (including lexical chunks), Hoey limits himself to describing a small selection of the countless connections we make between words in terms of collocation, semantic association, pragmatic association, colligation, etc.; saying that repeated exposure to naturally occurring data is the sufficient condition for language learning; and claiming that what lexical patterns you acquire depends on the frequency with which you are exposed to them.

You suggest that the child, as it learns language, is able to transfer the knowledge of primings for the words it does know onto those that it does not, or that it only knows partially. The syntax is not there from the outset but emerges as the result of the growth in sophistication that comes as more and more primings cluster about the ‘word’ – which here seems to be a shorthand for a place-marker over a ‘space’ in the linguistic system. So, instead of Chomsky’s claim that the child starts out with a knowledge of certain underlying principles of grammar which are common to all languages, we substitute the claim that the child starts with “a capacity for the acquisition of primings”. The problem is that it doesn’t make sense to say that we acquire primings. Priming means something like “readied by our prior experience”, which is a mental state, not something which is acquired.
Lexical priming means being readied by our prior experience of words to expect words to be in the company of other words (their collocations); to appear in certain grammatical situations (colligations); to be in certain positions in text and discourse (textual colligations), and so on. In my opinion Hoey neither describes the competence acquired nor explains how it is acquired. The latter question is obviously the key to an explanation, and Hoey has to either explain what the mechanism is or revert to an empiricist framework where prior experience is seen as a sufficient explanation and any black box is dispensed with. To date, there is no convincing explanation of how the language Hoey describes is acquired, and no answer is provided to the fundamental question of the poverty of the stimulus.

Question 4: Is there any possibility the order of acquisition of grammatical functors may be influenced by learners’ beliefs about what a language is and, therefore, how to learn it?

My Answer: Early work on error analysis, followed by two phases of morpheme studies, together with work on studies of acquisition order, indicates that there’s a more or less fixed order in the acquisition of certain aspects of English as an L2. That this order is influenced by learners’ beliefs and attitudes towards the target language is certainly possible, but, first, it’s been very difficult for researchers to present a construct of a belief or an attitude whose effects can be clearly measured, and second, these beliefs and attitudes are claimed to influence rate but not route. The evidence suggests that believing that “accuracy is more important than fluency”, for example, might affect how quickly you become accurate or fluent, but it doesn’t affect the order of acquisition of certain aspects of the language.

Question 5: Would you be prepared to concede that there might be several plausible alternative explanations for the order of L2 acquisition other than one purely, or at least mainly, related to cognitive processes – which as far as I know are unobservable to researchers?

My Answer: First, cognitive processes are not entirely unobservable these days, but it’s certainly the case that “interlanguage” is an entirely unobservable theoretical construct, and none the worse for that, IMHO. Like gravity, interlanguage is posited to exist in order to explain something we want to understand. But, yes, of course I accept that other explanations are possible. It’s just that if one subjects all the current candidate theories to critical examination, I think some version of a processing theory – which sees SLA as a process by which attention-demanding, controlled processes become more automatic through practice, resulting in the restructuring of the existing mental representation – is the strongest theory to date. Note that such a processing theory relies to some extent on the acquisition of grammatical knowledge, although it can easily cope with the suggestion that what is being acquired is not Chomsky’s linguistic competence, but rather communicative competence, which, as I suggested in the previous post, may be something like Widdowson’s description of it as “a matter of knowing a stock of partially pre-assembled patterns, formulaic frameworks, and a kit of rules, so to speak, and being able to apply the rules to make whatever adjustments are necessary according to contextual demands. Communicative competence is a matter of adaption, and rules are not generative but regulative and subservient”.

The Competition Model

I think you might be interested in the Competition Model, which incorporates some of the ideas you touch on, and fleshes out many missing parts of Hoey’s theory.

Bates and MacWhinney’s Competition Model, first outlined in 1982, challenges the two fundamental bases on which processing theories rest: innateness, and a formalist approach to language. In contrast to Chomsky’s Principles and Parameters model, the Competition Model sees language learning as non-modular and non-specific, i.e. it results from the same kinds of cognitive mechanisms as those involved in other kinds of learning. Also in contrast to Chomsky, Bates and MacWhinney do not separate the linguistic form of language from its function; they argue that the two are inseparable. As a result of their rejection of both innateness and formalism, the third difference between the Competition Model and Chomsky’s theory of UG is that while Chomsky offers a theory of competence, Bates and MacWhinney offer a theory of performance. The Competition Model is concerned with how language is used, and while it is certainly true that this is also the main interest for other psycholinguistic approaches to SLA, the difference is that instead of adopting the formalist approach to language as a given, the Competition Model, by adopting a particular version of the functional approach to linguistics, considers language to be constructed through use.

MacWhinney (1997: 114) explains that the Competition Model makes a commitment to four major theoretical issues. These are:

(i) Lexical Functionalism. Functionalism claims that the forms of language are determined by the communicative functions they perform; language is a set of mappings between forms and functions. “Forms are the external phonological and word order patterns that are used in words and syntactic constructions. Functions are the communicative intentions or meanings that underlie language usage” (MacWhinney, 1997:115).

(ii) Connectionism. The Competition Model uses connectionist models to model the interactions between lexical mappings. Connectionism rejects the assumption made by nativists that the brain is a symbol-processing device similar to a digital computer, and argues that the brain relies on a type of computation that emphasises patterns of connectivity and activation. MacWhinney, in keeping with the empiricist approach he adopts, uses evidence from studies in the field of cognitive neuroscience to help build his model. “The human brain is basically a huge collection of neurons. These neurons are connected through axons. When a neuron fires, it passes activation or inhibition along these axons and across synapses to all the other neurons with which it is connected. This passing of information occurs in an all-or-none fashion. There is no method for passing symbols down axons and across synapses. Brain waves cannot be used to transmit abstract objects such as phrase structures. Rather, it appears that the brain relies on a type of computation that emphasizes patterns of connectivity and activation. Models based on this type of computation are called ‘connectionist’ models… A fundamental feature of these models is that they view mental processing in terms of interaction and connection, rather than strict modularity and separation. Although connectionist models often postulate some types of modules, they tend to view these modules as emergent and permeable (MacWhinney, 1998), rather than innate and encapsulated (Fodor, 1983)” (MacWhinney, 2001: 80).

(iii) Input-driven Learning. Language learning can be explained in terms of input rather than innate principles and parameters. Cue validity is the key construct in this explanation. “The basic claim of the Competition Model is that the system of form-function mappings embodied in language-processing networks is acquired in accord with a property we will call cue validity… The single most common interpretation of cue validity is in terms of the conditional probability that an event X will occur given a cue Y, that is p(X|Y). If this probability is high, then Y is a good cue to X. The most straightforward prediction from this initial analysis is that forms with a high conditional probability should be acquired early and be the strongest determinants of processing in adults” (MacWhinney, 1997: 121). MacWhinney adds in a later paper that “the most basic” determinant of cue strength is task frequency, while “the most important and most basic cue validity dimension is the dimension of reliability. A cue is reliable if it leads to the right functional choice whenever it is present” (MacWhinney, 2001: 75).
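MacWhinney’s definition of cue validity as a conditional probability can be illustrated with a toy computation. The mini-corpus, cue names, and figures below are all invented for illustration; this is a sketch of the idea, not of the Competition Model’s actual machinery:

```python
# Toy illustration of cue validity as conditional probability p(X|Y):
# how often does interpretation X hold, given that cue Y is present?
# The mini-corpus and cue names are invented for illustration only.

# Each item: (cues present in the sentence, whether the first noun is the agent)
mini_corpus = [
    ({"preverbal_position", "agreement"}, True),
    ({"preverbal_position"}, True),
    ({"preverbal_position", "animacy"}, True),
    ({"agreement"}, True),
    ({"preverbal_position"}, False),  # e.g. a topicalised object
    ({"animacy"}, False),
]

def cue_validity(cue, corpus):
    """p(first noun is agent | cue present) -- MacWhinney's 'reliability'."""
    outcomes = [agent for cues, agent in corpus if cue in cues]
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

for cue in ("agreement", "preverbal_position", "animacy"):
    print(cue, cue_validity(cue, mini_corpus))
# agreement is a fully reliable cue here (1.0); preverbal position is
# strong but imperfect (0.75); animacy is at chance (0.5).
```

On the model’s prediction, a learner exposed to this distribution should come to weight agreement more heavily than animacy, since it more reliably leads to the right functional choice.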

(iv) Capacity. Short-term verbal memory has limited capacity and the use of language in real time is continually subject to these limitations. “The Competition Model focuses on the role of underlying conceptual interpretation in determining the utilization of processing capacity” (MacWhinney, 1997: 115). “Although our results for online processing are still far from complete, we now have the outlines of a Competition Model approach to real-time sentence processing. This account treats sentence interpretation as a constraint satisfaction process that balances the limitations imposed by verbal memory against the requirements of conceptual interpretation. Our raw memory for strings of nonsense words is not more than about four. However, when words come in meaningful groups, we can remember dozens of words, even when the message is unfamiliar. The most likely candidate for this additional storage is some form of conceptual representation. We… claim that words are quickly converted into integrated conceptual representations through a process of structure building (Gernsbacher, 1990). This process begins with the identification of a starting point (MacWhinney, 1977), or perspective, from which the entire clause can be interpreted. In English, this is usually the subject” (MacWhinney, 1997: 133).

In brief, the Competition Model argues that language encodes functions like ‘topic’ and ‘agent’ onto surface grammatical conventions in various ways, such as word order and subject-verb agreement. Because of the limits on processing, these functional categories compete for control of the surface grammatical conventions. Speakers of languages use four types of cues – word order, vocabulary, morphology, and intonation – to facilitate their interpretation of these form-function mappings. Because of the principle of limited capacity mentioned above, human languages find different ways of using these cues. A central concept in the Competition Model is that speakers must have a way to determine relationships among elements in a sentence. Language processing involves competition among various cues, each of which contributes to a different resolution in sentence interpretation. Although the range of cues is universal, there is language-specific instantiation of cues, and language-specific “strength” assigned to cues. Another way of putting this is to say that language forms are used for communicative functions, but any one form may realise a number of functions, and any one function can be realised through a number of forms.

In English, for example, word order is very typically SVO in active declarative sentences, and, it is argued, word order is a strong cue for the realisation of many functions. Bates and MacWhinney claim that in Romance languages like Italian and Spanish, however, word order is not so important: they rely more on morphological agreement, semantics and pragmatics. Within a language, the cues often converge to give a clear interpretation of a sentence. In the English sentence “John kicks the ball”, the cues are word order (SVO), knowledge of the lexical items, the animacy criterion (balls do not kick), and subject-verb agreement. But sometimes there is competition among the cues that signal a particular function. For example, in the sentence “That teacher we like a lot”, there is competition between “teacher”, “we” and “lot” for agency. “Lot” can be eliminated because it is inanimate and follows the verb. “We” wins because, although “teacher” is in the optimum position, “we” is in the nominative case and agrees in number with the verb.
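The “That teacher we like a lot” example can be sketched as a weighted vote among the candidate nouns. The cue names and numeric weights below are invented for illustration; in the model itself, weights are derived from cue validity in the input, not assigned by hand:

```python
# Hypothetical cue-competition sketch: each candidate noun gathers
# weighted support from the cues it satisfies; the strongest candidate
# is chosen as agent. Weights here are invented, not empirical values.

CUE_WEIGHTS = {
    "preverbal_position": 0.4,  # "teacher" occupies the optimum position
    "nominative_case": 0.7,
    "verb_agreement": 0.8,
    "animate": 0.3,
}

# Cues satisfied by each noun in "That teacher we like a lot."
candidates = {
    "teacher": {"preverbal_position", "animate"},
    "we": {"nominative_case", "verb_agreement", "animate"},
    "lot": set(),  # inanimate and postverbal: no supporting cues
}

def choose_agent(candidates):
    scores = {noun: sum(CUE_WEIGHTS[c] for c in cues)
              for noun, cues in candidates.items()}
    return max(scores, key=scores.get)

print(choose_agent(candidates))  # "we" outcompetes "teacher"
```

With a different language’s weightings (say, a heavier weight on preverbal position and no agreement cue), the same competition could resolve in favour of “teacher”, which is the point of the cross-linguistic comparisons.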

So far, the discussion holds for both first and second language learning. Turning to SLA, since the connectionist view assumes that all mental processing uses a common interconnected set of cognitive structures, this implies that transfer plays a key role: “the early second language learner should experience a massive amount of transfer from L1 to L2. Because connectionist models place such a strong emphasis on analogy and other types of pattern generalisation, they predict that all aspects of the first language that can possibly transfer to L2 will transfer. This is an extremely strong and highly falsifiable prediction. However, it seems to be in accord with what we currently know about transfer effects in second language learning” (MacWhinney, 1997: 119).

The Competition Model claims that the second language learner begins learning with a parasitic lexicon, a parasitic phonology, and a parasitic set of grammatical constructs. “Over time, the second language grows out of this parasitic status and becomes a full language in its own right” (MacWhinney, 1997: 119). As far as the lexicon is concerned, “this development is explained by the strengthening of direct associations from the L2 phonological form to the underlying referent, and by the restructuring of the meanings of some words. If two words in L1 map onto a single word in L2, the basic transfer process is unimpeded. It is easy for a Spanish speaker to take the L2 English form know and map it onto the meanings underlying saber and conocer (Stockwell, Bowen and Martin, 1965). What is difficult is for the L1 English speaker to acquire this new distinction when learning Spanish. In order to correctly control this distinction, the learner must restructure the concept underlying know into two new, related structures” (MacWhinney, 1997: 120).

In phonology, the L2 learner has to gradually “undo” the inappropriate direct transfer that occurs in the early stages of learning. In grammar, “the weights connecting functions to clusters of forms must be retuned” (MacWhinney, 1997: 120). Sometimes, the L2 requires the learner to make new conceptual distinctions not present in the L1. In order to acquire such a new category, the L2 learner begins by attempting to transfer from the L1, and in case of difficulties the learner is “resigned to picking up the pieces of this new category one by one and restructuring them into a working system” (MacWhinney, 1997: 121).

The second language learner’s task is thus seen as adjusting the internal speech-processing mechanisms from those appropriate to his L1 to those appropriate for the target language. Ellis, in his treatment of the Competition Model, puts it another way – the learner has to discover the particular form-function mappings that characterise the target language. The task facing the L2 learner is to discover (1) which forms are used to realise which functions in the L2, and (2) what weights to attach to the use of individual forms in the performance of specific functions. (Ellis, 1994: 375) Ellis comments that the question is: how does the learner do this? Does he use the same cues and the same weights as in his L1, or different ones? MacWhinney’s 1997 account goes some way to answering that question: the learner does it by massive transfer, and by then making the necessary adjustments on the basis of the input. The end result of this process of restructuring “is the tightening of within-language links in contrast to between-language links. In this way, a certain limited form of emergent linguistic modularity is achieved” (MacWhinney, 1997: 120).

The Competition Model rests on an empiricist view which attempts to do without non-sensory knowledge and to reduce learning to associationism. It’s important to stress that the Competition Model is based on a commitment to empiricism, but not an empiricism that refuses to consider causal explanations and attempts to rid observation of all “theoretical bias”. The empiricism Bates and MacWhinney champion talks of “mental processes” (though, true to the tradition of empiricism, it prefers to treat mental processes as far as possible as “neurological facts”), “conceptual interpretation”, “processing capacity”, “universals of cognitive structure” and indeed “general explanation”. What is most encouraging is that MacWhinney concludes his 1997 paper by saying “The wise reader will take these arguments for an empiricist position with a healthy grain of salt. We all know that the most reasonable and tenable positions on major issues, such as nativism versus empiricism, inevitably rest somewhere in the middle between the two extremes. However, it is often helpful to view the competing positions in their most undiluted form, so that we can navigate between these alternatives, coming always a bit closer to the truth” (MacWhinney, 1997: 137). Amen to that.

The Competition Model is coherent, cohesive, consistent, and its terms are reasonably well-defined. Furthermore, its hypotheses are precise and have a great deal of empirical content – as we would expect! As a result, researchers have been able to carry out many empirical studies that test the hypotheses. The basic test format used in most of the numerous studies of the Competition Model was to present L2 learners whose native language uses cues and cue strengths that differ from those of the L2 with sentences designed to offer competing cues. The learners were asked to say what the subject of the sentence was. The analysis of the results was based on “choice” (which nouns the subjects chose) and “latency” (the time taken to make the choice). The studies found that L2 learners are indeed faced with conflicts between native language and target language cues and cue strengths, and that, to resolve the conflict, they first resort to their L1 processing strategies when interpreting L2 sentences. For example, English learners of Japanese initially made use of rigid word order as a cue: their initial hypothesis was rigid word order. Their next task was to figure out that in Japanese the order is SOV – which they then rigidly applied. On encountering incongruities, learners often resorted to meaning-based cues as opposed to word order, or to morphology-based cues. In general, the studies strongly suggest transfer and indicate that the processing strategies of the L2 learners could be located between the two poles represented by the strategies used by native speakers of the two languages involved.

Unfortunately, the research methodology used in the studies is not without its problems. The task that forms the basis of the tests is extremely artificial. This is not in itself enough to invalidate the research (much of the work done in UG could be similarly criticised), but it does make it difficult to be sure that the analysis of the results is valid. McLaughlin and Harrington (1989) suggest that, since many of the sentences used in the studies are extremely deviant, there is the possibility that the wrong thing is being tested. Perhaps subjects are not processing such sentences as they would in actual communicative situations, but are settling on a particular problem-solving strategy to get them through the many judgements of this nature they have to make (McLaughlin and Harrington, 1989, cited in Ellis, 1994: 378).

The theory certainly lays itself open to falsification; as MacWhinney himself argues, the basic claims of the Competition Model regarding transfer and cue validity effects in SLA are highly falsifiable. The clearest counter-evidence would be instances of strong cue use in L1 that failed to transfer to L2. “If transfer is possible and does not occur, the model would be strongly falsified” (MacWhinney, 1997: 131). MacWhinney lists more than 30 studies that he, Bates, and others conducted in more than a dozen languages over a fifteen-year period on aspects of cue validity. A large number of other studies (e.g., Gass, 1987; Harrington, 1987; Sasaki, 1991) have examined aspects of the model for SLA. In MacWhinney’s words, “These studies have yielded a remarkably consistent body of results.” Ellis (1994) and Braidi (1999), for example, agree that the Competition Model has survived empirical tests well. Most of the tests seem to confirm that different L1 users consistently use the same weighting of cues: word order is by far the most significant factor for English, for example. The studies on L2 learning give a great deal of support to the hypothesis that transfer of L1 weightings to L2 is an important feature of SLA.


Many in the field of SLA see the Competition Model in particular, and connectionist approaches in general, as among the most promising areas of all for SLA. The model is, of course, associated with connectionism, a movement in cognitive science which attempts to explain human intellectual abilities using artificial neural networks. Neural networks are simplified models of the brain, composed of large numbers of units (the analogs of neurons) together with weights that measure the strength of connections between the units. The central task of connectionist research is to find the correct set of weights to accomplish a given task by “training” the network. An early connectionist model was a network trained by Rumelhart and McClelland (1986) to predict the past tense of English verbs. The network showed the same tendency to overgeneralise as children, but there is still no agreement about the ability of neural networks to learn grammar. The interest in connectionism is that it may provide an alternative to the modular theory of mind. If it can be shown that these artificial networks can “learn”, then successive advances in what is known about the brain – which is seen as a neural network made up of neurons and their connections (synapses) – may be enough to explain cognitive processes and learning without recourse to the “black box” of the mind.
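The weight-adjustment idea can be shown with the smallest possible “network”: a single unit trained by error-driven weight changes (the delta rule) to learn logical OR. This is only a sketch of the training mechanism, nowhere near the scale of Rumelhart and McClelland’s past-tense model.

```python
# A single connectionist unit trained by error-driven weight adjustment
# (the delta rule). The "task" here is logical OR: a toy stand-in for
# "find the set of weights that accomplishes a given task".

def train(patterns, epochs=20, lr=0.5):
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in patterns:
            out = 1.0 if w1 * x1 + w2 * x2 + b > 0 else 0.0
            err = target - out      # how wrong was the unit?
            w1 += lr * err * x1     # strengthen or weaken each connection
            w2 += lr * err * x2
            b += lr * err
    return w1, w2, b

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w1, w2, b = train(data)

def predict(x1, x2):
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

print([predict(x1, x2) for (x1, x2), _ in data])  # [0, 1, 1, 1]
```

Nothing resembling a “rule” is stored anywhere; the behaviour is entirely in the trained weights, which is the point of contrast with the modular, rule-based view of mind.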


Bates, E. and MacWhinney, B. (1987) Second language acquisition from a functionalist perspective: pragmatic, semantic, and perceptual strategies. In Winitz, H. (ed.) Native language and foreign language acquisition. Annals of the New York Academy of Sciences.
Braidi, S. M. (1995) Reconsidering the role of interaction and input in second language acquisition. Language Learning 45, 141-75.
Chomsky, N. (1959) Review of Skinner’s Verbal Behavior. Language, 35, 26-58.
Chomsky, N. (1965) Aspects of the theory of syntax. Cambridge, Mass.: MIT Press.
Ellis, R. (1994) The study of second language acquisition. Oxford: Oxford University Press.
MacWhinney, B. and Bates, E. (eds.) (1989) The crosslinguistic study of sentence processing. Cambridge: Cambridge University Press.
MacWhinney, B. (1997) Second Language Acquisition and the Competition Model. In de Groot, A. M. B. and Kroll, J. F. (eds.) Tutorials in Bilingualism: Psycholinguistic Perspectives. Hillsdale, NJ: Erlbaum.
Rumelhart, D. and McClelland, J. (1986) On learning the past tense of English verbs. In McClelland, J. and Rumelhart, D. (eds.) Parallel Distributed Processing: Explorations in the microstructure of cognition. Cambridge, Mass.: MIT Press.
Skinner, B. F. (1957) Verbal behavior. New York: Appleton-Century-Crofts.

Hugh Dellar and The Lexical Approach Part 2

In the previous post, I looked at Hugh’s attempts to implement Michael Lewis’ “Lexical Approach”, and posed a few questions. Anticipating no response from Hugh, I addressed some of the questions myself in comments on that post, but let me here give a fuller account. In what follows I’ll refer to Hugh by his surname Dellar.

Nattinger and DeCarrico and “the lexical phrase”

Let’s start with an alternative to Lewis’ lexical approach. Nattinger and DeCarrico (1992) argue that the examination of big corpora by computers (which suddenly made it possible to do concordance searches of huge collections of text in minutes) suggests that “the lexical phrase” is at the heart of the English language. They argue that linguistic knowledge cannot be strictly divided into grammatical rules and lexical items; rather, there is an entire range of items from the very specific (a lexical item) to the very general (a grammar rule), and since elements exist at every level of generality, it is impossible to draw a sharp border between them. There is, in other words, a continuum between these different levels of language.

The suggested application of Nattinger and DeCarrico’s (1992) argument to language teaching is that lexis – and in particular the lexical phrase – should be the focus of instruction. This approach rests on two main arguments. First, some cognitive research (particularly in the area of Parallel Distributed Processing and related connectionist models of knowledge) suggests that we store different elements of language many times over in different chunks. This multiple lexical storage accords no privilege to parsimonious, non-redundant systems. “Rather, they assume that redundancy is rampant in a model of language, and that units of description, whether they be specific categories such as “word” or “sentence”, or more general concepts such as “lexicon” or “syntax” are fluid, indistinctly bounded units, separated only as points on a continuum” (Nattinger and DeCarrico, 1992). If this is so, then the role of analysis (grammar) in language learning becomes more limited, and the role of memory (the storage of, among other things, lexical phrases) more important. I’m not saying for a moment that this view of language and language learning is correct, but it’s an interesting alternative to Lewis.

The second argument is that some research suggests that formulaic language is highly significant. Peters (1983) and Atkinson (1989) show that a common pattern in language acquisition is that learners pass through a stage in which they use a large number of unanalyzed chunks of language – prefabricated language. This formulaic speech is seen as being basic to the creative rule-forming processes which follow. Starting with a few basic unvarying phrases, first language speakers subsequently, through analogy with similar phrases, learn to analyze them into smaller patterns, and finally into individual words, thus finding their own way to the regular rules of syntax.

Both these arguments deserve serious attention, and indeed have received attention in many quarters, including Bates and MacWhinney’s development of their Competition Model (see, for example, MacWhinney, 2002) and Skehan’s (1998) very interesting work on unpacking and re-packing formulaic chunks. They indicate that principled arguments can be made, and, in my opinion, they highlight the weaknesses of Michael Lewis’ “Lexical Approach” (1993) which takes a much more strident, less nuanced view. Lewis (1993, 1996) says that “grammar is not the basis of language acquisition, and the balance of linguistic research clearly invalidates any view to the contrary” and flatly proclaims that “Language is not lexicalised grammar, rather it is grammaticalised lexis”. Lewis (borrowing piecemeal from work done by Pawley and Syder, Nattinger and DeCarrico, Sinclair, Biber, Willis and others) asserts that native speakers have a vast stock of lexical prefabricated items or chunks, and that fluency depends on having rapid access to this stock of lexical chunks. Since lexis is central in creating meaning, and grammar plays a secondary role in managing meaning, teachers should devote themselves to teaching lexical chunks.


Hoey’s Lexical Priming Theory

Lewis doesn’t develop his assertions about the nature of language, but Hoey, a serious scholar in the field of corpus-based linguistics, decided to have a go at a theory of language in his 2005 book “Lexical Priming: A New Theory of Words and Language”. The book gives a marvellous account of what it is to know a word, but fewer than 20 pages are given over to explaining how we get this knowledge, and I get the strong impression that he’s not really interested in all this psycholinguistic memory stuff – he’d much rather talk about what all the data tells us about patterns of text and the fascinating links between words. So anyway, Hoey says that we get all this knowledge by “subconsciously noticing” everything that we have ever heard or read, and storing it all in a massively repetitious way. “The process of subconsciously noticing is referred to as lexical priming. … Without realizing what we are doing, we all reproduce in our own speech and writing the language we have heard or read before. We use the words and phrases in the contexts in which we have heard them used, with the meanings we have subconsciously identified as belonging to them and employing the same grammar. The things we say are subconsciously influenced by what everyone has previously said to us.”

The whole theory hinges on the construct “noticing”, a construct which Hoey says almost nothing about. Without more work, Hoey’s theory is circular and thus empty – it explains everything we know by describing everything we know. To the extent that it simply claims that language learning is the result of repeated exposure to patterns of text, and the more the repetition the better the “knowledge”, then it’s a crude version of behaviourism. As Chomsky argued when he demolished Skinner’s behaviourist view of language and learning, language use is “stimulus independent” and “historically unbound”. It’s stimulus independent because, Hoey notwithstanding, virtually any words can be spoken in response to any environmental stimulus. It’s historically unbound because what we say, again Hoey notwithstanding, is not determined by our history of priming, as is made clear by the fact that we can and do say things that we haven’t been trained to say and that we have never heard anybody else say.

I’ve said elsewhere that I think Chomsky’s UG has very little relevance to an explanation of SLA, but I think it’s an excellent theory of language, unlike Hoey’s. Any theory of language has to confront the question posed by Chomsky: “How do children acquire aspects of the language which they have never been exposed to?” Thousands of studies (sic) have shown that despite the fact that certain properties of language are not explicit in the input, children possess knowledge of grammaticality, ungrammaticality, ambiguity, and paraphrase relations, for example. The claim made by Hoey that children learn language by starting from a ‘blank slate’ and then building knowledge from subconsciously noticed connections between lexical items simply cannot survive the counter-evidence provided by studies of children’s knowledge of language (including knowledge of its grammar, among other things) which doesn’t come from exposure to it.

What theory of SLA informs the lexical approach?

Lewis doesn’t even attempt an explanation of SLA; like Hoey, he accepts Krashen’s explanation of SLA, and I’ve dealt with this in the post “Krashen 1: Hoey, The Monitor, Lexis” which is in the menu on the right of the screen. It’s worth pointing out that the Natural Order Hypothesis contradicts Hoey’s new lexical priming theory, since, while the first claims that SLA involves the acquisition of grammatical structures in a predictable sequence, the second claims that grammatical structures are lexical patterns and that there is no order of acquisition.


Interlanguage Grammar versus Lexical Priming

In the last 40 years, great progress has been made in developing a theory of SLA based on a cognitive view of learning. It started in 1972 with the publication of Selinker’s paper, in which he argued that L2 learners have their own autonomous mental grammar (which came to be known as interlanguage (IL) grammar), a grammatical system with its own internal organising principles, which may or may not be related to the L1 and the L2.

One of the first stages of this interlanguage to be identified was that for ESL questions. In a study of six Spanish students over a 10-month period, Cazden, Cancino, Rosansky and Schumann (1975) found that the subjects produced interrogative forms in a predictable sequence:

1. Rising intonation (e.g., He works today?),
2. Uninverted WH (e.g., What he (is) saying?),
3. “Overinversion” (e.g., Do you know where is it?),
4. Differentiation (e.g., Does she like where she lives?).

A later example is in Larsen-Freeman and Long (1991: 94). They pointed to research which suggested that learners from a variety of different L1 backgrounds go through the same four stages in acquiring English negation:

1. External (e.g., No this one./No you playing here),
2. Internal, pre-verbal (e.g., Juana no/don’t have job),
3. Auxiliary + negative (e.g., I can’t play the guitar),
4. Analysed don’t (e.g., She doesn’t drink alcohol.)

In developing a cognitive theory of SLA, the construct of interlanguage became central to the view of L2 learning as a process by which linguistic skills become automatic. Initial learning requires controlled processes, which require attention and time; with practice the linguistic skill requires less attention and becomes routinized, thus freeing up the controlled processes for application to new linguistic skills. SLA is thus seen as a process by which attention-demanding controlled processes become more automatic through practice, a process that results in the restructuring of the existing mental representation, the interlanguage. The adoption of such a framework gives focus and strength to the research: well-defined problems can be articulated, and other, more powerful and daring solutions can be offered in place of the one that has been tentatively established.

Any lexical approach which adopts Hoey’s theory must either state that a cognitive theory of SLA based on the development of an interlanguage is misguided, an extraordinary example of hordes of scholars marching down the wrong road and pursuing a chimera for 40 years and more, or re-work the whole programme so as to replace the misguided grammatical structures with lexical chunks. This, I suggest, highlights the implausibility of a lexical approach which uses Hoey’s theory of language.


As seen above, Hoey’s construct of “noticing” has nothing in common with the usual meaning of the word or with Schmidt’s construct of the same name. It is a sub-conscious process which, Hoey insists, we are “entirely unaware of”: we notice things about words “without realizing what we are doing.” Dellar, nevertheless, has adopted lexical priming without ditching the contradictory sense of noticing used by Lewis. When Dellar says “fossilisation can result from saying things in L2 using L1 primings, communicating meaning but not noticing the gap”, he’s using noticing to mean something like “being consciously aware of” or “giving conscious attention to”. But any “gap” between a student’s current knowledge of lexis and a native speaker’s knowledge can’t be closed by drawing the student’s attention to it, because priming is subconscious. Of course, Hoey’s theory is wrong, but Dellar must sooner or later choose between Hoey’s construct of noticing and Schmidt’s, because he can’t have both.

Schmidt (1990, 2001) attempts to do away with the “terminological vagueness” of “consciousness” by examining three senses of the term: consciousness as awareness, consciousness as intention, and consciousness as knowledge. In regard to consciousness as awareness, the two terms are often equated, but Schmidt distinguishes between three levels: Perception, Noticing and Understanding. The second level, Noticing, is the key to Schmidt’s eventual hypothesis: Noticing is focal awareness. I won’t go through all the steps of Schmidt’s argument here (see the page on “Processing Approaches to SLA” in the list on the right of the screen for an account), but, having given a very careful definition of noticing, Schmidt develops the distinction between input and intake by noting the distinction Slobin and Chaudron make between preliminary intake (the processes used to convert input into stored data that can later be used to construct language), and final intake (the processes used to organise stored data into linguistic systems). Schmidt proposes that intake be defined as “that part of the input which the learner notices … whether the learner notices a form in linguistic input because he or she was deliberately attending to form, or purely inadvertently. If noticed, it becomes intake.” (Note that noticing something inadvertently is still a conscious act.) Those adopting a cognitive approach to SLA have accepted that Schmidt’s theory of noticing makes a major contribution to our understanding, and also has important teaching implications. However, the distinction between input and intake (which Lewis attempted to preserve) is completely at odds with Hoey’s lexical priming theory, a point I’ll return to shortly.

Lewis + Hoey = Dellar’s Lexical Approach??

Thornbury (1998) cites Richards and Rodgers (1986), who say that an approach “refers to theories about the nature of language and language learning that serve as the source of practices and principles in language teaching”. As I said in Part 1, Dellar needs to explain the theories of language and language learning which guide his version of the lexical approach. Dellar relies on Lewis’ theory of language, but, as Thornbury (1998) points out, it’s not clear what implications Lewis’ view of language has for syllabus specifications. Lewis rejects both a “grammatical PPP” syllabus and, “given the holistic nature of language”, any step-by-step linear syllabus. Task-based syllabuses are also ruled out. But no proper alternative syllabus is proposed. To quote Thornbury again: “While he provides examples of the kinds of activities such texts and discourses might be subjected to …, the failure to specify how such texts and discourses would be selected and organised makes it difficult to visualise how the Lexical Approach is operationalised in the long term. Lewis offers us the prospect of a journey, even an exciting one, but it is a journey without maps”.

Thornbury also cites Skehan (1998), who points out that “there is a danger… that an exemplar-based system can only learn by accumulation of wholes, and that it is likely to be excessively context-bound, since such wholes cannot be adapted easily for the expression of more complex meanings” (p. 89). That is to say that phrasebook-type learning without the acquisition of syntax is ultimately impoverished: all chunks but no pineapple. It makes sense, then, for learners to keep their options open and to move between the two systems and not to develop one at the expense of the other. As Skehan puts it, “The need is to create a balance between rule-based performance and memory-based performance, in such a way that the latter does not predominate over the former and cause fossilization” (ibid., p. 288).

As to a theory of language learning, Lewis provides none, and neither does Hoey. Lewis borrows from Krashen in order to stress the importance of “comprehensible input” (whatever that is) and of unconscious acquisition rather than conscious learning, but at the same time, Lewis insists that conscious attention needs to be given to various aspects of the language. As Thornbury says: “Lewis insists that ‘students need to develop awareness of language to which they are exposed’ (LA, p. 195)…., which suggests that he recognises a role for consciousness-raising (a position that Krashen would not accept). ‘Accurate noticing of lexical chunks, grammatical or phonological patterns all help convert input into intake’ (ILA, p. 53). The implication is that these noticed chunks are stored in memory and retrieved ‘undigested’, as it were. That is, they engage the learner’s item-learning capacity rather than the rule-based one. This places formidable demands on the learner’s memory: but, as we have seen, Lewis offers no clear guidelines as to selection and grading… How is one to achieve this enormous task?…. Lewis seems to assume that massive exposure will do the trick; ‘It is exposure to enough suitable input, not formal teaching, which is the key to increasing the learner’s lexicon’ (ILA, p. 197). If this is the case, then this raises the question as to whether many of the ‘teaching’ ideas included in Lewis’s books are redundant, and not only that, a drain on time that could be more usefully spent simply reading. (It also raises the selection-and-grading question yet again: what is this ‘suitable input’ and how is it organised?).”

Note that Thornbury wrote the article before Hoey published his Lexical Priming theory. Thornbury rightly points to Lewis’ inconsistency in adopting Krashen’s acquisition/learning distinction while at the same time claiming that input becomes intake as the result of conscious noticing, but Thornbury wasn’t to know that Hoey would later insist that the noticing which takes place in lexical priming is subconscious. There is, as I said earlier, now a clear contradiction between noticing as Lewis (following Schmidt) used it, and noticing as Hoey uses it. Since Dellar now uses L1 priming and L2 priming to diagnose and prescribe, it follows that he’s bound to take Hoey’s view of noticing. Both good news and bad news follow this forced choice. The good news is that Thornbury’s problem about how learners can possibly be expected to consciously attend to the masses of information contained in lexical chunks is solved, because it’s all done subconsciously. The bad news is that it makes Thornbury’s remark about the redundancy of the teaching ideas in Lewis’s books even more salient. Dellar thus confronts the uncomfortable decision about what to do with all his teaching ideas, including those aimed at helping learners to consciously “notice the gap”. Hoey’s theory implies that most teaching is a waste of time because a sufficient condition for SLA is masses of input – lexical priming will do the rest. The implication of Hoey’s theory is that all language learning is implicit: we acquire communicative competence in an L2 without realizing what we’re doing.


In my opinion, Dellar has been too uncritical of Lewis’ lexical approach and too hasty in his adoption of Hoey’s lexical priming theory. Lewis threw the baby out with the bathwater. If you’re going to be so radical, then you need to put something decent in place of what you throw out, and Lewis failed to do so. To repeat Thornbury’s well-expressed assessment: “Lewis offers us the prospect of a journey,… but it is a journey without maps”. Dellar has made no attempt to deal with the weaknesses, gaps and inconsistencies in Lewis’ lexical approach. He continues to tell teachers what language “really” is, and what they should and shouldn’t do in class (conversations must be given priority; don’t teach single words; what gets you from intermediate to advanced isn’t grammar, it’s layer upon layer of lexis) all based on the dubious assumption that Lewis provides a matchless blueprint for ELT. Dellar has made things worse for himself by talking about lexical primings, L1 primings, L2 primings, and so on, as if Hoey’s theory were now part of some new, enriched lexical approach.

Surely both Lewis’ strident claims and Hoey’s lexical priming theory should be dismissed as wrong and unhelpful. Surely the work of Nattinger and DeCarrico, Pawley and Syder, Peters, Biber, Sinclair and others is more likely to provide the “theories about the nature of language and language learning that serve as the source of practices and principles in language teaching”, as Richards and Rodgers put it. Widdowson (1989), attempting a synthesis of the various strands of work in corpus linguistics, suggests that communicative competence can be seen as “a matter of knowing a stock of partially pre-assembled patterns, formulaic frameworks, and a kit of rules, so to speak, and being able to apply the rules to make whatever adjustments are necessary according to contextual demands. Communicative competence is a matter of adaptation, and rules are not generative but regulative and subservient”. In a different text, Widdowson (1990) says “Competence consists of knowing how the scale of variability in the legitimate application of generative rules is applied – when analysis is called for and when it is not. Ignorance of the variable application of grammatical rules constitutes incompetence”. To quote from Thornbury (1998) again: “In other words, two systems co-exist: a formulaic, exemplar-based one, and a rule-based analytic one”.

If, as Widdowson thinks, we should provide patterns of lexical co-occurrence for rules to operate on so that they are suitably adjusted to the communicative purpose required of the context, then Nattinger and DeCarrico’s work, which identifies lexical phrases and then prescribes exposure to and practice of sequences of such phrases, might be of use. They present a language teaching program based on the lexical phrase which leads students to use prefabricated language but which doesn’t rely too heavily on either theories of linguistic competence on the one hand or theories of communicative competence on the other. “Though the focus is on appropriate language use, the analysis of regular rules of syntax is not neglected” (Nattinger and DeCarrico, 1992). Is this not a more reasonable, and a more attractive approach? I think it’s only the germ of an approach, but it could be the start of an interesting journey, and one that might grab the attention of people in our profession who are capable of making good maps.


Cazden, C., Cancino, E., Rosansky, E. and Schumann, J. (1975) Second language acquisition sequences in children, adolescents and adults. Final report submitted to the National Institute of Education, Washington, D.C.

Hoey, M. (2005) Lexical Priming: A New Theory of Words and Language. Psychology Press.

Krashen, S. (1985) The Input Hypothesis: Issues and Implications. Longman.

Larsen-Freeman, D. and Long, M. H. (1991) An introduction to second language acquisition research. Harlow: Longman.

Lewis, M. (1993) The Lexical Approach. Language Teaching Publications.

Lewis, M. (1996) Implications of a lexical view of language. In Willis, J., & Willis, D. (eds.) Challenge and Change in Language Teaching, pp. 4-9. Heinemann.

Lewis, M. (1997) Implementing the Lexical Approach. Language Teaching Publications.

MacWhinney, B. (2002) The Competition Model: the Input, the Context, and the Brain. Carnegie Mellon University.

Nattinger, J., & DeCarrico, J. (1992) Lexical Phrases and Language Teaching. Oxford University Press.

Pawley, A. & Syder, F. (1983) Two puzzles for linguistic theory: nativelike selection and nativelike fluency. In Richards, J., & Schmidt, R. (eds.) Language and Communication, pp. 191-227. Longman.

Peters, A. (1983) The Units of Language Acquisition. Cambridge University Press.

Richards, J., & Rodgers, T. (1986) Approaches and Methods in Language Teaching. Cambridge University Press.

Schmidt, R. (1990) The role of consciousness in second language learning. Applied Linguistics 11, 129-58

Schmidt, R. (2001) Attention. In Robinson, P. (ed.) Cognition and Second Language Instruction. Cambridge: Cambridge University Press.

Sinclair, J. (ed.) (1987) Looking Up. Harper Collins.

Skehan, P. (1998) A Cognitive Approach to Language Learning. Oxford University Press.

Thornbury, S. (1998) The Lexical Approach: a journey without maps? MET, Vol. 7, No. 4.*

Widdowson, H. (1989) Knowledge of language and ability for use. Applied Linguistics, 10, pp. 128-37.

* Note that Scott has made this article and many others of his available for free download here: http://www.scottthornbury.com/articles.html

Hugh Dellar and The Lexical Approach


Hugh Dellar is a blogger, a coursebook writer, a frequent conference speaker, a teacher trainer, and an EFL teacher. He’s perhaps best known for his promotion of the lexical approach. While I know how strongly he feels about the importance of lexis in ELT, I’ve become increasingly confused about what he thinks the lexical approach is and how he thinks it should be implemented. It all started on Sunday when I read a tweet from Hugh, who was at the IATEFL Poland conference. He wrote:

“intensive & focused pure lexical syllabus can help break down the fossilisation that result from bringing L1 primings into L2”.

What I didn’t immediately realise was that he was in the audience at a presentation, tweeting bits as soon as they came out of the presenters’ mouths. So great is Hugh’s enthusiasm for the lexical approach, he just can’t wait to spread the good news! Anyway, I tweeted

What’s a “pure lexical syllabus”? What fossilisation results from “L1 priming”? Sit down and have a glass of water Hugh.

Hugh: don’t think it’s too controversial to dub a syllabus which features only lexis & no explicit grammar teaching “pure” myself.

Me: You can dub it anything you like. What it is – apart from “only lexis and no explicit grammar”?

Hugh: if you mean what goes in it, that’s obviously open to debate. It is what it is though whatever else you’d rather call it.

Me: Pure nonsense!

Hugh: says you. And not sure they’d the most helpful way of furthering the debate you were after.

That’s as far as I got in my attempt to find out what a pure lexical syllabus might be. As for fossilisation,

Hugh said: fossilisation can result from saying things in L2 using L1 primings, communicating meaning but not noticing the gap.

Me: You’ve used 4 constructs in that answer and all of them carry a lot of theoretical baggage. Result: highly-debateable assertion.

Hugh: most assertions are debatable aren’t they? We work in terrain not blessed with many concrete facts.

Apart from an equally unsuccessful attempt to find out how lexical priming fitted into Hugh’s evolving view of language and ELT, that was that. So I went and had a look at Hugh’s blog. What, I wondered, was a “pure lexical syllabus” and how can it rectify the fossilisation that results from “saying things in L2 using L1 primings, communicating meaning but not noticing the gap”? Eventually, I found a presentation where Hugh had recorded himself talking about “Teaching Grammar Lexically”.


The presentation starts with Hugh telling us how, in his Celta course, he was taught to teach under the tyranny of “PPP – grammar teaching”. Hugh explains: “This was based on Chomsky and the whole idea of structuralist grammar…..I realise now that this is an outmoded and outdated way of thinking about grammar”. During the presentation Hugh makes references to “structuralist grammar”, “structural grammar”, “Chomsky and grammar”, “that sort of grammar”, …..

By “Structuralist grammar” I think Hugh means structuralism, a school of linguistics associated with Saussure and Bloomfield. Structuralists took a descriptive view of their job and limited themselves to the grand task of describing and classifying languages all over the world in terms of well-defined linguistic units (although, while Saussure remained faithful to this mission, Bloomfield allowed behaviourism to pervade American structuralism). Chomsky, of course, had a very different view of linguistics, and a very different view of grammar. Structuralism and Universal Grammar are thus not, pace Hugh, synonymous but rather diametrically opposed. Furthermore, UG represents a theory of language which provides an explanation of how we learn language, suggests that all natural languages share the same underlying properties, and has resulted in extraordinary scientific advances, especially in the area of developing artificial languages. It’s been the dominant paradigm in linguistics for the last 50 years, it has absolutely nothing to do with “the tyranny of PPP” or with any pedagogical grammar, and most linguists working today would beg to differ with the opinion that it’s an outmoded and outdated way of thinking about grammar.

What Hugh demonstrates here is an ignorance of theories of language, which is worrying for someone proselytising one very specific theory of language. I think Hugh means to say that PPP (the presentation and practice of discrete points of grammar) is an outdated way of teaching EFL / ESL. Let’s proceed. Hugh realised that PPP was a tyranny when he did his DELTA course and read Michael Lewis’ book “The Lexical Approach”, a book which changed his life by introducing him to a new way of seeing language and of teaching EFL. The key to the lexical approach is that “Language is not lexicalised grammar, rather it’s grammaticalised lexis. First and foremost it’s lexis that carries more meaning and drives communication”. The “profound shift in perspective” afforded by reading Lewis’s book shaped Hugh’s career; he’s spent 20 years unpacking this “very dense and meaty book”, which “takes time to filter down into teaching practice”. It’s been a struggle because the tyranny of PPP is so deeply entrenched that it’s hard to shake teachers out of it, but still, you’ve got to try, right?

The rest of the presentation consists of assertions about language and language teaching which are as confidently made as they are lacking in either evidence or argument. Here are some of them:

• PPP gives the illusion of progress
• Murphy’s books, the Headway series, English File, they’re all based on a false view of language
• PPP doesn’t work because students learn to talk about English, not in English
• the system creates grammar fear and grammar dependency
• focusing on structures in isolation distorts the reality of usage
• the best way to teach English is to “Keep it real” – teach what people really say in English, stick to typical contexts, focus on institutionalised sentences
• conversations must be given priority
• don’t teach single words

and on and on. Lots of supplementary assertions are also made, including these:

• Despite Chomsky, there are only 10 to 12 verbs you use in the future perfect
• We use will to make promises, to make decisions at the time of speaking, to make threats, OK? You know, predictions. These definitions are useless unless they’re rooted in a store of commonly used sentences that students have acquired and are able to use. From this they start to develop a coherent understanding of the functions and underlying semantics of the grammar.
• What really gets you from intermediate to advanced isn’t grammar. It’s layer upon layer of lexis.



Questions that need answering

I suggest that in order to have credibility as a teacher trainer and presenter of the lexical approach Hugh needs to publicly address these questions:

1. What theory of language informs the lexical approach? If it’s Hoey’s theory of lexical priming, how can it be tested? What studies have been done to test it? What evidence from studies supports it?

2. According to lexical priming theory, how do children acquire the ability to speak their native language? How does Hugh counter the poverty of the stimulus argument?

3. What theory of SLA informs the lexical approach? Does Hugh agree with Hoey that lexical priming theory gives 100% support to Krashen’s Monitor theory? If so, how does Hugh deal with the circularity of Krashen’s constructs, and the fact that Krashen’s theory gives no significant role to explicit learning?

4. How does Hugh respond to the thousands of studies in SLA which support the construct of interlanguage? The evidence from these studies supports the claim that SLA is a cognitive process involving the acquisition of grammatical competence along a relatively fixed route. Does Hugh dismiss this evidence?

5. How does Hugh use the construct of “noticing” in his lexical approach? Does he think that “lexical chunks” can simply be substituted for the areas of language Schmidt discusses? Schmidt, after all, went to a great deal of trouble to explain what his theoretical construct “noticing” is (and isn’t), and it’s important to appreciate that noticing is a construct used to support the argument for the need for explicit learning of aspects of grammar.

6. How does Hugh use the construct of “fossilisation”? Is he aware that many, including Larsen-Freeman recently, challenge the idea of any end state, and that Hoey himself says we never stop learning?

Once Hugh has given some account of what he thinks language is and how he thinks second languages are learned, he needs to then address the question of classroom teaching. I have said elsewhere that in my opinion Lewis’ book The Lexical Approach cobbles together a confused jumble of half-digested ideas; fails to offer any coherent or cohesive ELT methodology; and offers no proper syllabus, or any principled way of teaching the “chunks” which he claims are the secret to the English language. No doubt Hugh disagrees, but he has yet to present his own lexical syllabus. To describe a language in a particular way is one thing; to work out the best way to teach it in a classroom is another. Which is simply to say that you can’t get prescriptions from descriptions, however much Hugh might think you can.

Given Hugh’s conviction that Lewis is right to say that language is not lexicalised grammar but rather, grammaticalised lexis, the question remains: How do you teach it to a class? Apart from saying “give them lots of real language”; “don’t teach single words”; “you MUST use conversation”; etc., and reeling off dozens of authentic utterances like I’ll see you later; I’ll see what I can do; This won’t hurt at all; That’ll do; I’ll be back in a minute; I’ll pay you back tomorrow; ..., how do you organise a 100 hour course based on a lexical approach? What’s needed is a syllabus.

Breen suggests that a syllabus can be organised in response to these questions:

1. What knowledge does it focus on?
2. What capabilities does it focus on?
3. On what basis does it select and subdivide?
4. How does it sequence what is to be learned?
5. What is its rationale?

The first question is important because I, like many, don’t think that knowledge of attested behaviour (which is what we get from looking at corpora of what people say) is the same as our knowledge of language. Hoey goes to great lengths to explain what’s involved in knowing a word (sic), but he restricts himself to what’s performed and ignores the possible. Despite Hoey (and Hugh’s simplistic paraphrase “don’t teach the possible, teach the probable”), most modern linguists find it important to address the questions of “externalised and internalised” language and of valency. Furthermore, most linguists, both pure and applied, agree that language is a cognitive, inventive process, and that when we speak of competence in a language, we refer to something close to Bachman’s model, a cluster of competences not best explained by any theory of lexical priming.

The other questions involve setting out not just the “what” but the “how” of classroom-based teaching. I presume that Hugh doesn’t want to substitute the PPP of discrete points of grammar for the PPP of lexical chunks. So what happens? How are the classes which make up the syllabus conducted? What are the roles of the teachers and learners? As many will know, Breen suggests that syllabuses can be divided into 2 types: product and process syllabuses, and he argues that process syllabuses are better. I look forward to Hugh telling us what his lexical syllabus looks like, and whether he thinks it represents a product or process syllabus, or something else entirely.

I understand that Hugh is soon to launch “LexLab”, which will be a place where all those interested in a lexical approach can share their ideas. I suggest that Hugh can hardly launch such an ambitious project without giving a clear account of the lexical approach which addresses questions about the nature of language; L1 acquisition; SLA; the various competencies involved in communicative language ability; the roles of noticing and fossilisation; and the design of a lexical syllabus.

Four Funerals and a Wedding

The postman’s just delivered the 3rd edition of Richards and Rodgers’ “Approaches and Methods in Language Teaching”, first published in 1986. I’m surprised to see that Multiple Intelligences is in the “Current approaches and methods” chapter, and that the chapter on all those crazy 1970s methods is still there, after all these years. I think it’s time we said goodbye to them all, so allow me a few nostalgic words before we commit them all to the worms or flames.

Funeral 1: The Silent Way

Have a look at this: http://www.youtube.com/watch?v=85P7dmPHtso

Pretty spooky, eh? Caleb Gattegno invented this method back in the 1960s. I heard about it in 1982 when I went to a Silent Way demo in Barcelona. The teacher taught us a bit of Polish, and what I remember most about the session is that we, the students, were utterly exhausted after 20 minutes. It’s an incredibly demanding method! The three vaunted tenets of the approach are: 1. Learners must discover (rather than remember or repeat); 2. Learning is aided by physical objects; 3. Learning is problem-solving. Well, maybe, but the crux is this: the teacher stays as silent as possible throughout the class.

Language is taught by working with sentences which are sequenced grammatically from easy to difficult. Materials consist of special phonetic charts, a pointer used with the charts, and Cuisenaire rods (small coloured blocks of varying sizes). Each new item of the language is introduced by the teacher, who clearly models it once (and once only!), and learners are then guided in using the new item and incorporating it into their existing stock of language. For example, the teacher says “Give me the blue rod”, pointing to each phoneme on the charts as she says the sentence. Then, pointing to the phonemes again, she gets everybody to practise the sentence. Then she indicates to a student to say the sentence to her. The student says the sentence and the teacher gives him the blue rod. Then students can practise among themselves, incorporating other pronouns and other colours, and forming negatives, interrogatives, Wh-questions and so on. After I don’t know how long, you get to practise the present perfect (“I’ve given Jim the blue rod”), as they were doing in the YouTube clip.

The few teachers I met who actually practised the Silent Way were rather like people I’ve met who practise Scientology: weird, hyped-up fanatics. Despite having a few grains of truth mixed up in its mad methodology, the basic flaw in the Silent Way is the silent bit. Any ELT method based on the assumption that a teacher is capable of remaining largely silent when in charge of a class is obviously doomed to failure; it’s as naive as assuming that politicians will remain largely honest when given power. The method also assumes that learners are willing to suffer prolonged mental and emotional stress; that learning a language doesn’t need any real communication to take place in the classroom; and that utterances such as “If I knew it was going to be like this, I wouldn’t have come” can be acquired via an approach which doesn’t seem equipped to go beyond the basics of the language.

Caleb Gattegno’s funeral was in 1988, and I reckon his method passed away at about the same time. RIP.



Funeral 2: Suggestopedia.

Suggestopedia is, without doubt, the weirdest approach ever. The trouble is, very few people have any first-hand knowledge of it; so, like the Ordo Templi Orientis, or Wittgenstein’s book club, we, the profane, have little to go on. In Spain, rumours about Suggestopedia were swirling around at about the same time as the Silent Way zealots were poking learners’ eyes out with their pointers. The version I heard was that a crazy Bulgarian educator called Georgi Lozanov was attracting nine hundred people every Saturday afternoon to Theatre 199 in Sofia, where he hypnotised them and they staggered out onto Rakovski Street 5 hours later speaking perfect English. Slightly more reliable information was available in the mid-eighties, when somebody close to the grand wizard managed to dodge the secret police, the searchlights and the snarling dogs, escape from Bulgaria, and set up a Suggestopedia Center in New York. The claim then was that Suggestopedia made it possible to learn English as a foreign language in 50 hours, compared to the 600 class contact hours the British Council claimed were needed to get to FCE level.

The approach is based on the idea that positive suggestion makes you more receptive and also stimulates learning. In order to achieve the relaxed, focused, optimum state for learning, Lozanov created an environment where the music, the chairs, the lighting, the colour of the wallpaper, everything contributes. So, when everyone is sitting comfortably, the lights dim, and in walks the grand maestro. When he’s centre stage, the opening bars of Tchaikovsky’s piano concerto ring out. Lozanov starts to read a long dialogue, taking both parts himself, allowing the music to be the protagonist, his voice acting as a counterpoint. He reads so that the rhythm and intonation of the text fits in with the rhythm of the music. After this first “concert reading”, a second, less formal reading is done, this time using a piece of Baroque music, Handel’s Water Music, for example. Next, Lozanov uses the text for more “normal” teaching purposes (don’t ask me what) and (don’t ask me how) everybody in the room memorises large chunks of the dialogues and “internalizes” them in such a way that they can use them to communicative ends. Go figure, as they say.

It’s surprising that Richards and Rodgers include Suggestopedia in their historical review, given that nobody anywhere today is doing anything like the sessions Lozanov did, and there’s been no interesting fall-out either. There’s a looney bunch still working in New York (see Pearls World of Learning http://www.pearls-of-learning.com/suggest1_e.htm) who seem like a typical example of the “accelerated learning” programmes on offer, but these snake-oil hustlers owe nothing to Lozanov’s reported sessions, and are certainly not based on his written work, most of which was confiscated by the communist thugs who ruled Bulgaria back then, and never released. Lozanov was, by all accounts, a very singular man. He died in 2012 aged 86. RIP.



Funeral 3: Community Language Learning (CLL)

The guru of this method was Charles Curran, an American Jesuit priest, whose work in Counselling Learning was applied to ELT. Like the Silent Way, CLL had a band of devotees here in Spain who regarded their leader with something approaching religious awe.

Here’s how it works in ELT (adapted from One Stop English’s page on CLL). Students (12 maximum) sit in a circle. There is a tape recorder inside the circle. The teacher (the ‘Knower’) stands outside the circle. When a student has something they want to say in English (e.g. “Well, it’s Friday. What’s everybody doing tonight?”) they call the Knower over and whisper what they want to say, in their mother tongue. The teacher, also in a whisper, then offers the equivalent utterance in English. The student attempts to repeat the utterance, with encouragement and shaping from the Knower, while the rest of the group eavesdrops. When the Knower is satisfied, the utterance is recorded by the student. A student who wants to reply (e.g. “I’m going to the pub” or “Oh God! Do we have to talk about this?”) then calls the Knower over and repeats the process, until a kind of dialogue has been recorded. The Knower then replays the recording and transcribes it on the board. This is followed by analysis, and questions from students. In a subsequent session, the Knower may suggest activities springing from the dialogue. As the account in One Stop English puts it, “Gradually, the students spin a web of language.”

The rationale for CLL is that it’s learner-centred and learner-controlled. Learners move from a stage of total dependence on the Knower to a stage of independent autonomy at the end, passing through 5 developmental stages along the way. The Knower provides a supportive and secure environment for learners, and encourages a whole-person approach to the learning.

The first time I saw a demo of CLL in 1983, I was very pleasantly surprised. A group of 8 adult business people at pre-intermediate level had an interesting, dynamic exchange of views about being a parent for about 45 minutes and the transcription of their conversation, once written up on the whiteboard, was exploited by the students and the teacher very well indeed. What was impressive (30 years ago!) was that there was no attempt to simplify the language and no attempt by the teacher to carry out any lesson plan: the students really were in charge, even when it came to analysing the transcript. I was so impressed that I decided to do a CLL class myself; inevitably, it was a disaster. The students felt silly and didn’t see the point; the tape recorder didn’t work properly; my Spanish wasn’t good enough to give a good translation; no real “topic” emerged; and when one of the students mentioned that she was divorced the whole thing collapsed into short and embarrassing exchanges in Spanish about what pigs men were. In the coffee break I had no defence against the students’ unanimous view that I’d let them down.

My own experience highlights a few of the problems involved in the CLL method. The teacher (Knower) not only has to be proficient in the language of the students, he/she also really needs counselling training. Generally speaking, if you wanted to be a CLL teacher, you had to be highly trained, which unfortunately required a level of commitment (not to say faith) which most teachers, including me, were not prepared to give. Other problems with CLL are that it can only be done with small numbers of students; that the students have to share a single mother tongue; that it’s only suitable for adult learners; and that, like the other methods discussed here, it focuses on the early stages of learning the new language.

Curran himself died in 1978, but from what I remember, CLL hit its high point in the early 90s when it was the buzz word at TESOL and IATEFL conferences; I could be wrong about that. In any case, it’s all over now, and I’m sure you’ll all be pleased to know that I’ve just held a quick funeral service for it in the garden, attended by 4 bemused dogs who happened to be nearby. RIP.



Funeral 4: Multiple Intelligences

Richards and Rodgers discuss Total Physical Response as the fourth looney tune from the seventies, but I prefer to at least try to bury Multiple Intelligences (MI), which (shame on you, lads!) appears as a “current” approach in their book. We have Russ Mayne and his modest but hugely influential talk at this year’s IATEFL conference, A guide to pseudo-science in English language teaching, to thank for highlighting the need to bury this nonsense once and for all. Actually, Russ didn’t have time to say how MI and the related NLP and learner-style theories are used in ELT, but what he did do very well was say what’s wrong with them: there is no evidence to support them, they use poorly-defined pseudo-scientific jargon, and their claims are impervious to empirical tests. He also named and shamed the big shots who promote or at least condone MI and NLP, and suggested that teachers take a more critical view of so-called expert opinion.

Gardner’s MI theory, which has undergone quite a few revisions, says there are various distinct intelligences, including linguistic, musical, logical-mathematical, spatial, bodily-kinesthetic, interpersonal, intrapersonal, and naturalist intelligences, and, most recently, “mental searchlight intelligence” and “laser intelligence”. The point of this is that most teaching, including ELT, is said to concentrate exclusively on linguistic and logical-mathematical intelligences, ignoring the rest; teachers therefore need to expand their repertoire of techniques, tools, and strategies. Gardner defines a skilled teacher as “a person who can open a number of different windows on the same concept”. This has proved influential on its own, and has also played a part in the development of theories of learner styles and NLP. It’s extraordinary to see Richards and Rodgers not saying unequivocally that MI has no support whatsoever from research findings, and maybe not surprising, but still depressing, to see the British Council making no critical comment whatsoever on its website page devoted to NLP, to see Mario Rinvolucri of Pilgrims offering teacher training courses in MI and learner styles, and on and on. Just to be clear: no published studies offer any evidence of the validity of MI theory, and there is no reason whatsoever to believe a word Gardner or anybody else says about the efficacy of aiming one’s teaching at learners’ bodily-kinesthetic intelligence. Lynn Waterhouse gives a good review of MI in her article “Multiple Intelligences, the Mozart Effect, and Emotional Intelligence: A Critical Review” (published online 8 June 2010: http://dx.doi.org/10.1207/s15326985ep4104_1).

Of course, in order to have a funeral service for MI, we need to nail it in its coffin. Russ has whacked a few nails in, and I hope this will help, but I’m not very confident when I say RIP this time.



The Wedding

They haven’t actually named the day yet, but everything indicates that the Process Syllabus (Prosylla) is finally going to hitch up with the Task-Based approach (Tasba). They’ve flirted with each other for over twenty years now, but Prosylla always had a tough time making friends, while Tasba never lacked for partners. But Tasba’s been reading Paulo Freire, Peter McLaren, Shirley Steinberg, Alexander Neill, Rose Bard, Michael Breen, and also, curiously enough, Richards and Rodgers. It comes to him, like an epiphany, that if there’s one thing we can learn from all these 1970s methods which have passed away, it’s that there’s something undeniably right about learner-centred classroom teaching. It’s much more demanding, it’s much easier to make a mess of, but, in the end, it’s the only way to go. Ever since her dad, the great Michael Breen, presented her to the public in 1984, Prosylla’s been insisting that learners should call the tune, while the billion-dollar coursebook industry has made sure that everybody who wants to go to the ELT ball dances to the robot product beat, leaving Prosylla effectively sidelined. She’s been waiting for a really strong partner to take her to the ball, and now that Tasba’s had this flash, he’s fed up with bossy partners leading him through the same old predictable, lock-step shuffles, so he’s tweeted that he’s ready to ditch teachers and tie the knot with Prosylla. The prenuptials make it clear that Tasba will hand over the choice and sequencing of his tasks to Prosylla, ensuring that the learners, not the teachers, will own their classrooms. I feel a song coming on: “There may be trouble ahead, But while there’s moonlight and music, And love and romance, Let’s face the music and dance.”

Sociocultural and Sociocognitive Approaches to Second Language Learning and Teaching: Why Bother?


My last post, “Bridging the Gap”, summarised an article which addressed the question of how to bridge the gap between two different styles of research into second language learning and teaching. On one side are those researching linguistic-cognitive issues, using quantitative research methods and statistical analysis, and on the other side are researchers working on the basis of sociocultural or sociocognitive views, using qualitative research methods including case studies and ethnography. Despite the dramatic claim that the whole community is in danger of disintegrating, no post on this blog has ever aroused as little interest as this one: just about nobody read it, nobody “liked” it, and only Mark left a comment. So rather than comment on the differences between two different styles of research, I’d like to comment on differences between the achievements of the two sides.

First, cognitive research in SLA. Learning English as a second language is usually seen as involving the acquisition of two kinds of knowledge: declarative knowledge, which involves knowing that something is the case – that Paris is the capital of France, for example – and procedural knowledge, which involves knowing how to do something – how to swim, for example. A similar distinction is made between explicit and implicit learning. A “traditional” information-processing model of SLA suggests that first you learn declarative knowledge (the past of “go” is “went”; “see you later” means “goodbye”) through attention-demanding controlled processes, and then, through practice, you transform it into implicit (procedural) knowledge. The teaching implication of this model is presentation and practice. A more recent view of SLA sees it as a process involving the development and restructuring of learners’ mental representation of the target language: their interlanguage. The construct “interlanguage” is used in a theory which sees explicit and implicit learning differently, and answers the question “Why don’t learners learn what teachers teach?” The answer is that interlanguage development involves the acquisition of the L2 in a more or less fixed order, which is impervious to instruction. Many would say that the shift from presentation and practice to a more task-based approach to ELT represents progress, and that it’s the result of a better understanding of SLA.

Regardless of what methodology they adopt, or what epistemological views they hold, how have researchers working on the basis of sociocultural or sociocognitive views contributed to improving our understanding of second language learning and teaching?

The first candidate is “the ethnography of communication”, which studies “the social roles of languages, in structuring the identities of individuals and the culture of entire communities and societies” (Mitchell and Myles, 1998: 164). Examples of studies of speech events are phone conversations, shopping, and job interviews, and themes dealt with include “gatekeeping and power relations in L2 communication” and “speakers’ social identity, face and self-esteem”. I know of nothing in this area which has helped explain SLA, or helped teachers. Nothing.

How about variable competence models? These take a sociolinguistic approach to SLA, abandoning Chomsky’s distinction between competence and performance and viewing competence as variable, not homogeneous. I won’t bother to outline these theories here; it’s enough to note that they both make use of the construct of interlanguage without providing any explanation of the acquisition of linguistic knowledge. By erasing the distinction between competence and performance, “the variabilist is committed to the unprincipled collection of an uncontrolled mass of data” (Gregg, 1990: 378). What’s interesting here (in light of the “Bridging the Gap” article) is that Tarone (1990), in reply to Gregg, labels Gregg’s approach “rationalist”, and complains that “such scholars, perhaps motivated by ‘physics envy’, are trying to turn the study of language into an exact science” (Tarone, 1990: 395).

Just as bad, IMHO, is Schumann’s (1978) Acculturation/Pidginisation approach, which claims that SLA is “just one aspect of acculturation” and that “the more a learner acculturates to the target language group, the better he will acquire the second language”. The two essential problems of this approach are the use of ill-defined key constructs (social and psychological distance), and the unwarranted and unsupported assumption that L2 users make use of a “simplified grammar”. It has led nowhere.

Finally, attempts to explain “incompleteness” in SLA investigate, among other things, aptitude and motivation. Sawyer and Ranta (2001) suggest that “the clearest fact about SLA that we currently have” is that L2 learners “differ dramatically in their rates of acquisition and in their ultimate attainment” (Sawyer and Ranta, 2001: 319). Unfortunately, as Sawyer and Ranta admit, despite its importance, L2 research into the sources of individual differences has lagged far behind research in other areas. The problem is partly due, as Sawyer and Ranta say, to the reliance on correlational research designs, and partly to the inherent difficulty of finding reliable and valid measures of the traits examined. Sawyer and Ranta (2001) have attempted to revitalise Carroll’s work (1974) on aptitude, and Dörnyei and Ushioda (2009) have done something to rectify the problems with Gardner’s work (1985) on motivation, but, yet again, I suggest that progress is very slow.

What some contributors to the Bridging the Gap discussion argued was that the two sides should get together, but it’s a tall order. DeKeyser says that the combination of longitudinal research and mixed-methods research “is largely unheard of”, and most acknowledge a fair degree of incommensurability. So why bother? I’m being deliberately provocative; I can see the value of case studies and of small-scale studies (like those done by MA students) using qualitative methods to help validity through so-called triangulation, but I honestly can’t see the value of most of the work done by those in the sociocultural or sociocognitive camp. I suggest that those researchers working in the area of psycholinguistics who take what Hulstijn and I, following Popper, refer to as a critical rationalist approach really don’t need to work with those trying to articulate how members of “a thought collective” are affected by different “thought styles”.

Just in case I’ve given the impression that those taking a cognitive-linguistic approach are all working nicely together, I should quickly say that there is, of course, an awful lot of disagreement within the camp. While most accept that teaching can affect rate but not route, some say that explicit learning plays a major role while others say it plays an extremely limited role; some say it’s a process where controlled processes become automatic through practice, while others say you just get better at accessing declarative knowledge; some say you just need lots of input, others say you need output too; some say L2 adult learners have access to UG, some say they don’t; some say language is learned in a special way, others say it can be explained by general learning theories. But they at least agree on the best way to do their research, namely by testing hypotheses through the use of empirical data.


Dörnyei, Z. and Ushioda, E. (eds.) (2009) Motivation, Language Identity and the L2 Self. Bristol: Multilingual Matters.
Gregg, K. R. (1990) The variable competence model of second language acquisition, and why it isn’t. Applied Linguistics 11, 4.
Mitchell, R. and Myles, F. (1998) Second Language Learning Theories. London: Arnold.
Sawyer, M. and Ranta, L. (2001) Aptitude, individual differences, and instructional design. In Robinson, P. (ed.) Cognition and Second Language Instruction. Cambridge: CUP.
Schumann, J. (1978) The Pidginization Process: A Model for Second Language Acquisition. Rowley, MA: Newbury House.
Tarone, E. (1983) On the variability of interlanguage systems. Applied Linguistics 4, 2.
Tarone, E. (1990) On variation in interlanguage: a response to Gregg. Applied Linguistics 11, 1, 392–400.