Harold R. Keables

This blog has two aims.

1. To provide those doing a postgraduate course in Applied Linguistics and TESOL with a forum where issues related to their studies are discussed and some extra materials provided. It is completely independent, and has no support from, or connections with, any university. Let me make these preliminary remarks:

Academics teach and do research. Most of them prefer research to teaching and they haven’t been taught how to teach. So in tertiary education, teaching methodology matters little: it’s the student who counts. The students who go to the best universities are carefully selected, and a key criterion in the selection process is the student’s ability to study without spoon-feeding. A good student does her own studying and knows how to draw on the resources offered. When you sign up for a postgraduate course, know that you are in charge and that you, and you alone, will determine the outcome. Your tutor is an expert, not, usually, a teacher. Your job is to use your tutor’s expertise, which means asking the right questions. Don’t ask “What should I do?”, or “Please suggest a topic”. Ask for comments on your own drafts; ask for guidance on reading; ask for clarification. Get into a dialogue with your tutor; shoot the breeze; get familiar; build a relationship, but remember: your tutor is your mentor in the Greek sense of the word, not your teacher.

2. To question the ELT Establishment

The increasing commercialisation of ELT and the corresponding weakening of genuinely educational concerns has resulted in most teachers being forced to teach in a way that shows scant regard for their worth, their training, their opinions, their job satisfaction, or the use of appropriate methods and materials. This is, in my opinion, a disgraceful state of affairs, and one which teachers need to become more aware of.

The biggest single obstacle to good ELT is the coursebook, which forces teachers to work within a framework where students are led through successive units of the book, spending too much time working on isolated linguistic structures and carefully-controlled vocabulary in a sequence which is externally predetermined and imposed on them by the textbook writer. These best-selling, globally-marketed coursebooks (and their attendant teacher books, workbooks, audio, video, multimedia and web-based material) have huge promotional budgets aimed at persuading stakeholders in the ELT business that they represent the best practical way to teach English as a second or foreign language. Part of this budget is spent on sponsoring teaching conferences like TESOL International, IATEFL and all the national conferences, where the stars of the ELT world strut their stuff, and, loath to bite the hand that feeds them, refrain from any serious criticism of the current teaching orthodoxy neatly packaged into shiny coursebooks.

In the last 50 years, studies into SLA have provided supporting evidence for the theory that SLA is a process whereby the learner’s interlanguage (a dynamic, idiosyncratic, evolving linguistic system approximating to the target language) develops as a result of attempts to communicate in the target language. The research suggests that interlanguage development progresses in stages and that it’s impossible to alter stage order or to make learners skip stages. Thus, teachability is constrained by learnability and any coursebook-driven syllabus which attempts to impose an external linguistic syllabus on learners is futile: learning happens in spite of and not because of the course design.

So this blog sets out to question the establishment and the status quo by challenging the role of coursebooks, by being critical of the so-called experts and leaders of the ELT industry – the textbook writers, teacher trainers and examiners; and by promoting the ideas of all those who are trying to buck the trend.

Chomsky’s Critics 2: Elizabeth Bates


Elizabeth Bates (1947 – 2003) was a brilliant scholar perhaps best known for her work with Brian MacWhinney on the Competition Model and Connectionism. In her often outspoken work, Bates challenges the modular theory of mind and, more specifically, criticises the nativists’ use of accounts of “language savants” and of those with cognitive or language impairments to support their theory. In particular, in her review of Smith and Tsimpli’s The mind of a savant, Bates (2000) challenges the authors’ conclusions about Christopher, the savant in question, and, along the way, challenges the two main arguments supporting the UG “ideology”, as she calls it: the existence of universal properties of language, and the poverty of the stimulus.

First, the existence of language universals does not provide compelling evidence for the innateness of language, because such universals could arise for a variety of reasons that are not specific to language itself (e.g., universal properties of cognition, memory, perception, and attention).  (Bates, 2000: 5)

Bates, following Halliday, gives the analogy of eating food with one’s hands (with or without tools like a fork or a chopstick), which can be said to be universal. Rather than posit “an innate hand-feeding module, subserved by a hand-feeding gene”, a simpler explanation is that, given the structure of the human hand, the position of the mouth, and the nature of the food we eat, this is the best solution to the problem.

In the same vein, we may view language as the solution (or class of solutions) to a difficult and idiosyncratic problem: how to map a rich high-dimensional meaning space onto a low-dimensional channel under heavy information-processing constraints, guaranteeing that the sender and the receiver of the message will end up with approximately the same high-dimensional meaning state.  Given the size and complexity of this constraint satisfaction problem, the class of solutions may be very small, and (unlike the hand-feeding example) not at all transparent from an a priori examination of the problem itself  (Bates, 2000: 5).

Bates gives other examples to support her argument that solutions to particular problems of perception and cognition often evolve in an ad hoc way, and that there is no need to jump to the convenient conclusion that the problem was solved by nature. As she says, “That which is inevitable does not have to be innate!” (Bates, 2000: 6)

Bates sees language as consisting of a network, or set of networks, and she was one of the first to begin work on a connectionist model, known now as the Competition Model. She’s refreshingly frank in recognising that neural network simulations of learning are still in their infancy, and that it’s still not clear how much of human language learning such systems will be able to capture. Nevertheless, she says, the neural network systems which have already been constructed are able to generalise beyond the data and recover from error. “The point is, simply,” says Bates, “that the case for the unlearnability of language has not been settled one way or the other” (Bates, 2000: 6).

Bates goes on to say that when the nativists point to the “long list of detailed and idiosyncratic properties” described by UG, and ask how these could possibly have been learned, this begs the question of whether UG is a correct description of the human language faculty.  Bates paraphrases their argument as follows:

  1. English has property P.
  2. UG describes this property of English with Construct P’.
  3. Children who are exposed to English eventually display the ability to comprehend and produce English sentences containing property P.
  4. Therefore English children can be said to know Construct P’.

Bates comments:

There is, of course, another possibility: Children derive Property P from the input, and Construct P’ has nothing to do with it. (Bates, 2000: 6)

An important criticism raised by many, and taken up by Bates, against Chomsky’s theory is that it is difficult to test. In principle, one of the strong points of UG is precisely its empirical testability – find a natural language where the description does not fit, or find a mature language user of a natural language who judges an ill-formed sentence to be grammatical, and you have counter-evidence. However, Bates argues that the introduction of parameters and parameter settings “serve to insulate UG from a rigorous empirical test.” In the case of binary universals (e.g., the Null Subject Parameter), any language either will or will not display them, so they “exhaust the set of logical possibilities and cannot be disproven.” Other universals are allowed to be silent or unexpressed if a language does not offer the features to which these universals apply. For example, universal constraints on inflectional morphology cannot be applied in Chinese, since Chinese has no inflectional morphology. Rather than allow Chinese to serve as a counter-example to the universal, the apparent anomaly is resolved by saying that the universal is present but silent. Bates comments: “It is difficult to disprove a theory that permits invisible entities with no causal consequences.”



1. Poverty of the Stimulus

Many of the criticisms made by Sampson and Bates do not seem to me to be well-founded.  While Bates is obviously correct to say that language universals could arise for a variety of reasons that are not specific to language itself, Bates provides no evidence against Chomsky’s claims. To say that “the case for the unlearnability of language has not been settled” amounts to the admission that no damning evidence has yet been found against the poverty of the stimulus argument, and, of course, such an argument can never be “proved”.

In general, to suggest that learning a language is just one more problem-solving task that the general learning machinery of the brain takes care of ignores all the empirical evidence of those adults who attempt and fail to learn a second language, and the evidence of atypical populations who successfully learn their L1.  Despite Bates’ careful and convincing unpicking of the more strident claims made by nativists in their accounts of atypical populations, it’s hard to explain the cases of those with impaired general intelligence who have exceptional linguistic ability (see Smith, 1999: 24), or the cases of those with normal intelligence who, after a stroke, lose their language ability while retaining other intellectual functions (see Smith 1999: 24-29), if language learning is not in fact localised.

Turning to Sampson, when he challenges Chomsky’s poverty of the stimulus argument by saying that many children have in fact been subjected to input like Blake’s Tyger poem, he ignores the obvious fact that many children have not, and when he says that children need input of yes/no questions in order to learn how to form them, nobody would disagree; the question remains of how the child also learns about aspects of the grammar that are not present in the input. In my recent discussion with Scott about the poverty of the stimulus argument, he claimed, as does Sampson, that “everything the child needs” is, in fact, present in the input, and thus no resort to nativist arguments of modular mind, innate knowledge, the LAD, or any of that, is necessary. While Sampson attempts, bizarrely and without success, to use Popper’s arguments for progress in science through conjectures and refutations as a model for language acquisition, I think Scott was relying more on the kind of emergentist theory of learning that Bates has promoted. But, in my opinion, only Bates shows any appreciation for just how hard it is to do without any appeal to innateness. Let’s take a quick look.


Nativism vs. Emergentism

Gregg (2003) highlights the differences between the two approaches. On the one hand, he says, we have Chomsky’s theory which posits a rich, innate representational system specific to the language faculty, and non-associative mechanisms, as well as associative ones, for bringing that system to bear on input to create a grammar. On the other hand, we have the emergentist position, which denies both the innateness of linguistic representations  and the domain-specificity of language learning mechanisms.

Starting from the premise that items in the mind get there through experience, emergentists adopt a form of associationism and argue that items that go together in experience will go together in thought. If two items are paired with sufficient frequency in the environment, they will go together in the mind. In this way we learn that milk is white, -ed is the past tense marker for English verbs, and so on. Associationism shares the general empiricist view that complex ideas are constructed from simple “ideas”, which in turn are derived from sensations caused by interaction with the outside world. Gregg (2003) acknowledges that these days one certainly can model associative learning processes with connectionist networks, but he highlights the severe limitations of connectionist models by examining the Ellis and Schmidt model (see Gregg, 2003: 58 – 66) in order to emphasise just how little the model has learned and how much is left unexplained. Re-reading the 2003 article makes me wonder if Scott and others who dismiss innateness as an explanation appreciate the sheer implausibility of a project which does without it. How can emergentists seriously propose that the complexity of language emerges from simple cognitive processes being exposed to frequently co-occurring items in the environment?
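To see how little the bare associationist idea commits itself to, here is a deliberately crude sketch in Python. It is my own toy, not the Ellis and Schmidt model or any connectionist network: the function name, the example "experiences" and the threshold of 3 are all invented for illustration.

```python
from collections import Counter
from itertools import combinations

def learn_associations(experiences, threshold):
    """Count how often two items occur together and keep the pairs
    that co-occur at least `threshold` times (a hypothetical cut-off)."""
    counts = Counter()
    for items in experiences:
        # Each unordered pair of distinct items in one experience counts once.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair for pair, n in counts.items() if n >= threshold}

# Invented "experiences": bundles of things that turn up together.
experiences = [
    ["milk", "white"],
    ["milk", "white", "cold"],
    ["milk", "white"],
    ["snow", "white", "cold"],
]

associations = learn_associations(experiences, threshold=3)
print(associations)  # {('milk', 'white')}
```

A pair that never occurs simply has a count of zero, so a learner of this kind has no obvious way of telling "not yet encountered" from "not possible".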


And so we return to the root of the problem of any empiricist account: the poverty of the stimulus argument.  Emergentists, by adopting an associative learning model and an empiricist epistemology, where some kind of innate architecture is allowed, but not innate knowledge, and certainly not innate linguistic representations, have a very difficult job explaining how children come to have the linguistic knowledge they do. They haven’t managed to explain how general conceptual representations acting on stimuli from the environment produce the representational system of language that children demonstrate, or to explain how, as Eubank and Gregg put it “children know which form-function pairings are possible in human-language grammars and which are not, regardless of exposure” (Eubank and Gregg, 2002: 238). Neither have emergentists so far dealt with “knowledge that comes about in the absence of exposure (i.e., a frequency of zero) including knowledge of what is not possible” (Eubank and Gregg, 2002: 238).

I gave Vivian Cook’s version of the PoS argument in Part 1, but let me here give  Gregg’s  summary of Laurence and Margolis’ (2001: 221) “lucid formulation”:

  1. An indefinite number of alternative sets of principles are consistent with the regularities found in the primary linguistic data.
  2. The correct set of principles need not be (and typically is not) in any pretheoretic sense simpler or more natural than the alternatives.
  3. The data that would be needed for choosing among those sets of principles are in many cases not the sort of data that are available to an empiricist learner.
  4. So if children were empiricist learners they could not reliably arrive at the correct grammar for their language.
  5. Children do reliably arrive at the correct grammar for their language.
  6. Therefore children are not empiricist learners (Gregg, 2003: 48).
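The last three steps have the shape of a modus tollens. As a rough sketch, writing E for “children are empiricist learners” and R for “children reliably arrive at the correct grammar” (the letters are my own shorthand, not Gregg’s or Laurence and Margolis’):

```latex
\[
  E \rightarrow \neg R, \quad R \;\vdash\; \neg E
\]
```

Steps 1 to 3 are there to support the conditional in step 4, so anyone who accepts step 5 and still rejects the conclusion has to find fault with one of those three.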

To the extent that the emergentists insist on a strict empiricist epistemology, they’ll find it extremely difficult to provide any causal explanation of language acquisition, or, more relevant to us, of SLA. Combining observed frequency effects with the power law of practice, for example, and thus explaining acquisition order by appealing to frequency in the input doesn’t go far in explaining the acquisition process itself. What role do frequency effects have, and how do they interact with other aspects of the SLA process? In other words, we need to know how frequency effects fit into a theory of SLA, because frequency and the power law of practice don’t provide a sufficient theoretical framework in themselves. Neither does connectionism; as Gregg points out, “connectionism itself is not a theory … It is a method, and one that in principle is neutral as to the kind of theory to which it is applied” (Gregg, 2003: 55).


2. Idealisation

There is also the question of idealisation, stressed by Sampson in his criticisms, and probably the most frequently-expressed objection made to UG. The assumption Chomsky makes of instantaneous acquisition, like the idealisation of the “ideal speaker-listener in a completely homogeneous speech-community”, is a perfectly respectable tool used in theory construction: it amounts to no more than the “ceteris paribus” argument that allows “all other things to be equal” so that we can isolate and thus better examine the phenomenon in question. Idealisations are warranted because they help to focus on the important issues and to get rid of distractions, which does not mean that this step is immune to criticism, of course. It’s up to Chomsky to make sure that any theories based on idealisations are open to empirical tests, and it is then up to those who disagree with Chomsky to come up with some counter-evidence and/or to show that the idealisation in question has protected the theory from the influence of an important factor. Thus, if Sampson wants to challenge Chomsky’s instantaneous acquisition assumption, he will have to show that there are differences in the stages of people’s language acquisition which result in significant differences in the end state of their linguistic knowledge.

While on the subject of idealisations, we may deal with the criticism of sociolinguists who challenge Chomsky’s idealisation to a homogeneous speech community by saying that Chomsky is ruling out of court any discussion of variations within a community. Chomsky would reply that he’s doing no such thing, and that if anybody is interested in studying such variations they are welcome to do so. Chomsky’s opinion of the scant possibility of progress in such an investigation is well-known, but he of course admits that it’s only an opinion. What Chomsky is interested in, however, is the language faculty, and the acquisition of a certain type of well-defined knowledge. In order to better investigate this domain, Chomsky idealises the speech community. Sociolinguists can either produce arguments and data which show that such an idealisation is illegitimate (i.e. that it isolates part of the theory from the influence of a significant factor), or say that they are interested in a completely different domain. It seems to be often the case that criticisms of Chomsky arise from misunderstandings about the role of idealisations in theory construction, or about the domain of a theory.


Weaknesses of UG theory

Chomsky’s theory runs into difficulties in confronting the question of how UG evolves, and how the principles and parameters arrive at a stable state in a normal child’s development. Furthermore, there’s no doubt that the constant re-formulation of UG results in “moving the goalposts” and protecting the theory from bad empirical evidence by the use of ad hoc hypotheses.

And we shouldn’t forget that when we discuss UG we have the “principles and parameters” theory in mind, and not the “Minimalist” programme, let alone Internalism. Internalism sees Chomsky insisting that the domain of his theory is not grammar but “I-language”, where “I” is “Internal” and where “Internal” means in the mind. While exposure to external stimuli is necessary for language acquisition, Chomsky maintains that, as Smith puts it “the resulting system is one which has no direct connection with the external world” (Smith, 1999: 138). This highly counter-intuitive claim takes us into the technicalities of a philosophical debate about semantics in general and “reference” in particular, where Chomsky holds the controversial view that semantic relations “are nothing to do with things in the world, but are relations between mental representations: they are entirely inside the head”  (Smith, 1999: 167).  Perhaps the most well-known example of this view is Chomsky’s assertion that while we may use the word “London” to refer to the capital city of the UK, it’s unjustified to claim that the word itself refers to some real entity in the world.  Go figure, as they say.

But the most important criticism I personally have of UG is that it is too strict and too narrow to be of much use to those trying to build a theory of SLA. I think it’s important to challenge Chomsky’s claim that questions about language use “lie beyond the reach of our minds”, and that they “will never be incorporated within explanatory theories intelligible to humans” (Chomsky, 1978).  Despite Chomsky’s assertion, I think we may assume that the L2 acquisition process is capable of being rationally and thoroughly examined.  Further, I suggest that it need not be, indeed should not be, idealised as an instantaneous event, which is to say, I assume that we can ask rational questions about the stages of development of interlanguages, that we can study the real-time processing required to understand and produce utterances in the L2, that we can talk about not just the acquisition of abstract principles but of skills, and even that we can study how different social environments affect SLA.

By insisting on a “scientific” status for his theory, Chomsky severely limits its domain, and to appreciate just how limited the domain of UG is, let us remind ourselves of Chomsky’s position on modularity.  Chomsky argues that in the human mind there is a language faculty, or grammar module, which is responsible for grammatical knowledge, and that other modules handle other kinds of knowledge. Not all of what is commonly referred to as “language” is the domain of the language module; certain parts of peripheral grammatical knowledge, and all pragmatic knowledge, are excluded. To put it another way, the domain of Chomsky’s theory is restricted by his distinction between I-language and E-language; Chomsky is concerned with the individual human capacity for language, and with the universal similarities between languages – his domain deliberately excludes the community. No justification needs to be offered for deciding to focus on a particular phenomenon or a particular hypothesis, but it is essential to grasp the domain of Chomsky’s theory.  Cook (1994) puts it this way:

Chomskian theory claims that, strictly speaking, the mind does not know languages but grammars; ‘the notion “language” itself is derivative and relatively unimportant’ (Chomsky, 1980, p. 126).  “The English Language” or “the French language” means language as a social phenomenon – a collection of utterances.  What the individual mind knows is not a language in this sense, but a grammar with the parameters set to particular values.  Language is another epiphenomenon: the psychological reality is the grammar that a speaker knows, not a language (Cook, 1994: 480).

Gregg (1996) has this to say:

… “language” does not refer to a natural kind, and hence does not constitute an object for scientific investigation.  The scientific study of language or language acquisition requires the narrowing down of the domain of investigation, a carving of nature at its joints, as Plato put it. From such a perspective, modularity makes eminent sense (Gregg, 1996: 1).

Chomsky himself says that what he seeks to describe and explain is

The cognitive state that encompasses all those aspects of form and meaning and their relation, including underlying structures that enter into that relation, which are properly assigned to the specific subsystem of the human mind that relates representations of form and meaning. A bit misleadingly perhaps, I will continue to call this subsystem ‘the language faculty’ (Chomsky 1980).

Pragmatic competence, on the other hand, is left out because

there is no promising approach to the normal creative use of language, or to other rule-governed acts that are freely undertaken … the creative use of language is a mystery that eludes our intellectual grasp (Chomsky, 1980).

Chomsky would obviously agree that syntax provides no more than clues about the content of any particular message that someone might try to communicate, and that pragmatics takes these clues and interprets them according to their context.  If one is interested in communication, then pragmatics is vital, but if one is interested in language as a code linking representations of sound and meaning, then it is not.  Chomsky’s strict demarcation between science and non-science effectively rules out the study of E-Language, and consequently his theory neither describes nor explains many of the phenomena that interest linguists. Far less does UG describe or explain the phenomena of SLA. By denying the usefulness of attempts to explain aspects of language use and usage that fall outside the domain of I-Language, UG  can’t be taken as the only valid frame of reference for SLA research and theory construction, or even as a good model.



Bates, E. (2000) Language Savants and The Structure of The Mind.  International Journal of Bilingualism. 

Bates, E.; Elman, J.; Johnson, M.; Karmiloff-Smith, A.; Parisi, D.; and Plunkett, K. (1998) Innateness and Emergentism.  In Bechtel, W., and Graham, G., (eds) A Companion to Cognitive Science. 590-601. Oxford: Basil Blackwell.

Bates, E. and Goodman, J. (1997) On the inseparability of grammar and the lexicon: evidence from aphasia, acquisition and real-time processing.  Language and Cognitive Processes, 12, 507-584.

Chomsky, N. (1980) Rules and representations. Oxford: Basil Blackwell.

Cook, V. J. (1994) The Metaphor of Access to Universal Grammar in L2 Learning.  In Ellis, N. (ed.)  Implicit and Explicit Learning of Languages.  London: Academic Press.

Gregg, K. R. (1996) The logical and developmental problems of second language acquisition.  In Ritchie, W.C. and Bhatia, T.K. (eds.) Handbook of second language acquisition.  San Diego: Academic Press.

Gregg, K. R. (2000) A theory for every occasion: postmodernism and SLA.  Second Language Research 16, 4, 34-59.

Gregg, K. R. (2003) The state of emergentism in second language acquisition.  Second Language Research, 19, 2, 42-75.

Laurence, S. and Margolis, E. (2001) The Poverty of the Stimulus Argument. British Journal for the Philosophy of Science, 52, 2, 217-276.

Smith, N. (1999) Chomsky: Ideas and Ideals.  Cambridge: Cambridge University Press.

Smith, N. and Tsimpli, I-M. (1995) The mind of a savant: Language learning and modularity. Oxford: Basil Blackwell.

Chomsky’s Critics 1: Geoffrey Sampson


Scott Thornbury’s latest Sunday post gave what I thought was a very poor account of the poverty of the stimulus argument and of objections to it.  While Scott was quite measured in his original remarks, his post showed a spectacular disregard for logic, and the wave of enthusiastic messages of support which flooded in from a frightening array of dimwits and cranks seemed to unhinge our normally restrained hero, provoking him to ever more outrageous and fanciful claims. I and a couple of other sensitive souls did our modest best to keep him on the rails, but we failed, the wheels came off, and last time I looked, the whole crazy bunch of them were swapping quotes from Derrida, counting backwards from 666, trying to communicate with each other without switching their brains on, and using impoverished input devices like the Microsoft keyboard. Since they’ve all shown themselves to be useless at marshalling a case against Chomsky for themselves, I thought I’d offer a helping hand. I’m all heart, really.  So here’s the case against Chomsky as argued by two of his leading critics: Geoffrey Sampson and Elizabeth Bates.

Before we start on Sampson, let’s quickly state the poverty of the stimulus argument. It says: since children know things about language that they’ve never been exposed to, that knowledge must be innate. Vivian Cook puts it like this:

Step A. A native speaker of a particular language knows a particular aspect of syntax.

Step B. This aspect of syntax could not have been acquired from language input. This involves considering all possible sources of evidence in the language the child hears and in the processes of interaction with parents.

Step C. This aspect of syntax is not learnt from outside. If all the types of evidence considered in Step B can be eliminated, the logical inference is that the source of this knowledge is not outside the child’s mind.

Step D. This aspect of syntax is built-in to the mind (Cook, 1991).

The UG argument is that all natural languages share the same underlying structure, and the knowledge of this structure is innate.

Sampson says that Chomsky’s claims about the linguistic data available to the child  are “untrue”, and he takes Chomsky’s example (used at the famous 1975 conference at Royaumont, where Piaget, Chomsky, Fodor, and others gathered to discuss the limitations of the genetic contribution to culture) of two different hypotheses about the grammar of yes/no questions in English. Turning an English statement into the corresponding yes/no question involves operating on a finite verb in the statement. Either the verb itself is moved to the left (if the verb is a form of be, do, have, or a modal verb such as will) – thus ‘The man is tall’ becomes ‘Is the man tall?’; or, in all other cases the verb is put into the infinitive and an inflected form of do is placed to the left – thus ‘The man swims well’ becomes ‘Does the man swim well?’  (Sampson, 1997: 40).

Chomsky says there are two hypotheses that the child learning English might try:  1. operate on the first finite verb;  2. operate on the finite verb of the main clause.  Hypothesis 1 violates the structure dependence universal and is false (applied to the sentence “The man who is tall is sad.”, it would give: “Is the man who tall is sad?”).  Hypothesis 2 is correct. Yet both hypotheses work in all questions except those formed from statements containing a subordinate clause which precedes the main verb.  The child cannot decide by observation whether one or the other hypothesis is true, because cases of statements containing a subordinate clause which precedes the main verb are extremely rare. Therefore, the child decides on the basis of innate knowledge. In reply to this Sampson says that many examples actually exist, including the well-known line from Blake’s The Tyger “Did he who made the Lamb make thee?”  Sampson goes on to give a number of other examples from a children’s corpus, and concludes:

Since Chomsky has never backed up his arguments from poverty of the child’s data with detailed empirical studies, we are entitled to reject them on the ground that the data available to the child are far richer than Chomsky supposes.  (Sampson, 1997: 42)
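To make the two hypotheses concrete, here is a toy sketch in Python. Everything in it is my own illustration rather than anything from Chomsky or Sampson: the function names, the short auxiliary list, and above all the hand-made split of each sentence into subject, main-clause verb and predicate, which simply assumes the structural knowledge whose origin is in dispute.

```python
# A deliberately tiny list of verbs that can be fronted (an assumption).
AUXILIARIES = {"is", "are", "was", "were", "will", "can"}

def hypothesis_1(words):
    """Structure-independent rule: front the first finite verb in the string."""
    for i, word in enumerate(words):
        if word in AUXILIARIES:
            return [word] + words[:i] + words[i + 1:]
    raise ValueError("no auxiliary found")

def hypothesis_2(subject, main_verb, predicate):
    """Structure-dependent rule: front the finite verb of the main clause.
    The split into subject / main_verb / predicate is supplied by hand."""
    return [main_verb] + subject + predicate

def show(words):
    return " ".join(words).capitalize() + "?"

# Simple statement: both rules give the same question.
print(show(hypothesis_1("the man is tall".split())))
# Is the man tall?
print(show(hypothesis_2(["the", "man"], "is", ["tall"])))
# Is the man tall?

# Statement with a subordinate clause before the main verb: the rules diverge.
print(show(hypothesis_1("the man who is tall is sad".split())))
# Is the man who tall is sad?
print(show(hypothesis_2("the man who is tall".split(), "is", ["sad"])))
# Is the man who is tall sad?
```

The two rules agree on the simple sentence and come apart only on the one with a relative clause inside the subject, which is why the frequency of such sentences in the child’s input matters so much to both sides.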


Sampson then attacks Chomsky’s “question-begging idealizations”.  Chomsky distinguishes between competence (a certain type of knowledge which is the phenomenon that he wants to explain), and performance (data, much of which he judges to be irrelevant). To examine competence, Chomsky argues that it’s necessary to make various simplifying assumptions, but Sampson claims that Chomsky’s use of simplifications distorts the substantial point at issue.  Each of the counterfactual simplifying assumptions about human language which Chomsky makes “eliminates a plausible alternative from consideration through what is presented as a harmless, uncontroversial assumption” (Sampson, 1997: 51).  Sampson gives the example of the assumption that language acquisition is an instantaneous process. This, says Chomsky, is “a harmless assumption, for if it mattered then we would expect to find substantial differences in the result of language learning depending on such factors as order of presentation of data, time of presentation, and so on.  But we do not find this” (Chomsky, cited in Sampson, 1997: 51-52). Sampson replies that language acquisition is not an instantaneous process (as Chomsky elsewhere admits), and it is not a harmless simplification to say that it is. As Sampson says:

To claim that it is harmless to pretend that language acquisition is instantaneous is, in effect, to assume that language acquisition does not work in a Popperian fashion, without going to the trouble of arguing the point.  (Sampson, 1997: 52)

Chomsky acknowledges that children do not move from ignorance to mastery of language instantaneously, but he insists that “fairly early in life” a child’s linguistic competence reaches a “steady state”, after which there are no significant changes.  Sampson points out, however, that this “steady state” idea is contested by Bloomfield and Whitney (both of whom see language learning as a lifelong process), and is also completely at odds with the Popperian approach to learning, which brings us to Sampson’s alternative explanation of language acquisition.


Sampson argues that the essential feature of languages is their hierarchical structure.  Children start with relatively crude systems of verbal communication, and gradually extend syntactic structures in a pragmatic way so as to allow them to express more ideas in a more sophisticated way.  The way they build up the syntax is piecemeal; they concentrate on assembling a particular part of the system from individual components, and then put together the subassemblies. This gives them low level structures which are then combined, with modifications on the basis of input, into higher level structures, and so on.

Sampson uses the Watchmaker parable, first made by Herbert Simon (see Sampson, 1997:111-113), to explain linguistic development.  I won’t go into it here, but Sampson says that Simon’s parable shows that “complex entities produced by any process of unplanned evolution, such as the Darwinian process of biological evolution, will have tree-structuring as a matter of statistical necessity” (Sampson, 1997: 113). Furthermore, in Sampson’s view, “the development of knowledge, as Popper describes it, is a clear case of the type of evolutionary process to which Simon’s argument applies, and can be applied to syntactic structures”.  Sampson describes how the communication system of our ancestors gradually became more complex as language learners made longer sentences, which would enter the language if they made a significant enough contribution to transmitting information more economically, or if they were semantically innovative.  Similarly, a child acquires language by composing sub-assemblies from individual components, and then putting together the sub-assemblies.



Only a general learning theory is involved in Sampson’s explanation, which adopts a decidedly Popperian approach. The child tests various hypotheses about grammaticality against input, and slowly builds up the right hierarchically structured language by following a Popperian programme of conjectures and refutations. This supposes, of course, that the child is exposed to adequate input.  Sampson’s argument has two main strands: first, following Simon, gradual evolutionary processes have a strong tendency to produce tree structures; and second, following Popper, knowledge develops in a conjectures-and-refutations evolutionary way.  Sampson claims that these two strands are enough to explain language acquisition.

Perhaps Sampson’s criticism of one of Chomsky’s most central assumptions can serve to highlight the differences between them.  Chomsky says that

Linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogenous speech community, who knows its language perfectly (Chomsky, cited in Sampson, 1997: 53).

This assumption, which Chomsky describes as being of “critical importance” for his theory, excludes Sampson’s Popperian approach without even considering it.  For Sampson, learning is a “non-terminating process”, and language has no independent existence over and above the representations of the language in the minds of the various individuals belonging to the speech community that uses it.

What the language learner is trying to bring his tacit linguistic theory into correspondence with is not some simple, consistent grammar inhering in a collective national psyche…. Rather, he is trying to reconstruct a system underlying the usage of the various speakers to whom he is exposed; and these speakers will almost certainly be working at any given time with non-identical tacit theories of their own – so that there will not be any wholly coherent and unrefutable grammar available to be formulated.  The notion of a speaker-listener knowing the language of his community “perfectly” is doubly inapplicable – both because there is no particular grammar, achievement of which would count as “perfect” mastery of the language, and because even if there were such a grammar, there is no procedure by which a learner could discover it.  (Sampson, 1997: 53-54)

From Sampson’s Popperian perspective, even if language learners were “ideal” they would not attain “perfect” mastery of the language of the community.  As Sampson says:

Popperian learning is not an algorithm which, if followed without deviation, leads to a successful conclusion.  Therefore, to assume that it makes sense to describe an “ideal” speaker-listener as inhabiting a perfectly homogenous speech community and as knowing its language perfectly amounts, once again, to surreptitiously ruling the Popperian view of acquisition out of consideration. (Sampson, 1997: 55)

I personally don’t find Sampson’s arguments persuasive, and I’ll explain why after I’ve presented Bates’  case against Chomsky in the next post.


Cook, V. J. (1991) The poverty-of-the-stimulus argument and multi-competence.  Second Language Research, 7,2, 103-117

Sampson, G. (1997)  Educating Eve: the ‘language instinct’ debate. London: Cassell.

Interlanguages 2


Understanding interlanguage development helps in evaluating different approaches to ELT.  I’ve already touched on this issue in a post on TBLT,  and in Challenging Coursebooks 2, and here’s a bit more, intended as further support for my criticisms of coursebooks, and as preparation for a syllabus proposal. This is mostly a cut-and-paste paraphrasing of Long, 2011.

We must start by recognizing that learners, not teachers, have most control over their language development.  As Long (2011) says:

Students do not – in fact, cannot – learn (as opposed to learn about) target forms and structures on demand, when and how a teacher or a coursebook decree that they should, but only when they are developmentally ready to do so. Instruction can facilitate development, but needs to be provided with respect for, and in harmony with, the learner’s powerful cognitive contribution to the acquisition process.

A major source of evidence for the strength of the learner’s role in SLA, and simultaneously, about the limits of instruction, is the work that’s been done on processes in interlanguage development. Interlanguages (the construct was introduced by Selinker in 1972) are individual learners’ transitional versions of the L2, and studies show that they exhibit common patterns and features across differences in learners’ age and L1, acquisition context, and instructional approach. Independent of those and other factors, learners pass through well-attested developmental sequences on their way to mastery of target-language structures, or, as is often the case, to an end-state short of mastery. Examples of such sequences are found in the well known morpheme studies; the four-stage sequence for ESL negation; the six-stage sequence for English relative clauses; and the sequence of question formation in German (see Long, 2015 for a full discussion).

Long (2011) insists that SLA is not a process of forming new habits to override the effects of L1 transfer. Even when presented with, and drilled in, target-language forms and structures, and even when errors are routinely corrected, learners’ acquisition of newly-presented forms and structures is very rarely either categorical or complete, as is assumed by most coursebooks. On the contrary, acquisition of grammatical structures and sub-systems like negation or relative clause formation is typically gradual, incremental and slow, sometimes taking years to accomplish. Development of the L2 exhibits plateaus, occasional movement away from, not toward, the L2, and  U-shaped or zigzag trajectories rather than smooth, linear contours. No matter what the learners’ L1 might be, and no matter what the order or manner in which target-language structures are presented to them by teachers or by coursebook  writers, learners analyze the input and come up with their own interim grammars, the product broadly conforming to developmental sequences observed in naturalistic settings. They master the structures in roughly the same manner and order whether learning in classrooms, on the street, or both. This led Pienemann to formulate his learnability hypothesis and teachability hypothesis: what is processable by students at any time determines what is learnable, and, thereby, what is teachable (Pienemann, 1984, 1989). The effectiveness of negative feedback on error has been shown to be constrained in the same way (see, e.g., Mackey, 1999).

The 5 most studied processes of interlanguage development are

  • simplification (using “la” for “the”, and “un” for “a” in Spanish, regardless of gender, etc.);
  • overgeneralization (using “ed” for irregular verbs);
  • restructuring (often involving back-sliding: going from “went” to “goed”, but often making adjustments which “improve” the IL);
  • U-shaped behaviour (went –> goed –> went);
  • and fossilization (“premature cessation of development in defiance of optimal learning conditions” Selinker, 1972).

While knowledge about the sequences and processes of interlanguage development should act mostly to warn us against any simple view of teaching and learning an L2, it can, Long says, inform good teaching by helping teachers (and their students) cultivate a different attitude towards errors, and more enlightened expectations for progress. “It can help them recognize that many so-called errors are a healthy sign of learning, that timing is hugely important in language teaching, and that not all that can be logically taught can be learned if learners are not developmentally ready. Knowledge about sequences and processes can also help counter the deficit view that interlanguages are defective surrogates of the target language by making it clear that interlanguages are shaped by the same systematicity and variability that shape all other forms of human language” (Long, 2011). It should also be remembered that if teachers respect the constraints of their learners’ trajectories, and especially if they teach according to the principles referred to below, they can have a dramatic positive effect on their learners’ rate of learning.

The question remains: Why don’t language teachers teach to the sequences and processes which have been identified in interlanguage studies?  First, because we don’t know how different sequences relate to each other in the grammar of individual learners, so we don’t know how to sequence grammatical targets according to developmental learner readiness principles. More importantly, language learning isn’t just learning  grammar: vocabulary, pragmatics, phonology, and so on are also involved. But the most fundamental objection is that learning an L2 isn’t about focusing on bits and pieces of language.  Rather than trying to organize instruction around grammar (or lexical chunks, for that matter) in a product syllabus, implemented by using a General English coursebook, we have a wide range of options which are more attuned to what we know about psycholinguistic, cognitive, and socioeducational principles for good language teaching.  These include Dogme, Task-Based-Language-Teaching, various forms of ESP, and various process syllabuses. All of them share the principles that I’ve outlined in previous posts on TBLT and Principles and Practice and I’ll propose one such syllabus shortly.

Long, M. (2011) “Language Teaching”. In Doughty, C. and Long, M.  Handbook of Language Teaching. NY Routledge.

Long, M. (2015) SLA and TBLT. N.Y., Routledge.

All other references can be found at the end of Long’s 2011 Chapter.



Since my presentation Challenging the Coursebook, there have been various responses.  With the one exception of Andrew Schmidt’s comments, none has dealt with the points I raised.

My argument against the coursebook is in two parts.  First, most coursebooks assume that presenting and practicing discrete formal aspects of the language in a pre-determined sequence will lead to declarative knowledge becoming procedural, and that the synthetic bits of language presented and practiced in the coursebook will be accumulated by learners in such a way as to result in the progressive re-structuring of their interlanguages.  Both these assumptions are false. The assumption that learners will learn what they’re taught when they’re taught it is also false.

Second, coursebooks impose a product (synthetic) syllabus on users,  but a process (analytic) syllabus  caters better to learners’ needs and is likely to lead to faster learning and higher levels of attainment.

In reply, these comments have been made:

Not all coursebooks are the same: they differ in content and design.  Of course: and there are bound to be exceptions to my generalised assertion.  But apart from the coursebooks Anthony named, nobody else (and in particular, not Dellar) has given any coherent argument against the claim that most coursebooks are based on the false assumptions I attribute to them.

Teachers use coursebooks in very different ways.  Again: of course. But unless teachers use coursebooks very sparingly, or in ways entirely different from the way the authors intend them to be used, the coursebook is the most important factor in determining what happens in the lessons comprising the course.

Coursebooks help busy, overworked teachers who don’t have time to prepare their own lesson plans and materials. Quite so. But if that’s the only reason to explain why teachers use them, then it follows that ELT would be better if we organised things in such a way that we didn’t rely on coursebooks.

Coursebooks help new teachers who need obvious structure and guidance. Ditto.

Expecting teachers to make their own materials without paying them is worse than asking them to use a coursebook.  Ditto.

Despite all their flaws, I use coursebooks, so there.  I know this is supposed to be funny, or witty, or something, but it’s a bit too near the truth to make me laugh.

I find it depressing that so little importance seems to be given to the underlying principles which inform our teaching practice. Why are most teachers not more concerned about these principles?  Why is there so little attempt made to seriously confront the argument that SLA is a predominantly implicit process where declarative knowledge and explicit instruction are known to play a minor role in facilitating language learning?  Likewise, why are so few people in ELT ready to take seriously the various proposals that have been made for a process syllabus?  Rather than make an attempt to critically appraise the arguments against coursebooks, or to put forward a coherent, principled counter-argument, all we get are excuses. And very lame excuses at that.

Logic Quiz 2


Here are the answers: 

1. Socrates is a philosopher.
All philosophers are poor.
So Socrates is poor.

Valid   The conclusion follows necessarily from the two premises.


2. Whenever Anil is here, Kumar is also here.
Anil is not here.
So Kumar is not here.

Invalid  Kumar may be here even when Anil is not.

3. Most drug addicts are depressed people.
Most depressed people are lonely.
So most drug addicts are lonely.

Invalid  Compare: Some men are doctors; Some doctors are women; Therefore, some men are women.

4. Nothing that is cheap is good.
So nothing that is good is cheap.

Valid   If nothing cheap is good, then no good thing can be cheap.


5. If there is an earthquake, the detector will send a message.
No message has been sent.
So there was no earthquake.

Valid   There’s no time dimension here.

6. John said that everyone loves Mary.
Nothing that John has said is true.
So nobody loves Mary.

Invalid   Even if it’s not true that everyone loves Mary, we can’t deduce that nobody does.

7. If there is life on Mars, then Mars contains water.
If Mars has ice, it contains water.
There is ice on Mars.
So there is life on Mars.

Invalid.  It would be valid if line 1 said If there is water on Mars, then there is life on Mars.

8. All roses are flowers.
Some flowers fade quickly.
So some roses fade quickly.

Invalid  Roses might not be part of the set in line 2.

9. Our government should either spend less or raise taxes.
Raising taxes is impossible.
So our government should spend less.

Valid A necessary consequence of either / or.

10. If John is guilty, so is Peter.
If Peter is not guilty, Jeremy is.
So if John is not guilty, Jeremy is.

Invalid   “Jeremy is” means Jeremy is guilty in line 2 and Jeremy is not guilty in line 3.   
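
For the items that depend only on "if", "or" and "not" (2, 5, 7, 9 and 10), the answers can be checked mechanically with a truth table. The following is a rough sketch rather than a proper logic tool: the helper names are my own, item 9's "should" is treated as a plain statement, and item 10 is read with "Jeremy is" meaning "Jeremy is guilty" throughout. Items 1, 3, 4, 6 and 8 involve "all", "most" and "some", which a truth table can't handle.

```python
from itertools import product


def implies(a, b):
    """Material conditional: 'if a then b' is false only when a is true and b is false."""
    return (not a) or b


def is_valid(variables, premises, conclusion):
    """Return (True, None) if valid, else (False, counterexample).

    An argument is valid when no assignment of truth values makes every
    premise true and the conclusion false.
    """
    for values in product([True, False], repeat=len(variables)):
        world = dict(zip(variables, values))
        if all(p(world) for p in premises) and not conclusion(world):
            return False, world
    return True, None


# Each entry: (variables, premises, conclusion), keyed by quiz item number.
ARGUMENTS = {
    2: (["anil", "kumar"],
        [lambda w: implies(w["anil"], w["kumar"]), lambda w: not w["anil"]],
        lambda w: not w["kumar"]),
    5: (["earthquake", "message"],
        [lambda w: implies(w["earthquake"], w["message"]), lambda w: not w["message"]],
        lambda w: not w["earthquake"]),
    7: (["life", "water", "ice"],
        [lambda w: implies(w["life"], w["water"]),
         lambda w: implies(w["ice"], w["water"]),
         lambda w: w["ice"]],
        lambda w: w["life"]),
    9: (["spend_less", "raise_taxes"],
        [lambda w: w["spend_less"] or w["raise_taxes"], lambda w: not w["raise_taxes"]],
        lambda w: w["spend_less"]),
    10: (["john", "peter", "jeremy"],
         [lambda w: implies(w["john"], w["peter"]),
          lambda w: implies(not w["peter"], w["jeremy"])],
         lambda w: implies(not w["john"], w["jeremy"])),
}

if __name__ == "__main__":
    for number, (variables, premises, conclusion) in ARGUMENTS.items():
        valid, counterexample = is_valid(variables, premises, conclusion)
        print(number, "Valid" if valid else f"Invalid, e.g. {counterexample}")
```

For each invalid argument the counterexample is a situation in which all the premises hold and the conclusion fails, which is all that "invalid" means.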

In the puzzle, Argument 2 is correct.
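
If you'd rather check the puzzle than take my word for it, the sketch below lists every way the game can unfold and adds up the exact probabilities. It assumes, as the puzzle leaves unstated, that when the host has two goat doors to choose from he picks one at random; the function names are mine.

```python
from fractions import Fraction

DOORS = (1, 2, 3)


def outcomes(pick=1):
    """Yield (car, opened, probability) for every way the game can unfold."""
    for car in DOORS:
        # The host only opens a door that is neither your pick nor the car.
        options = [d for d in DOORS if d != pick and d != car]
        for opened in options:
            # Assumption: with two goat doors available, each is equally likely.
            yield car, opened, Fraction(1, 3) / len(options)


def win_probability(switch, pick=1):
    """Exact probability of winning the car if you always switch (or always stick)."""
    total = Fraction(0)
    for car, opened, p in outcomes(pick):
        if switch:
            final = next(d for d in DOORS if d not in (pick, opened))
        else:
            final = pick
        if final == car:
            total += p
    return total


def car_behind_given_opened(door, opened, pick=1):
    """P(car is behind `door` | the host opened `opened`)."""
    joint = sum((p for car, o, p in outcomes(pick) if o == opened and car == door),
                Fraction(0))
    evidence = sum((p for car, o, p in outcomes(pick) if o == opened), Fraction(0))
    return joint / evidence


if __name__ == "__main__":
    print("stick: ", win_probability(switch=False))   # 1/3
    print("switch:", win_probability(switch=True))    # 2/3
    # The conditional probabilities that Argument 1 takes to be 1/2 each:
    print("door 1 given host opens 3:", car_behind_given_opened(1, 3))  # 1/3
    print("door 2 given host opens 3:", car_behind_given_opened(2, 3))  # 2/3
```

Under that assumption, the last two lines show where Argument 1 goes wrong: the host's choice of door isn't independent of where the car is, so the two remaining doors aren't equally likely.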

A few more comments:

Such puzzles are interesting for MA students (they can help to prevent bad interpretation of data collected in their small studies) and essential for those who want to argue that one thing does or doesn’t logically imply another.

Logic is to do with valid argument, not truth. Thus, an argument can be valid even though its conclusion is not true – i.e. it doesn’t correspond to the facts.  For example: All cats are black; Tibby is a cat; So Tibby is black.

When it comes to evidence, we should be clear that evidence can’t prove a hypothesis is true: you can’t go from the particular to the general. The famous example is “All swans are white” based on the fact that all the swans we’ve ever observed have been white: while the observations support the claim, they don’t prove it. This is Popper’s great contribution to scientific method, although he’s only following Hume’s devastating critique of inductive reasoning. There’s an asymmetry between truth and falsehood: you can prove a theory is false, but you can’t prove it’s true. That’s why Popper says that the role of empirical evidence is to test a theory (to try to falsify it), not to prove it. We don’t know if any of the theories we hold to be true are actually true, but we allow them (through a very interesting process called inference to the best explanation) for as long as they survive tests.

The worst examples of bad reasoning in discussions of SLA or ELT come from either using circular arguments (Krashen’s 5 hypotheses making up the Monitor Theory are the best-known examples) or non sequiturs (a blanket term covering most fallacies). Circular arguments can’t be disproved, and the usual reason is that the theoretical constructs used in them have no empirical content. As for non sequiturs, these abound in discussions of SLA and ELT, where conclusions drawn from observations simply don’t follow. In my opinion, those working in the area of sociolinguistics, particularly those adopting post-modernist, ethnographic approaches, are particularly prone to drawing sweeping conclusions from scant evidence and from the use of at times absurd theoretical constructs.  The most blatant recent examples of non sequiturs I’ve seen are Harmer’s claims that exams are good because knowing he had an exam helped his tuba playing, and that the Pearson test of Academic English is reliable because the research behind its voice recognition software is “massive”; Dellar’s claim that my “insults” and not having read his coursebook adversely affect my argument against coursebooks; Scrivener’s claim that coursebooks are good because they’re better than they were; and Mayne’s suggestion that Chomsky’s theory is thrown into doubt by the fact that Chomsky’s criticism of Skinner is “nasty”.

In my page on Critical Thinking – see the menu on the right – I suggest a few web sites where those interested in following up my remarks could look further.

Finally, by far the most important part of critical thinking is the attitude of suspicion. Suspicion sounds negative, but here it simply means don’t believe what you’re told. Constantly challenge not just the reasoning but also the so-called facts. On the latter point, Russ has  done a great job in the last year of promoting the question “Where’s the evidence?”

There is, of course, a political dimension to all this. Critical discourse analysis (not one of my favourite areas of the MA!!) deliberately focuses on the ideologies and power relations involved in discourse, and tries to uncover these properties of texts. Without overdoing things, it’s surely a good idea to appreciate that when you read stuff published by the British Council, or by CUP (promoters of the upcoming, self-congratulatory, Oscar-type-love-in ELTon event – complete with red carpet just to increase the embarrassing bathos of the whole show) these two powerful bodies have a vested interest in preserving the status quo. Everything you read that’s published by the British Council, by Cambridge English Examination Assessment, and by other major stakeholders in the current ELT industry needs careful scrutiny. Particularly careful attention has to be given to uncovering the principles which support their approach to ELT practice. In my opinion, the principles on which CELTA and DELTA, and most coursebooks are based are wrong. You might well disagree, but I think it behoves all teachers to critically assess the arguments on both sides.

Critical thinking helps progress. To the extent that more and more teachers sharpen their critical thinking, the old regime will feel increasingly less comfortable in the roost they rule. So, if you’ll allow me to climb onto this flimsy soapbox: Teachers! If you’re inspired by Dogme, teacher cooperatives and other alternatives; if you’re angry at the tawdry treatment of NNESTs and at your own pay and work conditions; if you’re bored by the showcase events mounted by IATEFL and TESOL, where the same old luminaries trot out the same old stuff; and if you’re eager to explore new approaches to teaching and to participate more in decisions affecting your teaching; sharpen your critical thinking tools!

Logic Quiz


I posted this a while ago as a follow-up to a Critical Thinking page. See how you do.

Say if the following are valid or invalid arguments:

1. Socrates is a philosopher.
All philosophers are poor.
So Socrates is poor.

2. Whenever Anil is here, Kumar is also here.
Anil is not here.
So Kumar is not here.

3. Most drug addicts are depressed people.
Most depressed people are lonely.
So most drug addicts are lonely.

4. Nothing that is cheap is good.
So nothing that is good is cheap.

5. If there is an earthquake, the detector will send a message.
No message has been sent.
So there was no earthquake.

6. John said that everyone loves Mary.
Nothing that John has said is true.
So nobody loves Mary.

7. If there is life on Mars, then Mars contains water.
If Mars has ice, it contains water.
There is ice on Mars.
So there is life on Mars.

8. All roses are flowers.
Some flowers fade quickly.
So some roses fade quickly.

9. Our government should either spend less or raise taxes.
Raising taxes is impossible.
So our government should spend less.

10. If John is guilty, so is Peter.
If Peter is not guilty, Jeremy is.
So if John is not guilty, Jeremy is.

And, as a test of your understanding of probability, try the following rather famous puzzle.

Imagine that you are a contestant on a television game show. You are shown three large doors. Behind one of the doors is a new car, and behind each of the other two is a goat. To win the car, you simply have to choose which door it is behind. When you choose a door, the host of the show opens one of the doors you have not chosen, and shows you that there is a goat behind it. You are then given a choice; you may stick with your original choice, or you may switch to the remaining closed door.

What should you do to maximize your chances of winning the car? Think about it for a while, and when you have decided, read the two arguments below and decide which is right.

• Argument 1: Suppose you choose door number 1. The probability that the car is behind door 1 is initially 1/3 (since there are three doors, and the car has an equal chance of being behind each). Then suppose the host opens door number 3 and shows you that there is a goat behind it. We then need to calculate a conditional probability–the probability that the car is behind door 1, given that there is a goat behind door 3. Since there are only two doors left, and there is an equal chance that the car is behind each of them, this probability is 1/2. But similarly, the probability that the car is behind door 2, given that there is a goat behind door three, is also 1/2. So whether you stick with door 1 or switch to door 2, your chance of winning is 1/2. So it really makes no difference whether you switch or not.

• Argument 2: Suppose you choose door number 1. There are three possibilities; either the car is behind door 1, or door 2, or door 3. Each of these possibilities has the same probability (1/3). In each of the three cases, consider which door the host will open. If the car is behind door 1, the host could open either door 2 or door 3. In this case, if you stick with your original choice you win the car, but if you switch to the remaining door you lose. If the car is behind door 2, the host will open door 3. In this case, if you stick with your original choice you lose, but if you switch, you win. Finally, if the car is behind door 3, the host will open door 2. Again, if you stick with your original choice you lose, but if you switch, you win. Remember that each of the three possibilities has a probability of 1/3, and note that they are mutually exclusive (the car is only behind one door). If you switch, you will win in two cases out of three (probability 2/3), but if you stick you will only win in one case out of three (probability 1/3). So you should switch doors, since it doubles your chance of winning.

Which argument do you think is right, Argument 1 or Argument 2?

Critical Thinking: A Few Thoughts


Some random thoughts.

  • The basic concept of critical thinking is simple. It’s the art of taking charge of your own mind.
  • Critical thinking is disciplined thinking that is clear, rational, open-minded, and informed by evidence.
  • Critical thinkers are by nature skeptical. They approach texts with suspicion.

Fallacies are common errors in reasoning that undermine the logic of an argument.

1. Begging the Claim: The conclusion that the writer should prove is validated within the claim. Example:

Well-conceived (lexically-informed) coursebooks help learners learn better than badly-conceived (grammar-informed) coursebooks.

The conclusion that should be proved is already assumed in the claim.

2. Circular Argument: This restates the argument rather than actually proving it. Example:

Coursebooks present a well-organised sequence of classroom practice and a well-organised sequence of classroom practice is good.

The conclusion and the evidence used to prove it are basically the same idea. Specific evidence is needed to support either half of the sentence.

3. Ad hominem: This is an attack on the character of a person rather than his or her opinions or arguments. Example:

Universal grammar is wrong because Chomsky is rude about those he criticises.

Obviously fallacious but very common (BTW, I challenge anybody to find examples of ad hominem arguments on my blog).

4. Ad populum: An emotional appeal that speaks to positive or negative concepts rather than the real issue at hand. Example:

If you were less interested in proving yourself right by giving obscure academic references, you would appreciate what I’m trying to say.

Obviously fallacious, but a common ploy used by those replying to my criticisms of their work.

5. Red Herring: A diversionary tactic that avoids the key issues, often by avoiding opposing arguments rather than addressing them. Example:

What you say about my inability to put together a coherent argument in well-formed sentences may have some justification, but what’s important is that you resort to gratuitous insults.

The author switches the discussion away from the point in question and talks instead about another issue.

6. Straw Man: This move oversimplifies an opponent’s viewpoint and then attacks that hollow argument. Example:


Jane says “In CELTA, I think we should look at alternatives to the current end of course exams, such as portfolios.”

John says “If you abandon the established and proven ways of objectively assessing students’ knowledge and competencies, then you undermine the high standards we’ve set for the course.”   

Jane didn’t suggest abandoning all assessment. John is not treating the argument fairly, or refuting Jane’s position.

7. Moral Equivalence: This fallacy compares minor misdeeds with major wrongs. Example:

Your suggestion that I write badly and can’t put together a coherent argument is as deplorable as a fundamental attack on human rights. 

The author compares the relatively harmless actions of an  outspoken critic with a serious attack on decent values. This comparison is unfair and inaccurate.

When studying for an MA or when walking through life, one of the very best things you can do is keep your antenna up and, to mix metaphors, sniff out bullshit. Bullshit is everywhere, and lots of it is used to defend the status quo.

Thinking critically involves never believing what you’re told without question. Sniffing out fallacies is one of the best mental exercises there is and you should make it a habit. Whenever you read a text, particularly if it’s the work of anybody in authority (academic, political, whatever), first check for logical fallacies. Then, try to detect the assumptions which inform the argument: what does the argument rest on? While there are some excellent scholars and some excellent teachers with little academic inclination writing fantastic stuff that we would do well to take notice of, there are also a bunch of fools (more and more of them teaching in universities) talking crap. We need to hone our critical thinking in order to push ahead.