
Issues of Phylogeny and Reticulation
Linguists, plus those historians, archaeologists, and anthropologists who take an interest in historical comparative linguistics, have tended to view the past of language families as reflecting two processes - phylogenetic mother-language to daughter-language descent (the "family tree" model), and a co-evolutionary model which stresses contemporary interaction between neighboring languages (the "linguistic area" model). According to linguist Bob Dixon (1997), language families exist because of punctuated and relatively short-lived periods of expansion. They also share areal linguistic features as a result of much longer, non-punctuated processes of interaction and borrowing. As he states: "Each language has two possible kinds of similarities to other languages - genetic similarities, which are shared inheritances from a common proto-language; and areal similarities, which are due to borrowing from geographical neighbours" (Dixon 1997:15). There is a third kind of similarity - coincidental resemblance - but this is not of great significance at the level of linguistic classification under discussion here.
In major language families like Indo-European and Austronesian, the results of both processes are always evident. Far-flung languages, often thousands (even tens of thousands) of kilometers apart, share transparent phylogenetic relationships (e.g., English and Bengali, Malay and Tahitian, Navajo and the Athabaskan languages of Canada). On the other hand, languages within different families can share areal features within specific regions. Such "linguistic areas," widely discussed by linguists, include the Indian subcontinent, Mesoamerica, the Balkans, and the Amazon Basin. Yet - and this proviso requires stress - the languages within these areas always retain their phylogenetic relationships in spite of interaction. In other words, we do not find the Indian subcontinent to be full of phylogenetically unclassifiable languages that have all blended equal aspects of Indo-European, Dravidian, Tibeto-Burman, and Austroasiatic structure and vocabulary. Neither is this the case in the well-studied Vaupés region of Amazonia, where some groups even practice linguistic exogamy (a person should marry someone from another language group), yet language families such as Arawak and Tucanoan remain essentially coherent despite structural convergence and universal multilingualism (Sorensen 1982; Aikhenvald 1996, 2001).
Another basic and important concept here is that the initial formation of new language families, in Dixon's view, occurs relatively quickly. Languages change constantly for both internal (genetic) and external (areal) reasons, as can be verified quite simply by comparing Anglo-Saxon with the English of Chaucer and then E. M. Forster, or by comparing modern Romance languages with Latin, their common ancestor. Because of this, it stands to reason that widespread language families such as Indo-European and Austronesian must have spread to their geographical limits sufficiently recently in time for evidence of common origin to remain in all their component subgroups. Some linguists calculate that the maximum time span over which such traces can remain falls somewhere between 7,000 and 10,000 years, beyond which percentages of shared cognates in basic (culturally universal) vocabulary would drop below about 5-10 percent. Thus, if early Indo-European had taken more than 10,000 years to spread from its homeland to the far reaches of Iceland and Bangladesh, then the Indo-European family would probably not exist at all in the way so clearly identifiable by linguists today. This may seem like an obscure technicality, but it does imply that the major language families of agriculturalist populations that exist today are Holocene, not Pleistocene, phenomena. Their life histories fall well within the time span of agricultural food production.
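The arithmetic behind a ceiling of this kind can be illustrated with a minimal sketch. It assumes the classical glottochronological model, in which each language independently retains a fixed proportion of its basic vocabulary per millennium. The text does not say which calculation its "some linguists" used, so the model, the function names, and the retention constant of 0.86 (a conventional figure for a 100-item basic word list) are all assumptions made here for illustration, and the method itself is widely contested.

```python
import math

def shared_cognate_fraction(years_apart, retention_per_millennium=0.86):
    """Expected fraction of basic vocabulary still shared by two languages.

    Assumption: both languages lose vocabulary independently at the same
    constant rate, so the shared fraction decays as r ** (2 * t), with t in
    millennia. The default r = 0.86 is an assumed conventional constant.
    """
    millennia = years_apart / 1000.0
    return retention_per_millennium ** (2 * millennia)

def separation_years(shared_fraction, retention_per_millennium=0.86):
    """Invert the decay curve: estimated years since two languages split."""
    return 1000.0 * math.log(shared_fraction) / (2 * math.log(retention_per_millennium))

# Print the decay curve at a few illustrative time depths.
for years in (2000, 5000, 7000, 10000):
    print(years, round(shared_cognate_fraction(years), 3))
```

Under these assumptions the shared fraction falls to roughly a tenth somewhere around 7,000 years and to about a twentieth by 10,000 years, which is the order of magnitude the paragraph above describes. A different retention constant would shift the ceiling, so the figures should be read as indicative only.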
The Identification and Phylogenetic Study of Language Families
In the following sections I wish to examine some practical matters of language family phylogeny, making reference for illustrative purposes mainly to the Austronesian languages of Southeast Asia and Oceania, with forays here and there into other families such as Indo-European. The Austronesian family comprises about 1,000 languages spread more than halfway round the world from Madagascar to Easter Island (Figure 7.4) (Bellwood 1991, 1997a; Blust 1995a; Pawley and Ross 1993; Pawley 2003). There are perhaps 350 million Austronesian speakers, mostly in Southeast Asia, especially in Indonesia and the Philippines. Individual languages may have anywhere from a few hundred indigenous speakers (like many in western Oceania) to upward of 60 million (Javanese). The Austronesians themselves mostly have Asian biological phenotypes, but many Melanesian populations in the western Pacific also speak Austronesian languages, as do Negritos in the Philippines. Cultures varied enormously in the precolonial past, from Hindu and Islamic states to forest hunting and gathering bands. Yet, by the logic described above, the Austronesian entity is not merely a result of random patterning in time and space. There is a solid core of shared phylogenetic history, both linguistic and cultural, despite vast geographic spread and the adoption of Austronesian languages by members of other, unrelated ethnic communities, especially in western Melanesia.
Moving now to some definitions, a language family is a grouping of languages that shares a unique set of identifying linguistic features, mainly as retentions from an earlier stage of linguistic history. Some of these identifying features might indeed be uniquely shared innovations generated at the point of origin (the proto-language stage), but without external witnesses in other language families it is impossible to be sure, and this issue takes us into the much-disputed level of macrofamilies, a concept to which we will return. Language families consist of subgroups of closely related languages, defined by uniquely shared innovations (rather than retentions) similar to derived or autapomorphic characters in biological cladistics. A good example here would be the uniquely shared innovations reconstructed by Robert Blust for the Proto-Malayo-Polynesian subgroup of Austronesian, which includes all Austronesian languages apart from the Formosan languages of Taiwan. These include the use of the enclitic form *-mu for the second person singular pronoun, the loss of preconsonantal and final s, and the verbal prefixes *ma- and *pa-. These forms are not found in Formosan languages, and for phonological reasons Blust (1995b:620-621) interprets the situation as reflecting Proto-Malayo-Polynesian innovation, rather than Formosan loss. Such innovations will clearly have occurred during the time period prior to the break-up of the proto-language ancestral to the subgroup, going back as far as the previous ancestral phylogenetic division in the family tree. If they occurred before this last ancestral division they would not, presumably, be unique to one subgroup.
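The logic of subgrouping by uniquely shared innovations can be sketched in a few lines of code. This is a deliberately simplified illustration, not a description of Blust's procedure: it treats any feature attested in a set of comparison languages as a retention, whereas the actual argument in the text rests on phonological reasoning that the code does not model. The language names (`Formosan_A`, `MP_1`, and so on), the `inherited_word` labels, and the feature coding are placeholders invented for the example; only the three innovation labels echo the features named above.

```python
from collections import defaultdict

def innovation_defined_groups(feature_table, comparison_languages):
    """Group languages by features they share to the exclusion of a comparison set.

    feature_table: {language_name: set of feature labels}
    comparison_languages: languages used as external witnesses; any feature
        attested in one of them is treated as a retention, not an innovation.
    """
    retentions = set()
    for language in comparison_languages:
        retentions |= feature_table[language]

    candidates = [l for l in feature_table if l not in comparison_languages]
    all_features = set().union(*feature_table.values())

    groups = defaultdict(set)
    for feature in all_features - retentions:
        carriers = frozenset(l for l in candidates if feature in feature_table[l])
        if len(carriers) >= 2:  # a feature found in one language defines no subgroup
            groups[carriers].add(feature)
    return dict(groups)

# Toy data: placeholder language names and invented feature coding.
features = {
    "Formosan_A": {"inherited_word_1", "inherited_word_2"},
    "Formosan_B": {"inherited_word_1"},
    "MP_1": {"inherited_word_1", "enclitic_-mu", "loss_of_s", "prefixes_ma-_pa-"},
    "MP_2": {"inherited_word_1", "inherited_word_2", "enclitic_-mu",
             "loss_of_s", "prefixes_ma-_pa-"},
    "MP_3": {"enclitic_-mu", "loss_of_s", "prefixes_ma-_pa-"},
}

groups = innovation_defined_groups(
    features, comparison_languages={"Formosan_A", "Formosan_B"}
)
for members, shared in groups.items():
    print(sorted(members), sorted(shared))
```

In this toy table the three `MP_` languages come out as a single group supported by three features, while the inherited words, which also occur in the comparison languages, contribute nothing to the grouping. That mirrors the distinction drawn above between retentions, which identify a family, and innovations, which define a subgroup.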
Admittedly, there can be complications in the recognition of cognates that can, in theory, skew the accuracy of historical interpretation. For instance, borrowings, if adopted by all the languages of a subgroup soon after they differentiated, can often mimic cognates. In this case, however, such early borrowings can offer the same historical implications of close geographical relationship as truly cognate (commonly inherited) forms. In addition, shared retentions from a proto-language can often masquerade as shared innovations if they are by chance preserved only by the members of a subgroup and none of their immediate neighbors, such that genuine cognates located more distantly are overlooked or not recorded. The sample density of studied languages clearly matters greatly in such cases; one must know that neighborhood absences are real and not just reflections of an absence of recording.
When family trees are constructed they reveal certain features. All have a reconstructible ancestral formulation - a proto-language (or in reality a hierarchy of multiple proto-languages) - that can be presented as a set of reconstructed features. The most important such features for prehistorians trying to reconstruct ancient cultures are ancestral lexical items with their most likely original meanings. The reconstruction of meanings can sometimes be an ambiguous exercise if a reconstructed item has a wide and overlapping range of modern meanings, but in general such proto-lexicons can be a very powerful source of cultural data. For instance, proto-language reconstruction makes it certain that many of the major language families to be discussed later had their roots amongst agricultural peoples, not hunter-gatherers. Linguistic analysis can also sometimes show whether a given cultural trait was present continuously since the existence of the proto-language, or whether it was introduced later. Thus, many material cultural items connected with agriculture, seafaring, fishing, and pottery-making can be shown to go back continuously in the western Pacific to the Proto-Oceanic stage, the ancestral stage (ca. 3500 BP) for all the Austronesian languages of Oceania (Pawley 1981). Likewise, according to Robert Blust (2000a), the rice vocabulary in the Chamorro language of the Mariana Islands in Micronesia was inherited directly from Proto-Malayo-Polynesian about 4,000 years ago, rather than introduced to these islands by borrowing at a later date.
One must ask, however, what was the geographical extent of any specific reconstructed proto-language. Was it a single language spoken in one village, or was it a much broader regional spread of related dialects? Long ago, Gordon Childe (1926:12) suggested for Proto-Indo-European: "The Aryan [Proto-Indo-European] cradle must have had a geographic unity; the linguistic data alone presuppose a block of allied dialects constituting a linguistic continuum within a specific area and under more or less uniform geographical conditions." A similar view is held by Lehmann (1993:15): "Yet it is clear that some social group, of whatever size or coherence, at one time spoke the relatively unified language labelled Proto-Indo-European and that this group maintained a specific culture." This need not mean an origin from a single ancestral community such as one village. As Dixon (1997:98) notes, any given language family "may have emanated not from a single language, but from a small areal group of distinct languages, with similar structures and forms." However, the concept of an original linguistic unity cannot be broken down too far in the direction of multiple unrelated languages.
Another aspect of language family trees is that they may be strong or weak, or indeed both at different times in their courses of development. Strong trees are like "real" trees in shape, with an identifiable root and subsequent sequential branches (Figure 9.1A). As an example, the Polynesian languages developed a strong tree-like structure in their earlier period of differentiation, with a very well-defined proto-language (Proto-Polynesian) that contained, according to Andrew Pawley (1996), a remarkable number of innovations, including up to 1,392 lexical items (although some of these could have undetected cognates outside Polynesia), 14 morphological and 8 grammatical features. These were all created between about 1000 BC and AD 500 in a unified and well-bounded homeland region in Western Polynesia, in particular the Tongan and Samoan Islands (Kirch and Green 2001). This implies a long period of standstill, known in this case from archaeological correlations to have lasted more than 1,000 years, during which time populations remained in contact and shared large numbers of widespread linguistic changes that never spread to other neighboring areas such as Fiji or Vanuatu. Polynesian is thus an excellent example of an innovation-defined subgroup.
Language families with weak phylogenies have rake-like structures with lots of coordinate and independent subgroup branches, and no coherent root (Figure 9.1B). The proto-language vocabularies of the Malayo-Polynesian subgroups of Austronesian in the Philippines, eastern Indonesia, and western Oceania, as reconstructed by Robert Blust (Figure 10.9 inset), share quite a uniform basic vocabulary (Blust 1993; Pawley 1999). They are essentially innovation-linked subgroups in the terminology of Andrew Pawley and Malcolm Ross (1993, 1995). This suggests that subgroup differentiation occurred from a very widely spread dialect chain with no significant buildup of innovations in any region. In this case, the foundation dialect spread was rapid, perhaps under 500 years (very rapid for such a vast area), and there were no major foci of linguistic isolation within which uniquely shared innovations could be accumulated (Pawley 1999). As it happens, archaeology supports these historical linguistic reconstructions of rapid primary spread within the Austronesian family very accurately (Bellwood 2000c). English in North America and Australia would no doubt yield similarly rake-like subgroups were tribal societies of Neolithic (and non-literate) type to come back and dominate the world for the next 2,000 years. This is because the spreads of English in both continents were so rapid and so undifferentiated that no points of initial arrival in the respective land masses, and no directions of subsequent spread, would ever be recognizable in the pattern of subsequent phylogenetic differentiation.

Figure 9.1 Strong and weak family trees. In A, the nodes are innovation-defined (strong) and well distinguished, by virtue of both geographical and chronological separation. In B, all nodes are innovation-linked (weak) and derived from rapid rake-like dispersal, thus relatively undifferentiated from each other.
The concept of the linguistic family tree can imply that languages split irrevocably once their speakers moved apart. But, in reality, ancient colonists would rarely have passed into such utter isolation that their languages separated with absolute finality. Such might have happened in remote places such as Easter Island, owing to the sheer difficulty of getting there, but in general we would expect communication between spreading daughter dialects to continue for as long as was physically possible. As Pawley and Ross (1995:20) point out, descendants of the Austronesian colonists of the large land masses of New Zealand and Madagascar "were generally mobile enough to maintain fairly cohesive dialect networks over large islands and island groups for up to 1000 years or so." Such broad networks remind one of the broad networks of relative homogeneity characteristic of the earliest Neolithic assemblages in many parts of the world. Thus, dispersal need not lead to immediate isolation, although the rate of diversification will be compounded if groups come into intensive contact with people speaking completely unrelated languages. Such happened with the Austronesian speakers who settled the Papuan-speaking regions of western Melanesia after 3,500 years ago. Their rate of lexical differentiation increased greatly, thus leading in the 1960s to an erroneous opinion, derived from lexicostatistical calculations, that Melanesia was actually the Austronesian homeland (Dyen 1965; Murdock 1968).
As far as language family homelands are concerned it is generally assumed, as in paleontology, that the most likely homelands are those where the deepest (i.e., oldest) subgroup separations, or bifurcations in the family tree, occur. Remember from the above, however, that one needs a strong family tree in order to be able to determine this. Rake-like formations, if they form the whole of a language family, make it essentially impossible to locate a homeland with precision. Luckily, many language families contain both rake-like and tree-like subgrouping structures. For instance, most subgroups of Indo-European have rake-like relationships, but the ancient Anatolian languages have sufficient unique features to make it possible that the Indo-European homeland was located somewhere in Turkey (Drews 2001). Likewise, the Malayo-Polynesian subgroups are also rake-like, as noted above, but the Formosan languages, which do not belong to Malayo-Polynesian and form several separate first-order subgroups within Austronesian, suggest very strongly a Taiwan homeland. Another point to note is that a language family need not always have originated in the center of its current distribution. Such reasoning manifestly does not work for families such as Austronesian, or Benue-Congo (including Bantu), where spreads out of homelands were essentially unidirectional in the early periods.
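The homeland reasoning in this paragraph can be made concrete with a small sketch. It uses a crude proxy for "deepest subgroup separations": it counts how many first-order branches of a family are spoken in each region and returns the region or regions with the highest count. The function name, the region labels, and the number of Formosan branches (three here, where the text says only "several") are assumptions for illustration, and real homeland arguments weigh far more evidence than a simple tally.

```python
from collections import Counter

def homeland_candidates(primary_branches):
    """Return the region(s) hosting the most first-order branches.

    primary_branches: {branch_name: set of regions where the branch is spoken}
    A tie across every region means the tree gives no basis for a choice.
    """
    counts = Counter()
    for regions in primary_branches.values():
        counts.update(set(regions))
    best = max(counts.values())
    return sorted(region for region, n in counts.items() if n == best)

# Tree-like case, loosely modeled on Austronesian (branch count is hypothetical).
austronesian_like = {
    "Formosan_branch_1": {"Taiwan"},
    "Formosan_branch_2": {"Taiwan"},
    "Formosan_branch_3": {"Taiwan"},
    "Malayo-Polynesian": {"Philippines", "Indonesia", "Oceania", "Madagascar"},
}

# Rake-like case: coordinate branches, each in its own (placeholder) region.
rake_like = {
    "branch_1": {"region_A"},
    "branch_2": {"region_B"},
    "branch_3": {"region_C"},
}

print(homeland_candidates(austronesian_like))
print(homeland_candidates(rake_like))
```

The first case singles out one region because the oldest splits cluster there, even though that region lies at the edge of the family's distribution rather than its center. The second returns every region as an equal candidate, which is the sense in which a purely rake-like family makes it essentially impossible to locate a homeland with precision.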