Glottochronology
Glottochronology is a technique for estimating the time of separation, or divergence, between two languages that are assumed to be related. It is based on the percentage of words, or cognates, that are replaced by other words over time. Using data from several language families whose history is documented in writing, Morris Swadesh estimated that, through internal change and external influence, approximately 14% of a language's basic vocabulary is replaced every thousand years. Glottochronology is a method whose results cannot be precise. It is nevertheless proposed for investigating the evolution of languages that lack written texts, for which, on this view, the comparative method used in the reconstruction of Indo-European has to be set aside and alternative research methods sought.
Swadesh's glottochronology
Basic Assumptions
The glottochronological method rests explicitly on two strong assumptions:
- The lexical substitution rate is approximately constant if we consider very long periods, at least for the lexicon of the so-called basic vocabulary.
- The lexical substitution rate is approximately uniform among all the languages of the world, at least for the basic vocabulary.
In equation form, these two assumptions can be expressed as follows. If we call P(t) the expected percentage of basic words that a language retains after the period t, then the first assumption can be written in mathematical form as:
$$\frac{dP(t)}{dt} = -\alpha P(t)$$
Or equivalently if we integrate the previous differential equation:
$$P(t) = 100 \cdot e^{-\alpha t}$$
The second assumption implies that α must be a universal constant for all the languages of the world, to be calibrated from empirical data on basic vocabulary replacement in languages with long written records. Estimates suggest that after 10 centuries of historical evolution the average replacement, measured across several language families, is around 14% of the basic vocabulary. This leads to the following estimate of α:
$$\alpha = -\ln\left(1 - \frac{14}{100}\right) \approx 0.1508\ \text{millennium}^{-1}$$
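As a quick numerical check of this calibration, the following Python sketch derives α from the 14% figure and evaluates P(t). The function names are illustrative only and do not come from any library.

```python
import math

def decay_constant(replaced_fraction_per_millennium: float) -> float:
    """Return alpha (per millennium) from the fraction replaced in 1000 years."""
    return -math.log(1.0 - replaced_fraction_per_millennium)

def retained_percentage(t_millennia: float, alpha: float) -> float:
    """P(t) = 100 * exp(-alpha * t): expected percentage of basic words retained."""
    return 100.0 * math.exp(-alpha * t_millennia)

alpha = decay_constant(0.14)
print(round(alpha, 4))                             # 0.1508
print(round(retained_percentage(1.0, alpha), 1))   # 86.0
print(round(retained_percentage(2.0, alpha), 2))   # 73.96
```

After two millennia the model predicts 0.86 × 0.86, or about 74%, retention, which is simply the exponential law applied twice.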
From that estimate, the separation time $T_s$ (measured in millennia) can be estimated from the proportion of retained cognates $p_C$ (expressed as a fraction of one). For a language compared with its parent language:
$$T_s = \frac{\log(p_C)}{\log(p_0)}$$
and for two sister languages:
$$T_s \approx \frac{\log(p_C)}{2 \cdot \log(p_0)}$$
Where $p_0 = 0.86$ (86%) is Swadesh's estimated retention rate per millennium (100% - 14%). The method has been compared to carbon-14 dating of biological matter, as used in archaeology: it would make it possible to date the point at which two or more related languages still formed a common stem. The two methods are indeed similar, but they differ in one essential respect: the decay of carbon-14 is always constant. Nothing alters it, neither heat, nor cold, nor reactions with other chemical elements, whereas no such guarantee exists for the rate of lexical replacement.
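The two formulas can be sketched in Python as follows. The factor of 2 for sister languages is usually explained by assuming that both languages lose vocabulary independently at the same rate, so the shared fraction after time t is roughly $p_0^{2t}$; that independence is an assumption of the sketch. The function names and the sample value 0.74 are hypothetical.

```python
import math

P0 = 0.86  # retention per millennium quoted in the text

def separation_from_ancestor(p_c: float, p0: float = P0) -> float:
    """Millennia separating a language from its parent, given retained fraction p_c."""
    return math.log(p_c) / math.log(p0)

def separation_of_sisters(p_c: float, p0: float = P0) -> float:
    """Millennia since two sister languages split, given shared cognate fraction p_c."""
    return math.log(p_c) / (2.0 * math.log(p0))

# Hypothetical observation: 74% of the list is cognate.
print(round(separation_from_ancestor(0.74), 2))  # about 2 millennia
print(round(separation_of_sisters(0.74), 2))     # about 1 millennium
```

The ratio of two logarithms does not depend on the base, so natural logarithms give the same result as any other base.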
Calibration
Morris Swadesh estimated that the replacement rate for his list of 100 basic vocabulary items was around 14% per millennium, so the retention rate is $p_0 = 0.86$ (86%). Robert Lees later revised the mean retention rate to 80.5% per millennium. Other linguists have estimated retention rates of 92-95% by excluding loanwords and counting only replacements by native words of the language.
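To illustrate how strongly the result depends on the calibration, the sketch below applies the sister-language formula to a single hypothetical observation (60% shared cognates) under the three retention rates mentioned above. The value 0.93 is taken as a representative point in the 92-95% range; the observation and the labels are assumptions made for illustration.

```python
import math

def sister_separation(p_c: float, p0: float) -> float:
    """Millennia since two sister languages split, for retention rate p0."""
    return math.log(p_c) / (2.0 * math.log(p0))

observed = 0.60  # hypothetical shared-cognate fraction
rates = {"Swadesh": 0.86, "Lees": 0.805, "loans excluded": 0.93}

# Same data, three calibrations.
estimates = {label: sister_separation(observed, p0) for label, p0 in rates.items()}
for label, millennia in estimates.items():
    print(f"{label}: {millennia:.2f} millennia")
```

The same data yield roughly 1.7, 1.2 and 3.5 millennia respectively, so the choice of retention rate can change the estimate by a factor of about three.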
It can be shown, however, that if the list is made up of terms of differing stability (each with its own rate of change), the rate of change of the list as a whole cannot be constant. It decreases over time: the more volatile words are replaced first, so the more stable words make up a growing share of those retained. This fact has been analyzed in several later modifications of glottochronology.
Statistical fluctuations can also be important since the number of retained cognates follows a binomial distribution of the type:
$$P_k = \binom{100}{k}\, p^k (1-p)^{100-k}$$
For different values of the retention rate it can be seen that the expected value of the separation time diverges from the theoretical value:
| Author | Retention rate $p$ | Rate of change $1-p$ | Separation time (theoretical) | Separation time (mean $\mu$) | Separation time (std. dev. $\sigma$) |
|---|---|---|---|---|---|
| M. Swadesh | 0.86 | 0.14 | 1000 years | 1005 years | 8.51 years |
| R. Lees | 0.805 | 0.195 | 1000 years | 1006 years | 7.22 years |
| | 0.93 | 0.07 | 1000 years | 1005 years | 12.0 years |
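The upward shift of the mean can be checked by direct summation over the binomial distribution. The sketch below assumes that a language is compared with its ancestor after exactly one millennium on a 100-item list, and that the time estimate for $k$ retained items is $\ln(k/100)/\ln(p)$. These are assumptions, not details stated in the text, and the sketch addresses only the mean column.

```python
import math

def expected_separation_years(p: float, n_words: int = 100) -> float:
    """Mean of ln(k/n)/ln(p) for k ~ Binomial(n, p), conditioned on k >= 1, in years."""
    total_prob = 0.0
    weighted_time = 0.0
    for k in range(1, n_words + 1):  # k = 0 is skipped because log(0) is undefined
        prob = math.comb(n_words, k) * p**k * (1.0 - p) ** (n_words - k)
        time_millennia = math.log(k / n_words) / math.log(p)
        total_prob += prob
        weighted_time += prob * time_millennia
    # Renormalize by the probability mass actually summed (k >= 1).
    return 1000.0 * weighted_time / total_prob

for p in (0.86, 0.805, 0.93):
    print(p, round(expected_separation_years(p), 1))
```

Under these assumptions the means come out at roughly 1005 to 1006 years. The bias arises because the logarithm is concave, so averaging over fluctuations in $k$ pulls the estimated time above the theoretical 1000 years.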
Criticism
The original methodology has been criticized repeatedly. Eugenio Coseriu (1962), for example, presented data from the Romance languages showing that its use could lead to significant inaccuracies. The main objections raised against Swadesh's formulation are as follows:
- The retention constant is not universal: it varies with the period, the language and the meaning of the word on the list.
- The family-tree model (Stammbaummodell) is not entirely adequate. Languages often continue to influence one another after they separate, whereas the original glottochronological assumption is that such later contact does not occur or is insignificant; neither appears to be true.
- Sound changes can make two cognates unrecognizable (e.g. French chef and English head), or lead to words being wrongly taken as cognates when they are unrelated (e.g. English day and Spanish día).
- Some languages have several synonyms for one of the basic meanings. To address this, it has been suggested to use the most common colloquial equivalent, the most frequent word, or one chosen at random.
- A pair of words may be only partly cognate, such as Spanish sol and French soleil. In such cases it is proposed either to count the items as cognates or to assign fractions instead of whole numbers.
- In some languages the basic vocabulary contains loanwords. In these cases the word is left out of the count.
- In some languages one or more words of the basic vocabulary are missing. In these cases, the number of words on the list is reduced as appropriate.
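The bookkeeping conventions in the last few points (fractional scores for partial cognates, and dropping loanwords and missing items from the list) can be sketched as below. The data layout, with one score per meaning and `None` for an excluded item, is a hypothetical choice made for illustration.

```python
from typing import Optional, Sequence

def cognate_fraction(scores: Sequence[Optional[float]]) -> float:
    """Fraction of cognates over the usable items of a word list.

    Each entry is 1.0 (cognate), 0.0 (not cognate), a value in between
    (partial cognate), or None (loanword or missing word, excluded).
    """
    usable = [s for s in scores if s is not None]
    if not usable:
        raise ValueError("no usable items in the list")
    # Excluded items shrink the denominator rather than counting as non-cognates.
    return sum(usable) / len(usable)

# Hypothetical eight-item comparison: one partial cognate, two exclusions.
example = [1.0, 1.0, 0.0, 0.5, None, 1.0, None, 0.0]
print(round(cognate_fraction(example), 3))  # 3.5 / 6, about 0.583
```

Excluding an item reduces the length of the list, which matches the proposal to shorten the list when a word is missing.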
Modified Glottochronology
The classical glottochronology proposed by Morris Swadesh was severely criticized in the 1960s and 1970s, to the point that many linguists dismissed it. Some of the criticisms were already addressed in those years, and modifications were proposed that answered them in whole or in part. Reconsideration of the criticisms led to a position midway between complete rejection of glottochronology and the enthusiasm with which it was first received. Proponents of modified glottochronology argue that, with major modifications, counting cognates is useful for studying the diversification of language families and estimating separation times.
Van der Merwe (1966) studied the effect of inhomogeneous replacement rates by dividing the vocabulary list into several groups, each with its own rate. The effect of this modification is that the average replacement rate decreases in the long run: the words with the highest rate of change disappear first, and over time the share of words with slower replacement rates grows. Dyen, James and Cole (1967) studied the effect of allowing each meaning to have its own rate of change. If the basic vocabulary list is divided into two groups of $N_1$ and $N_2$ words, the effective replacement rate $\lambda_t$ satisfies:
$$(N_1 + N_2)\, e^{-\lambda_t t} = N_1 e^{-\lambda_1 t} + N_2 e^{-\lambda_2 t}$$
So the effective time-dependent rate can be written as:
$$\lambda_t = -\frac{1}{t}\ln\left(n_1 e^{-\lambda_1 t} + n_2 e^{-\lambda_2 t}\right) \approx (n_1\lambda_1 + n_2\lambda_2) - \frac{n_1 n_2}{2}(\lambda_1 - \lambda_2)^2\, t + \frac{n_1 n_2}{6}(n_2 - n_1)(\lambda_1 - \lambda_2)^3\, t^2 + \dots$$
Where:
- $n_1 = \frac{N_1}{N_1 + N_2}$ and $n_2 = \frac{N_2}{N_1 + N_2}$ are the proportions of words in group 1 and in group 2.
- $\lambda_1, \lambda_2$ are the replacement rates for the words of each group.
It can be seen that, for small separation times $t \approx 0$, $\lambda_t$ behaves as the average replacement rate of the two groups.
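The behaviour of the effective rate can be explored numerically. The sketch below evaluates the exact two-group expression and compares it with the expansion truncated after the term linear in $t$. The group proportions and rates used are hypothetical values chosen for illustration.

```python
import math

def effective_rate(t: float, n1: float, lam1: float, lam2: float) -> float:
    """Exact effective rate: -(1/t) * ln(n1*exp(-lam1*t) + n2*exp(-lam2*t))."""
    n2 = 1.0 - n1
    return -math.log(n1 * math.exp(-lam1 * t) + n2 * math.exp(-lam2 * t)) / t

def first_order_rate(t: float, n1: float, lam1: float, lam2: float) -> float:
    """Expansion truncated after the linear term in t."""
    n2 = 1.0 - n1
    mean_rate = n1 * lam1 + n2 * lam2
    return mean_rate - 0.5 * n1 * n2 * (lam1 - lam2) ** 2 * t

# Hypothetical list: half fast words (0.30 per millennium), half slow (0.05).
n1, lam1, lam2 = 0.5, 0.30, 0.05
for t in (0.001, 1.0, 5.0, 10.0):
    exact = effective_rate(t, n1, lam1, lam2)
    approx = first_order_rate(t, n1, lam1, lam2)
    print(t, round(exact, 4), round(approx, 4))
```

For small $t$ both values approach the mean rate of 0.175. As $t$ grows the exact rate falls toward the slower rate, while the truncated expansion becomes unreliable.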
Finally, Kruskal, Dyen and Black studied the simultaneous estimation of divergence time and replacement rate. Sankoff (1973) suggested introducing a borrowing parameter and allowed for synonyms. A combination of several of these improvements is considered by Sankoff. Starting from a work by Sankoff on the genetic divergence of populations in a biological context, Embleton (1981) derived a simplified version for the linguistic context and showed, through a number of simulations, that the model gives good results. In parallel, improvements in statistical methodology in a different field, the study of changes in DNA over time, have generated work that applies those results to linguistics and have renewed interest in glottochronology. All of these methods are more robust than those used previously, and they allow points in the phylogenetic tree to be calibrated from datable historical events, with the rates of change interpolated continuously between them. As a result, the assumption of a constant rate of change is no longer necessary.
Starostin's method
Another attempt to modify traditional glottochronology was made by the Russian linguist Sergei Starostin, who proposed the following modifications:
- Eliminate loanwords, since they are a disruptive factor that distorts the results; Starostin concentrates on changes due to "native" replacement by words of the same language. Failure to correct for this factor is the main reason why Swadesh's estimate was 14 replacements per millennium in the 100-item list (a rate of 0.14), whereas the actual rate is much slower (on the order of 0.05 or 0.06 per millennium). With this correction, the criticism of Bergsland and Vogt no longer applies, since analysis of the Riksmål data shows that the basic list includes 15–16 loans from other Germanic languages (mainly Danish).
- The rate of change is not really constant but varies with time. This can be due to at least two different factors:
  - The probability that a lexeme X is replaced by a lexeme Y increases with the length of time X has been in use in the language. This effect can be seen as an "aging of words", empirically related to the gradual "erosion" of the primary meaning under the weight of secondary meanings that develop from it (see semantic change).
  - Words on the vocabulary list have different individual rates of change (for example, the word for 'I' is usually more resistant to change than the word for 'yellow'). As discussed earlier, the less stable words, with the highest rate of change, disappear first, so that over time the proportion of retained words that resist change grows relative to the less resistant ones.

The formula proposed by Starostin, which takes into account the dependence on individual stability, has the form:
$$T_s = \sqrt{\frac{\ln(p_c)}{-L_c}}$$
Which replaces the Swadesh formula
$$T_s = \frac{\ln(p_c)}{\ln(p_0)}$$
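The two formulas can be compared numerically. The text does not define the denominator of Starostin's formula, so the sketch below makes an explicit assumption: it reads it as the product of a replacement rate $L$ (taken as 0.05 per millennium, the order of magnitude quoted above) and the retained fraction $p_c$. That reading, the function names and the sample value of $p_c$ are all assumptions, not statements from the text.

```python
import math

def swadesh_time(p_c: float, p0: float = 0.86) -> float:
    """Swadesh estimate in millennia: ln(p_c) / ln(p0)."""
    return math.log(p_c) / math.log(p0)

def starostin_time(p_c: float, rate_l: float = 0.05) -> float:
    """Starostin-style estimate in millennia.

    ASSUMPTION: the denominator is read as -(L * p_c), with L a
    replacement rate per millennium. The text leaves it undefined.
    """
    return math.sqrt(math.log(p_c) / (-rate_l * p_c))

p_c = 0.70  # hypothetical retained-cognate fraction
print(round(swadesh_time(p_c), 2))
print(round(starostin_time(p_c), 2))
```

Under these assumptions the Starostin-style estimate is larger than the Swadesh one for the same data, consistent with the slower replacement rate it presupposes.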
Age of some language families
Estimates have been made of the age of several language families. The families with ages, or time depths, of less than 5000 years (50 centuries) include practically all the well-established families for which the crucial aspects of the protolanguage have been adequately reconstructed (RA), along with some other families in which the relationship between the languages is largely uncontroversial:
- Indo-European languages (RA, Eurasia, 70 centuries)
- Sino-Tibetan languages (RA, Far East, 60 centuries)
- South Caucasian languages (RA, Eurasia, 40 centuries)
- Dravidian languages (RA, Indian subcontinent, 40 centuries)
- Austronesian languages (RA, Southeast Asia, Oceania and Taiwan, 35 centuries)
- Tai-Kadai languages (RA, Southeast Asia, 30 centuries)
- Hmong-Mien languages (RA, Southeast Asia, 40 centuries)
- Yeniseian languages (RA, Siberia, 30 centuries)
- Chukotko-Kamchatkan languages (RA, Siberia, 40 centuries)
- Eskimo-Aleut languages (RA, Siberia, 30 centuries)
- Na-Dene languages (RA, North America, 35 centuries)
- Algic languages (RA, North America, 30 centuries)
- Iroquoian languages (RA, North America, 35 centuries)
- Salish languages (RA, North America, 45 centuries)
- Caddoan languages (RA, North America, 33 centuries)
- Uto-Aztecan languages (RA, North America, 48 centuries)
- Zapotec languages (RA, Mesoamerica, 25 centuries)
- Mayan languages (RA, Mesoamerica, 41 centuries)
- Misumalpan languages (RA, Central America, 43 centuries)
For some macrofamilies (MF) and more controversial families in which the reconstruction has encountered more difficulties (DR), there are temporal depths greater than 50 centuries:
- Altaic languages (MF, Eurasia, 77 centuries)
- Uralic languages (DR-b, Eurasia, 60 centuries)
- Afroasiatic languages (DR-c, North Africa and the Middle East, 113 centuries)
- Niger-Congo languages (DR-a, Africa, 100 centuries)
- Nilo-Saharan languages (DR-c, Africa, 150 centuries)
- Khoisan languages (MF, Southern Africa, 111 centuries)
- Trans-New Guinea languages (MF, New Guinea, 100 centuries)
- Australian Aboriginal languages (MF, Australia, 95 centuries)
- Oto-Manguean languages (DR-b?, Mesoamerica, 55-60 centuries)
- Hokan languages (MF, Mesoamerica, 88 centuries)