The Himalayas as a Directional Barrier to Gene Flow
Abstract
High-resolution Y-chromosome haplogroup analyses coupled with Y–short tandem repeat (STR) haplotypes were used to (1) investigate the genetic affinities of three populations from Nepal—including Newar, Tamang, and people from cosmopolitan Kathmandu (referred to as “Kathmandu” subsequently)—as well as a collection from Tibet and (2) evaluate whether the Himalayan mountain range represents a geographic barrier for gene flow between the Tibetan plateau and the South Asian subcontinent. The results suggest that the Tibetans and Nepalese are in part descendants of Tibeto-Burman–speaking groups originating from Northeast Asia. All four populations are represented predominantly by haplogroup O3a5-M134–derived chromosomes, whose Y-STR–based age (±SE) was estimated at 8.1±2.9 thousand years ago (KYA), more recent than its Southeast Asian counterpart. The most pronounced difference between the two regions is reflected in the opposing high-frequency distributions of haplogroups D in Tibet and R in Nepal. With the exception of Tamang, both Newar and Kathmandu exhibit considerable similarities to the Indian Y-haplogroup distribution, particularly in their haplogroup R and H composition. These results indicate gene flow from the Indian subcontinent and, in the case of haplogroup R, from Eurasia as well, a conclusion that is also supported by the admixture analysis. In contrast, whereas haplogroup D is completely absent in Nepal, it accounts for 50.6% of the Tibetan Y-chromosome gene pool. Coalescent analyses suggest that the expansion of haplogroup D derivatives—namely, D1-M15 and D3-P47 in Tibet—involved two different demographic events (5.1±1.8 and 11.3±3.7 KYA, respectively) that are more recent than those of D2-M55 representatives common in Japan. Low frequencies, relative to Nepal, of haplogroup J and R lineages in Tibet are also consistent with restricted gene flow from the subcontinent. 
Yet the presence of haplogroup O3a5-M134 representatives in Nepal indicates that the Himalayas have been permeable to dispersals from the east. These genetic patterns suggest that this cordillera has been a biased bidirectional barrier.
The Himalayan mountain range extends from Pakistan in the west to Burma in the east, along the frontiers of northern India, Nepal, Tibet, and Bhutan. It is home to most of the highest mountains of the world and forms a natural barrier between the Tibetan plateau and the Indian subcontinent. Whereas the Himalayas border Tibet to the south, the Tibetan plateau is bounded on the north by the Kunlun Mountains and on the west by the Karakoram ranges. These unique geographic characteristics of the Tibetan landscape may have provided some degree of genetic encapsulation.
Archaeological findings have revealed late Paleolithic inhabitation of the Tibetan plateau, dating the initial entry of modern humans at ∼25–30 thousand years ago (KYA).1 Yet, the discovery of Neolithic sites,1,2 genetic data,3,4 and linguistic studies5–7 favor the peopling of the plateau during the Neolithic period.
Tibeto-Burman speakers are the major inhabitants of the Himalayas. They occupy the territories of present-day Bhutan, Burma, Nepal, northeastern India, and Tibet. This family of languages is also spoken in some parts of Southeast Asia.4,8 According to historical records, Di-Qiang tribes of northwestern China migrated south ∼3 KYA, admixing with native residents on arrival.4,8–10 Su and collaborators4 suggested that the Bodic and Baric branches11 of the Tibeto-Burman subfamily populated Tibet and Nepal, respectively, ∼5–6 KYA. However, van Driem6,7 argues that the model presented by Su et al.4 is flawed because of an inadequate representation of the Tibeto-Burman speakers in their study, as well as the problems associated with the linguistic phylogeny presented by Matisoff.11
Previous genetic studies that used classic markers,2,12 autosomal microsatellites,13 and mtDNA14 have placed Tibetans along with Koreans, Japanese, and Mongolians in a Northeast Asian cluster, thereby suggesting a northern Mongoloid origin. Yet a report employing Y-chromosome biallelic markers argues for the peopling of Tibet, Nepal, and Bhutan by East Asians after they left the Yellow River region in China.4 Other studies suggest that the high frequency of the Y Alu insertion (YAP) in the Tibetan population signals a significant genetic contribution from Central Asia.4,15,16 Unfortunately, some of the aforementioned studies4,16 suffer shortcomings because of the small number of binary markers employed, limited sample sizes, and the absence of Central Asian samples needed to assess their role in the genetic composition of the Tibetan population.
In contrast to the extensive anthropological and genetic studies undertaken to analyze the rich genetic heritage and cultural diversity of the Indian subcontinent,2,17–23 limited genetic investigations had been performed on the neighboring Himalayan region of Nepal, just south of Tibet across the Himalayas.24,25 Kraaijenbrink et al.24 reported the allele-frequency distributions of 21 autosomal STR loci, whereas the other study, by employing Y-STR analysis, revealed isolation and genetic drift in the Nepalese population.25 Nepal is located between two ancient cultural giants, India and Tibet, and their influence in shaping Nepal's contemporary genetic landscape is undeniably significant. India and Nepal share a long historical, cultural, and religious legacy resulting from geographic proximity and several pivotal historical events involving political control by Indian royalty.
The genetic diversity of populations inhabiting an area is often influenced by the geographic and physical features encompassing the region. Whereas the Hindu Kush Mountains and the arid deserts in Iran have served as obstacles to gene flow,26,27 the Nile River Valley,28 the strait of Bab-el Mandeb,29,30 and Beringia31 are examples of natural passageways for the migrations of modern humans. The Himalayan range, in addition to being a formidable barrier, provides for dramatically diverse climatic conditions on either side of it: arid and cold on the Tibetan side, as compared with monsoons and extreme dry spells, depending on the season of the year, in the Nepalese territory.
We decided to investigate the influence of the Himalayas in populating the region by use of high-resolution Y-chromosome SNP analyses of four geographically targeted populations from the area—namely, Newar, Tamang, and the general population of Kathmandu from Nepal (referred to as “Kathmandu” subsequently), as well as a collection from Tibet. In addition, 15 Y-STR loci were typed to provide information on the temporal origins of these four Himalayan groups. This study is the first of its kind reporting biallelic markers of the Y-chromosome haplogroup diversity in Nepalese populations. This work improves on earlier reports by analyzing a large number of Y-chromosome binary and associated STR markers to define compound Y-chromosome lineages for a sizable number of individuals from informative Tibetan and Nepalese populations. In the process, we uncovered evidence that the Himalayas have acted as an obstacle modulating the skewed dispersal of human groups southward.
Material and Methods
Sample Collection and DNA Isolation
Blood samples were collected after receipt of informed consent from 344 males who comprise the general population of Tibet (n=156) and three populations from Nepal (n=188), which include Tamang (n=45), Newar (n=66), and Kathmandu (n=77). Figure 1 illustrates the geographical location of the above four populations. With the exception of the Kathmandu group who speak predominantly Nepali, an Indo-European language, the individuals in the remaining three collections are Tibeto-Burman speakers. Genealogical information of sampled individuals was recorded for at least 2 previous generations. Table 1 lists the sample size, geography, and linguistic affiliation of the populations examined. All ethical guidelines were followed, as stipulated by the institutions involved in the study. DNA was extracted using phenol-chloroform extraction as described elsewhere38 and was stored in 10 mM Tris–1 mM EDTA (TE) (pH 8.0), as stock solutions, at −80°C.
Y Haplogrouping
A total of 103 binary markers were genotyped by standard methods, including PCR-RFLP,30 the YAP polymorphic Alu insertion (PAI),39 and allele-specific PCR.27,40 The nomenclature followed for the Y-SNP haplogroup is as recommended by the Y Chromosome Consortium.41 Five Kathmandu males failed to amplify for marker M12 because of an ∼2.45-Mb deletion that also encompasses the amelogenin gene.42 Marker P47 was genotyped by Tsp509I RFLP analysis, which identifies a C→T transition at position 171 within the 364-bp PCR fragment amplified using primers 5′-CTGATGTTGCAGTGTTGAGC-3′ (forward) and 5′-ACACAGCCAAATACCAGTCG-3′ (reverse).
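As a rough illustration of how an RFLP assay of this kind separates alleles, the sketch below counts Tsp509I recognition sites (AATT) in two short, made-up sequences that differ by a single C→T transition. The sequences and function names are hypothetical and are not the actual P47 amplicon; the text above does not state whether the derived allele creates or destroys a site in the real 364-bp fragment.

```python
TSP509I_SITE = "AATT"  # recognition sequence of Tsp509I (cuts before the first A)

def cut_positions(seq, site=TSP509I_SITE):
    """Return 0-based start positions of every occurrence of the site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions

def fragment_lengths(seq, site=TSP509I_SITE):
    """Lengths of the fragments produced by cutting at the start of each site."""
    edges = [0] + cut_positions(seq, site) + [len(seq)]
    return [b - a for a, b in zip(edges, edges[1:]) if b > a]

# Hypothetical toy alleles: the C->T change at index 6 creates an AATT site.
ancestral = "GCTGAACTGCAGTC"
derived = "GCTGAATTGCAGTC"

print(fragment_lengths(ancestral))  # uncut: one fragment
print(fragment_lengths(derived))    # cut once: two fragments
```

In a gel-based assay, the two alleles would then be distinguished by the different fragment sizes.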
Y-STR Haplotyping and Time Estimation
Y-STR haplotypes were assayed using the AmpFlSTR Yfiler PCR amplification kit (Applied Biosystems) that coamplifies the following 17 Y-STR loci: DYS19, DYS385 a/b, DYS389I/II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635, and Y-GATA H4. DYS385a and DYS385b were not used in time estimation and variance calculations because of the duplicated nature of the loci. The PCR products were separated on a 3100 Genetic Analyzer (Applied Biosystems), and the data generated were analyzed with the GeneScan 3.7 and Genotyper 3.7 NT software. We treated DYS389CD as equivalent to DYS389I and DYS389AB as equivalent to the numerical subtraction of DYS389II minus DYS389I.35,43,44
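The locus recoding described above can be sketched as follows. This is a minimal illustration, assuming a haplotype is stored as a dictionary of repeat counts; the key names (including a single "DYS385" entry holding both alleles) and the example values are hypothetical.

```python
def recode_haplotype(raw):
    """Recode a Yfiler-style haplotype for variance and age calculations.

    raw maps locus name -> repeat count (the hypothetical "DYS385" key may
    hold a tuple of two alleles). DYS385 is dropped, DYS389I is kept as
    DYS389CD, and DYS389AB is DYS389II minus DYS389I.
    """
    dropped = ("DYS385", "DYS389I", "DYS389II")
    recoded = {locus: n for locus, n in raw.items() if locus not in dropped}
    recoded["DYS389CD"] = raw["DYS389I"]
    recoded["DYS389AB"] = raw["DYS389II"] - raw["DYS389I"]
    return recoded

# Hypothetical partial haplotype, for illustration only.
example = {"DYS19": 15, "DYS385": (11, 14), "DYS389I": 13, "DYS389II": 29, "DYS390": 24}
print(recode_haplotype(example))
```

The subtraction matters because DYS389II includes the DYS389I repeats, so using both raw values would count the same variation twice.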
Microsatellite variances were calculated using the Vp equation of Kayser et al.45 Haplogroup ages, based on Y-STR variation, were estimated using the linear expansion,46,47 BATWING,48 and SNP-STR coalescence methods.44 All three methods assume an average Y-STR mutation rate of 0.00069 per locus per generation,44 with an intergeneration time of 25 years.30,49,50 With the exception of a prior distribution for the population growth rate (alpha) at gamma (1.01, 1),30 other prior assumptions for the BATWING analysis were as described by Cinnioğlu et al.50
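To make the link between microsatellite variation and age concrete, the following is a simplified sketch of a variance-based estimate: the average squared difference (ASD) in repeat number from an assumed founder haplotype is divided by the mutation rate to give generations, then scaled by the generation time. Taking the per-locus median as the founder is an assumption of this sketch; it omits standard errors and is not the authors' pipeline or the BATWING procedure. The toy haplotypes are invented.

```python
from statistics import median

MUTATION_RATE = 0.00069   # per locus per generation (value quoted above)
GENERATION_YEARS = 25     # intergeneration time (value quoted above)

def asd_to_founder(haplotypes):
    """Mean over loci of the average squared difference from the per-locus median."""
    n_loci = len(haplotypes[0])
    per_locus = []
    for i in range(n_loci):
        alleles = [h[i] for h in haplotypes]
        founder = median(alleles)  # assumed founder allele at this locus
        per_locus.append(sum((a - founder) ** 2 for a in alleles) / len(alleles))
    return sum(per_locus) / n_loci

def age_kya(haplotypes):
    """Age in thousands of years: ASD / mutation rate * generation time."""
    generations = asd_to_founder(haplotypes) / MUTATION_RATE
    return generations * GENERATION_YEARS / 1000.0

# Hypothetical three-locus haplotypes for five chromosomes.
toy = [(13, 24, 10), (13, 24, 10), (13, 25, 10), (14, 24, 10), (13, 24, 11)]
print(round(age_kya(toy), 1))  # about 7.2
```

With an ASD of 0.2 per locus, the toy data give roughly 290 generations, or about 7.2 KYA at 25 years per generation.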
Statistical and Phylogenetic Analyses
Twenty-four reference populations (table 1) from the published literature were included for statistical and phylogenetic analyses. Correspondence analysis (CA) was performed using the NTSYSpc-2.02i software.51 The frequency data were analyzed at the phylogenetic resolution of major haplogroups that define clades A–R in the Y tree. Admixture estimation was performed52,53 using the statistical package SPSS 14.0, with Northeast Asia, Southeast Asia, Central Asia, and India35 as parental populations for the four Himalayan populations. Tibet is excluded from the Central Asian populations35 in the admixture analysis.
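The idea behind least-squares admixture estimation can be illustrated with a deliberately reduced case. The sketch below is unweighted and uses only two parental populations, whereas the analysis described above uses a weighted method with four; the function name and all frequencies are hypothetical. It finds the share m of the first parent that minimizes the squared difference between the hybrid haplogroup frequencies and the mixture m·p1 + (1−m)·p2.

```python
def admixture_two_parents(hybrid, parent1, parent2):
    """Least-squares share of parent1 in the hybrid, clipped to [0, 1]."""
    num = sum((h - b) * (a - b) for h, a, b in zip(hybrid, parent1, parent2))
    den = sum((a - b) ** 2 for a, b in zip(parent1, parent2))
    if den == 0:
        raise ValueError("parental populations have identical frequencies")
    return min(1.0, max(0.0, num / den))

# Hypothetical haplogroup frequencies (each list sums to 1).
parent_a = [0.60, 0.10, 0.30]
parent_b = [0.10, 0.70, 0.20]
# Build a hybrid that is exactly 40% parent_a and 60% parent_b.
hybrid = [0.4 * a + 0.6 * b for a, b in zip(parent_a, parent_b)]

print(round(admixture_two_parents(hybrid, parent_a, parent_b), 3))  # 0.4
```

With real data the hybrid is never an exact mixture, so the estimate is the best fit rather than an exact recovery, and weighting by sampling variance changes the contribution of each haplogroup.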
Results
Phylogeography
Of the 103 binary markers analyzed, 14, 27, 28, and 29 loci were found to be polymorphic in Tamang, Newar, Tibet, and Kathmandu, respectively. Altogether, the 39 polymorphic loci in these four groups define 24 paternal haplogroups. Overall, Tamang exhibits high homogeneity with only 5 haplogroups, whereas Kathmandu shows the highest heterogeneity with 16 haplogroups, followed by Tibet with 15 and Newar with 8. It is noteworthy that only three haplogroups—namely, O3a5a-M117, R1a1-M198, and R2-M124—are shared across the four groups. Neither haplogroup G-M201 nor haplogroup L-M20 was observed. Figure 2 displays the hierarchical phylogenetic relationships and frequencies of the 24 Y-chromosome haplogroups observed in the Tibetan and Nepalese collections. The geographic distribution of the major haplogroups is illustrated in figure 3.
Haplogroup O predominates in the Himalayan collections examined, accounting for an average frequency of 35.46% (frequency range of 20.8%–86.6%) (fig. 2). The majority of haplogroup O individuals are represented by subclade O3-M122 and its derived subtype O3a5-M134, which includes the terminal mutation O3a5a-M117 (fig. 2).
The second-most abundant major clade is haplogroup R, which occurs at an average frequency of 24.70% (frequency range of 2.5%–62.1%). It includes most of the Newar (62.1%) and Kathmandu (46.8%) groups but is significantly less represented in Tamang (8.8%) and Tibet (2.5%). Haplogroup R1b1a-M269 is found only in Newar (10.6%), whereas R1a1-M198 is present in all four collections (fig. 2). Similarly, haplogroup R2-M124 occurs in all four Himalayan populations, generally exhibiting frequencies equivalent to R1a1-M198, except for the Kathmandu, where R2-M124 (10.4%) is less frequent than R1a1-M198 (35.1%) (fig. 2).
Haplogroup D is restricted to the Tibetans, accounting for 50.6% of the male population. Haplogroup DE-YAP is completely absent in the three Nepalese collections. Furthermore, haplogroup D is characterized in Tibetans by subclades D1-M15 (28.2%), D3-P47 (18.6%), and D*-M174 (3.8%), whereas no derived individuals for D2-M55 were observed.
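As a quick arithmetic check, the reported subclade percentages can be converted to implied chromosome counts using the Tibetan sample size (n=156) given in the methods. The counts are back-calculated by rounding and are an assumption of this sketch, not figures taken from the text.

```python
N_TIBET = 156  # Tibetan sample size stated in the methods
reported_pct = {"D1-M15": 28.2, "D3-P47": 18.6, "D*-M174": 3.8}

# Back-calculate the nearest whole number of chromosomes for each subclade.
implied_counts = {hg: round(pct / 100 * N_TIBET) for hg, pct in reported_pct.items()}
total_count = sum(implied_counts.values())
total_pct = round(100 * total_count / N_TIBET, 1)

print(implied_counts)            # {'D1-M15': 44, 'D3-P47': 29, 'D*-M174': 6}
print(total_count, total_pct)    # 79 50.6
```

The implied total of 79 of 156 chromosomes reproduces the 50.6% quoted for haplogroup D as a whole.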
Other important haplogroups include haplogroup H, found in Newar (6.1%), Kathmandu (11.7%), and Tibet (1.9%) but absent in Tamang. Haplogroup C3-M217 is generally common in Northeast and Central Asia and is present at similar frequencies in Kathmandu (2.6%) and Tibet (2.6%). The newly described C5-M356 marker,23 thus far restricted to Indian populations, is found in both the Newar (3%) and Kathmandu (1.3%) collections.
Statistical and Phylogenetic Analyses
Figure 4 illustrates a CA plot based on 28 populations (table 1), including the four Himalayan groups examined in this study and the 24 reference populations genotyped elsewhere at similar molecular resolution. All Southeast Asian populations, except for the Philippines, cluster together on the upper left quadrant of the graph, along with two Northeast Asian (Japan and Korea), one Central Asian (Tibet), and two South Asian groups (Adi and Tamang). The assembly of these five populations within the Southeast Asian cluster may be accounted for by the extensive sharing of the O haplogroup among them. The unique partitioning of the Philippines group may be the result of the unusually high frequency of the paraphyletic K-M9*,37 varieties common in Oceania populations.54 The lower center of the plot depicts the Northeast Asian populations. The Philippines and Manchus segregate intermediate between the southern and northern East Asian clusters. The distributions of the Central Asian, South Asian, and Middle Eastern groups in the plot mirror their geographic distance with respect to each other.
Haplogroup Dating and Microsatellite Variation
Table 2 presents the variance, continuous expansion, divergence times,44 and median BATWING values of the O3a5-M134, R1a1-M198, D1-M15, and D3-P47 haplogroups. There is general agreement between the two methods of time estimation, except in the case of the Tamang O3a5-M134 haplogroup, in which the BATWING analysis estimates 0.6 KYA as opposed to the much older estimation of 10.8±5.6 KYA determined using the coalescent method of Zhivotovsky et al.44 Unless otherwise stated, the time estimates followed throughout this text are those calculated according to Zhivotovsky et al.44 Y-STR genotypes are provided in tables 3–6.
The divergence times calculated for the haplogroups on the basis of Y-STR data yielded a few clusters of individuals with different age estimates. When analyzed across all four populations, O3a5-M134 formed one major cluster consisting of 79 individuals with an age (±SE) of 8.1±2.9 KYA and a minor cluster comprising only 9 Tamang samples, which exhibits a relatively young age of 1.3±1.3 KYA. The rest of the individuals do not cluster and can be interpreted as part of an ancient component of the O3a5-M134 haplogroup whose age is estimated at 22.1±4.4 KYA.
It is noteworthy that the haplogroup D-M174 subclades D1-M15 and D3-P47 segregate into two distinct groups at locus DYS392. Those with seven repeats are entirely represented by individuals belonging to haplogroup D3-P47, whereas the cluster with 11 repeats is made up exclusively of D1-M15 chromosomes. Both D1-M15 and D3-P47, in turn, form two subclusters each whose ages are 5.1±1.8 and 2.3±1.0 KYA and 11.3±3.7 and 7.0±4.2 KYA, respectively.
Admixture Proportions
Table 7 displays the admixture proportions for Tibetan and Nepalese populations estimated using the weighted-least-squares method.52,53 The admixture proportions were calculated using four parental populations, including Northeast Asia, Southeast Asia, Central Asia, and India. The results indicate that Tibet and Tamang have been the recipients of significant Y-chromosome influence from Southeast Asia (60.4% and 66.2%, respectively) compared with Central Asia (26.2% and 28.7%, respectively), Northeast Asia (8.9% and 5.1%, respectively), and India (4.5% and 0.0%, respectively). In contrast, Central Asia and India account for the total contribution to Newar (56.6% and 43.4%, respectively) and for the vast majority of the Kathmandu Y chromosomes (51.4% and 48.6%, respectively), except for the negligible influence from Southeast Asia (1.3%).
Discussion
East Asian Influence on Himalayan Populations
The Himalayan populations examined in the present study exhibit high frequencies of haplogroup O3-M122, with its derivative O3a5-M134 accounting for the majority of samples. O3-M122 is present at low frequencies throughout Central Asia, Siberia,26 and Pakistan,23 whereas it is widely distributed in East Asia and among Tibeto-Burman groups from northeastern India.4,20,22,23 The high percentage of O3a5-M134 in Tamang (86.6%) is comparable to the frequencies observed in Tibeto-Burman–speaking groups from Northeast India (∼85%), including Adi, Naga, Apatani, and Nishi.4,20 The fact that O3a5-M134 is also present (28.9%) as the M117 subclade in Tibet may be indicative of a common ancestry for this language family, as suggested by Su et al.4 This affinity is also reflected in the CA graph (fig. 4). Notable is the observation that O3a5a-M117 is common in Tibeto-Burman speakers from Southeast Asia.10
On the basis of historical records, the Tibeto-Burman people are believed to have originated from Di-Qiang tribes that migrated south from the Yellow River valley in the central plain of East Asia. A central East Asian origin is further supported by the complete absence of Southeast Asian markers O1-M119 and O2-M268 in all populations from the present study. In contrast, van Driem6,7 argues for Sichuan, in southwestern China, as the homeland of the Tibeto-Burman language family, with subsequent northward migration to the fertile loess plains of the Yellow River basin. He classified the Tibeto-Burman language into an eastern and western group. The former is further subdivided into northern (Sino-Bodic) and southern languages. The Bodic group, one of the two variants of the Sino-Bodic language, is believed to have spread from Gansu with the dispersal of the Majiayao Neolithic culture southward into eastern Tibet, Bhutan, and Sikkim in the 3rd millennium b.c.7
Contrary to the Tamang and Tibetans, the Newar and Kathmandu groups exhibit relatively lower frequencies of O3a5-M134, perhaps because of subsequent admixture with Indians, as suggested by the distributions of haplogroup R and H profiles. Given their geographic proximity and the influences India has historically exercised over Nepal, a significant Indian contribution to the Newar and Kathmandu gene pool is plausible. Admixture analysis further supports this, since Newar and Kathmandu both show high admixture proportions from India (43.4% and 48.6%, respectively), whereas the opposite is observed in Tamang, which has a high Southeast Asian contribution (66.2%) and null contribution from India.
The divergence times obtained for O3a5-M134 with use of the method described by Zhivotovsky et al.44,55 generated an older age for Tamang (10.8±5.6 KYA), whereas similar values were estimated for Newar (7.6±2.8 KYA), Tibet (7.6±2.3 KYA), and Kathmandu (6.6±3.6 KYA). Whereas the median BATWING values are comparable for the latter three populations, the age estimate for O3a5-M134 in Tamang is extremely recent (0.6 KYA). When the large lower and upper quantiles (0.0 and 521.8 KYA, respectively) are considered, the coalescence time from the BATWING analysis may be subject to inaccuracy. The overall age estimated for haplogroup O3a5-M134 in all four populations (8.1±2.9 KYA) is more recent than its Southeast Asian counterpart (25.36±1.16 KYA).10 Therefore, the data support an immigration scenario in which this haplogroup dispersed from Southeast Asia into the Himalayan region more recently. The age of haplogroup O3a5-M134 obtained here (8.1±2.9 KYA) is within the range of the age estimated by Su et al.4 for the same marker (∼5–6 KYA), supporting the earlier observation that it may have been introduced to Tibet during the Neolithic expansion from the Yellow River basin in China.4 However, the more ancient age (22.1±4.4 KYA) obtained for O3a5-M134 individuals that do not form clusters may represent remnants of an earlier Southeast Asian dispersal that arrived in the Himalayas before the Neolithic period. Evidence of an early migration is provided by archaeological sites at Xiao Qaidam in the Central Qaidam basin and Chusang near Lhasa, dating to ∼33.0±4.4 KYA and 21.7±2.1 KYA, respectively, which points to a late Paleolithic entry and occupation of the plateau by modern humans.1
The presence in Tibet of haplogroups N and Q, which are commonly found in Siberia and Mongolia,34,36,37,56–58 also reflects gene flow from the north and/or east. A genetic affinity between Tibet and the northern populations of Siberia and Mongolia is consistent with previous mtDNA studies.14 Recently, an extensive phylogeographic survey59 of haplogroup N-M231 indicated that this marker originated in Southeast Asia but failed to spread into South Asia.22,23 Our results are consistent with that interpretation.
Central Asian Signature in the Himalayas
Haplogroup D lineages have been reported at high frequencies in both Japan and in previous studies of Tibet.3,4,15,35,60 These two populations exhibit contrasting distribution of D-M174 subclades. Tibet is characterized by D1-M15, whereas Japan is represented by haplogroup D2-M55.3,4,35 In addition to D1-M15, the Tibetan population also possesses high frequencies of D3-P4735 but displays relatively low proportions of D*-M174.8 The latter is reported in high proportion in Jarawa and Onge males from the Andaman Islands, consistent with an early migration via the southern coastal route.61–63
In the present study, 50.6% of the Tibetan Y chromosomes are represented by haplogroup D lineages, which, in descending frequencies, are D1-M15 (28.2%), D3-P47 (18.6%), and D*-M174 (3.8%). The close genetic affinity portrayed between the Tibetans and Japanese in the CA plot (fig. 4) may suggest an ancient common ancestry, in agreement with earlier observations.35,64,65 However, unlike the Paleolithic founder for the Japanese DE-YAP chromosomes, the DE-YAP haplogroup in Tibet appears Neolithic in origin. Previous studies of Y-chromosome polymorphisms in Tibet argue for a Central Asian contribution to account for the haplogroup DE-YAP chromosomes in their gene pool.3,4,16 Divergence time estimations for D1-M15 (5.1±1.8 KYA) and D3-P47 (11.3±3.7 KYA) in Tibet suggest a bipartite penetration, on different occasions, into the Tibetan plateau. Nevertheless, the ancient components of these two lineages that do not form clusters with either 7 (D3-P47) or 11 (D1-M15) DYS392 repeats were found to possess ages of 13.8±5.0 KYA and 27.3±9.8 KYA, respectively, corroborating the earlier report of older shared ancestry between the Japanese and Tibetan populations.
On the basis of Y-chromosome and linguistic analyses, Su et al.4 proposed that the Baric people, a branch of the Tibeto-Burman language family, were the first inhabitants of the Himalayan region, whereas Tibetans, who belong to the Bodic branch, were recent immigrants who arrived subsequent to admixture with Central Asians. Complete absence of haplogroup DE-YAP in Tamang, Newar, and Kathmandu adds support to the peopling of Nepal and Tibeto-Burman northeastern India by the Baric group from East Asia, who were shown to lack the haplogroup DE-YAP.4,20
Gene Flow from India
The distribution of haplogroup R1a1-M198 has been suggested elsewhere to be associated with the spread of the Kurgan culture from the Central Asian Steppes into northern Europe.33 This haplogroup is also found at high frequencies in Central Asia26,66 and South Asia18,23,67 but is rarely observed in East Asia.3 Quintana-Murci et al.67 attributed the high proportion of R1a1-M198 in India to a Neolithic demic diffusion of Indo-European speakers from southwestern Asia. However, a recent study revealed an early Holocene expansion of R1a1-M198 from the northwestern region of India before the arrival of the Indo-European speakers.23 Although a much earlier entry of R1a1-M198 into India has been postulated on the basis of its high microsatellite variance and estimated age (14 KYA), multiple events resulting from subsequent migrations from southwestern Asia may have also contributed.23
In the present study, haplogroup R1a1-M198 is found at low frequencies in Tibetan (1.9%) and Tamang (4.4%) populations, whereas higher percentages are observed in the Kathmandu (35.1%) and Newar (25.7%) populations, implying significant gene flow from the Indian subcontinent for the latter two populations, particularly from northern India, where its presence is high (40%).18 The CA also suggests strong Indian influence, since Kathmandu and Newar group closer to the Indian populations (fig. 4). In addition, the CA graph (fig. 4) displays close genetic association of Central Asian populations (Kyrgyz and Karakalpak) with Newar and Kathmandu groups, in agreement with the admixture results (56.6% and 51.4%, respectively). Microsatellite-based age estimates for haplogroup R1a1-M198 in Kathmandu (4.8±2.1 KYA) and Newar (1.0±1.0 KYA) are considerably younger than in India (14 KYA).23 Taken together, these results indicate that R1a1-M198 most likely arrived in Nepal from Central Asia and/or northern India. When the obstacle presented by the Himalayas is considered, it is possible that the penetration of R1a1-M198 into Nepal occurred indirectly from Central Asia, possibly as Indo-Aryans migrated north from India ∼3–3.5 KYA.
With the exception of the Newar, in whom R2-M124 accounts for 25.7% of Y chromosomes, the other Nepalese populations and Tibetans exhibit lower frequencies similar to those observed in India (<10%).21,23 Furthermore, Newar is the only population that exhibits R1b1a-M269 (10.6%), despite the low levels found throughout South Asia.23 The relatively high proportion of these two markers in the Newar may be the result of genetic drift.
Previous studies have shown that haplogroup H*-M69 and its subclades H1-M52 and H2-APT originated within India.18,23,68 Haplogroup H1-M52 is found at higher frequencies in southern India,26,69 with Punjab in the northwest displaying the lowest frequency.18 In the present study, H1-M52 occurs in the Newar and Kathmandu populations at 6.1% and 11.7%, respectively. The indigenous Indian marker H*-M69 is also found at a low frequency (1.9%) in the Tibetan gene pool, which may represent a limited gene flow from India. In historical times, the spread of Buddhism from India to Tibet may account for the presence of signature Indian chromosomes in the region.
With further molecular resolution achieved within haplogroup C in the form of mutation C5-M356, a portion of the previously classified C* (xC3-M217) paragroup is now characterized as representing this newly defined marker.23 Its occurrence in Newar (3.0%) is about double the frequency observed in Kathmandu (1.3%) and India (1.4%)23 and most likely reflects gene flow from India into Nepal. Whereas haplogroups C3-M217 and C* (xC3-M217) are prevalent in East Asia and Central Asia,26,34,64,65,70 the former has not yet been detected in India. Therefore, the presence of C3-M217 in Kathmandu (2.6%) and Tibet (2.6%) is most likely attributable to gene flow from Mongolia and/or Siberia; the marker is found at its highest frequencies (55.0%) in Mongolia.26,70
The Himalayas: Disproportional Barrier of Gene Flow
The Himalayas stand between the Tibetan plateau and India, creating a formidable obstruction to population movements, as reflected in the Y-haplogroup composition of Tibet, where haplogroup R lineages, a signature of Indian penetration, are negligible. Similarly, haplogroup L-M20 did not penetrate upward from India. Despite this obstacle, since the 7th century a.d., India has exercised a strong influence on Tibet, especially with respect to the spread of Buddhism. However, historical records document the spread of Buddhism from northwestern India to Pakistan and Afghanistan and, later, into East Asia along the northern Silk Road. In fact, the younger time estimates calculated for the D-M174 chromosome derivative D1-M15 (2.3±1.0 KYA) may be the result of gene flow connected to Silk Road migrations from Central Asia.
Conversely, Su et al.4 proposed that the peopling of Nepal by the Baric-speaking group(s) occurred when they crossed the Himalayas from the Tibetan plateau. This scenario is consistent with the extensive sharing of haplogroup-O lineages among the Tibeto-Burman speakers in the region. The results of the present study suggest that population movements across the Himalayas from Tibet to Nepal have occurred periodically, whereas dispersals in the opposite direction from India into the Tibetan plateau have been limited. The observed preferential gene flow southward could be accounted for, at least in part, by the unfavorable conditions experienced on the Tibetan side of this vast mountain range, where higher altitudes, colder climates, and physiological stress would have discouraged human migrations to and settlements in the region.
Conclusion
In conclusion, the array of data presented supports the Tibeto-Burman affinities of the Himalayan populations examined in the present study. Whereas Tibet and Nepal share high levels of haplogroup O3a5-M134 (especially O3a5a-M117), commonly found in East Asia, they display an absence of Southeast Asia–specific markers (O1-M119 and O2-M268). In combination with the relatively younger age generated for O3a5-M134, East and Central Asian ancestry for the Himalayan groups is likely. However, subsequent gene flow from the Indian subcontinent into Nepal is signaled by the presence of haplogroups R and H, as well as by the results of the admixture analysis, in which the Indian contribution is >40% in both Newar and Kathmandu. In contrast, Indian influence in Tamang is null.
The Tibetan gene pool reflects significant contributions from East and/or Southeast Asia. The elevated presence of haplogroup D-M174 chromosomes in Tibet and Japan indicates prolonged geographic isolation of these regions, since it is sparsely distributed in the rest of East Asia. The ages of the Tibetan D1-M15 and D3-P47 lineages are considerably younger than that of the Paleolithic Japanese D2-M55, indicative of a recent bottleneck. Haplogroups Q-M242, C-M216, and N-M231 in Himalayan populations point to gene flow from Mongolia and Siberia, underscoring the geographic accessibility of the Tibetan plateau for human dispersals from the north. The Himalayas served as a strong barrier to gene flow from the south into the Tibetan plateau, although the same is not true for population movements occurring in the opposite direction. This preferential migrational direction may be associated with the physiological stress imposed on emigrants from lower altitudes.
Acknowledgments
We gratefully acknowledge Laisel Martinez, Sheyla Mirabal, Charlie Haz, Jason Somarelli, and Tanya Simms for their technical assistance. We thank all sample donors and those who helped with the sample collection. T.G. is a recipient of the Tibetan Fulbright scholarship administered by the Tibet Fund, NY.